tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License
237 stars 106 forks source link

allow identifiers in string concatentations #145

Closed clason closed 1 year ago

clason commented 1 year ago

Resolves errors in strings of the form:

("failed in line %" PRIdLINENR)

clason commented 1 year ago

cherry-picked from https://github.com/nvim-treesitter/tree-sitter-c/pull/2 @lewis6991 @amaanq

amaanq commented 1 year ago

@ahlinc would like your thoughts too

ahlinc commented 1 year ago

Screenshot from 2023-07-15 14-40-02

@ahlinc would like your thoughts too

I'd also alias the identifier node to the same node that represent a string content between double quotes but unfortunately there is no one.

It make sense to add a bellow additional fix on top of this PR:

diff --git i/grammar.js w/grammar.js
index 94e7f3d..b51cacd 100644
--- i/grammar.js
+++ w/grammar.js
@@ -1055,13 +1055,16 @@ module.exports = grammar({

     concatenated_string: $ => seq(
       $.string_literal,
-      repeat1(choice($.string_literal, $.identifier)),
+      repeat1(choice(
+        $.string_literal,
+        alias($.identifier, $.string),
+      )),
     ),

     string_literal: $ => seq(
       choice('L"', 'u"', 'U"', 'u8"', '"'),
       repeat(choice(
-        token.immediate(prec(1, /[^\\"\n]+/)),
+        alias(token.immediate(prec(1, /[^\\"\n]+/)), $.string),
         $.escape_sequence,
       )),
       '"',
amaanq commented 1 year ago

Maybe string_content is better than string, but I don't think identifiers should be aliased to string in concatenations

amaanq commented 1 year ago

Also gotta get rid of the appveyor thing after this

@clason tests need updates

clason commented 1 year ago

Used tree-sitter test -u; hope that's OK!

amaanq commented 1 year ago

Mentioned this earlier here, but the formatting of the header/test delimiters needs to not happen, it pollutes the diff (not your problem, it looks fine to me 😅)

clason commented 1 year ago

Would it be acceptable for me to do a separate commit before the change to reformat the corpus? I think the convenience of using tree-sitter test -u is worth a single reformat churn.

amaanq commented 1 year ago

hmm not sure what you mean, like a commit of additional tests?

clason commented 1 year ago

like a commit just calling tree-sitter test -u on the old parser to reformat everything (without additional changes), so that the next commit with the changes can use tree-sitter test -u to only update the now failing tests, without polluting the diff.

amaanq commented 1 year ago

Oooh, yeah I'm way too tired my bad for not getting that - that would be nicer for the future

ahlinc commented 1 year ago

but I don't think identifiers should be aliased to string in concatenations

@amaanq in concatenations identifiers aren't identifiers because they are concatenated literally and not by value.

amaanq commented 1 year ago

well it's like an identifier that holds a string, so it shouldn't be aliased as such so users can distinguish them imo

ahlinc commented 1 year ago

well it's like an identifier that holds a string, so it shouldn't be aliased as such so users can distinguish them imo

sorry, my bad, bash in my mind.

For C the PRIdLINENR would be an identifier that would be substituted by a macro pre proccessor to an another string.

ahlinc commented 1 year ago

Hmm, for the clarity may be it would still make sense to alias identifier to the preproc_arg for example, because it's not real C variable and in this place it's clear how it can appear in concatenation.

Screenshot from 2023-07-15 23-16-19

:cry: Preproc directives parsing definitely requires improvements.

amaanq commented 1 year ago

Preproc directives parsing definitely requires improvements.

Agreed..