tree-sitter / tree-sitter-c

C grammar for tree-sitter
MIT License

feat: add gnu inline asm syntax #140

Closed benjaminBrownlee closed 1 year ago

benjaminBrownlee commented 1 year ago

Add the GNU inline assembly syntax. Let me know if you think the names of nodes or the tree structure should be different. The structure is a bit more lax than the specification, but I think it makes sense for simplicity.

closes #139

amaanq commented 1 year ago

I have something similar in my Objective-C grammar. Is it more or less the same? https://github.com/amaanq/tree-sitter-objc/blob/7938eb5e135574095d8baf2c758e1f69d8cccafa/grammar.js#L372

I also have one for Microsoft's version, __asm, in the ms_asm_block rule.

benjaminBrownlee commented 1 year ago

I think they are trying to do the same thing, but there are some issues with the Objective-C version:

amaanq commented 1 year ago

I added the optional identifiers after the strings because of this https://github.com/apple-oss-distributions/objc4/blob/c3f002513d195ef564f3c7e9496c2606360e144a/test/release-workaround.m#L26

Is this only valid in Objective-C, then?

The other changes are good, I can then easily extend off of that from C which would be nice 🙂

benjaminBrownlee commented 1 year ago

We are talking about different parameters. The asm statement has one required parameter (a string) followed by up to four optional sections separated by colons: the output operands, input operands, clobber list, and goto labels. Your link shows an example with an output operand, but I was talking about the clobber list.
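
To make the colon-separated sections concrete, here is a minimal sketch. It assumes a GCC- or clang-compatible compiler; the function name copy_through_register is hypothetical and only for illustration. The template string is left empty so the example does not depend on any particular instruction set.

```c
/* Hypothetical helper: passes a value through a register using GNU
 * extended asm. The template is empty, so no instructions are emitted;
 * the operand sections only describe constraints to the compiler. */
static int copy_through_register(int in) {
    int out;
    __asm__ volatile(
        ""            /* required: assembler template string */
        : "=r"(out)   /* section 1: output operands */
        : "0"(in)     /* section 2: input operands ("0" = same register as operand 0) */
        : "memory"    /* section 3: clobber list */
                      /* section 4 (goto labels) applies only to asm goto */
    );
    return out;
}
```

Because the input is tied to the same register as the output, calling copy_through_register(42) should give back 42. The clobber list is the third section, which is the one my test case exercises.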

The last line of my test case uses a clobber list (with just "r2"). https://github.com/tree-sitter/tree-sitter-c/blob/a53e6a39d43b043b2c383e43b763e8d278b107aa/test/corpus/expressions.txt#L135-L143

From what I can tell, the asm syntax for Objective-C, C, and C++ is the same, but the grammar may need to be updated to support other features (e.g. raw string literals).

benjaminBrownlee commented 1 year ago

For reference, here is the GNU specification for extended inline assembly:

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

amaanq commented 1 year ago

Oh yeah, sorry about that! I appreciate the GNU reference, but I believe other compilers support this syntax as well. Can we rename the rules to drop the gnu prefix?

benjaminBrownlee commented 1 year ago

While the main compilers (clang, gcc) all support this syntax, it is a GNU extension and not part of the language standard. I chose the gnu_ prefix to distinguish it from other assembly syntaxes that might be added later, such as MSVC or ARMCC 4/5, which I would probably prefix with msvc_ and armcc_ respectively.

Do you want to drop the prefix altogether and figure out the other syntax names later? Or is there a prefix you like better?

amaanq commented 1 year ago

If it's a GNU extension then no worries (I wasn't sure about that). This looks good!