vkazanov / tree-sitter-quakec

A Tree-sitter grammar for QuakeC
MIT License
3 stars 0 forks source link

`voidaaa()` incorrectly parsed as C-style function #1

Open krizej opened 2 months ago

krizej commented 2 months ago

source:

voidaaa()
{

}

tree-sitter parse output:

(source_file [0, 0] - [4, 0]
  (function_definition [0, 0] - [3, 1]
    result: (simple_type [0, 0] - [0, 4])
    name: (identifier [0, 4] - [0, 7])
    parameters: (parameter_list [0, 7] - [0, 9])
    body: (compound_statement [1, 0] - [3, 1])))
vkazanov commented 2 months ago

Thanks for your comment!

What do you think this should parsed like? should it be parsed at all?

I haven't touched QuakeC in some time but as far as I remember there are 2 kinds of function definition/declaration formats:

  1. c style
void() QCStyleFun;
  1. quakec style

void CStyleFun ();

The example you provide doesn't look like the above. Should the parser fail in this case?

krizej commented 2 months ago

you swapped them around :-) qc-style is void() func. I think it should parse voidaaa as the type and then fail because of the missing function name, although I don't know much about tree-sitter grammars so I don't know how practical it is to allow for any name as the type

krizej commented 1 month ago

seems to be an issue with anything which requires a "fixed" string:

I guess adding a whitespace requirement where the comment is would fix it but it seems a bit hacky to me. Though again, I don't know what is considered good practice here

function_definition: $ => seq(
    field('result', $.simple_type),
    choice(
        // c-style  <- typo :-)
        seq(field('parameters', $.parameter_list), field('name', $.identifier)),
        // qc-style <- typo :-)
        seq(/* needs whitespace here */ field('name', $.identifier), field('parameters', $.parameter_list), ),
    ),
vkazanov commented 1 month ago

@krizej thanks a lot! I think I know what's broken there.

Btw, out of curiousity, what do you want to use it for? This was mostly developed for quakec-ts-mode , which is an Emacs mode supposed to replace quakec-mode.

krizej commented 1 month ago

trying to make a LSP for QuakeC

vkazanov commented 1 month ago

Just FYI: I am looking into this and trying to understand why the parser doesn't think there should be a whitespace between a simple_type and a name.

krizej commented 1 month ago

hi, i might have found something useful to solving this: https://tree-sitter.github.io/tree-sitter/creating-parsers#keywords

vkazanov commented 1 month ago

@krizej yes, found that as well. I'll add some more tests for this case and tweak the grammar later this week.

Or you can provide a PR :-)

krizej commented 1 month ago

i would, but i don't know what to commit :-) tree-sitter generate adds an additional bunch of files, updates package.json etc. i commited everything to my repo if you want to take a look at it, I only changed grammar.js

vkazanov commented 1 month ago

What I do for now is only commit root grammar.js and tests, unless it's a release when I update derived src/grammar.json and src/parser.c files.

Anyway, there's a partial fix (turning $.identifier into a word) in the repo.

Pretty sure there's more trouble here.

vkazanov commented 1 month ago

One problem I found:

#define TEST
#defineTEST 

Both are parsed, and the recent change didn't fix it.

krizej commented 1 month ago

yes, i found that issue too and fixed it by putting "define" into a token.immediate which is following a "#", just like in modelgen_pragma

vkazanov commented 1 month ago

Hm. Not sure if that's he way to go. I don't see anything like this in the standard C grammar (which this one is inspired by):

https://github.com/tree-sitter/tree-sitter-c/blob/master/grammar.js#L134

For one thing, this complicates highlighting.scm a bit.

krizej commented 1 month ago

oh, true, I forgot you can put spaces/tabs in between the # and define. I guess the simplest solution is borrowing (heh) the preprocessor function from the C grammar? In their highlights.scm they are still using '#define', so no complications there it seems