eliben / pycparser

:snake: Complete C99 parser in pure Python

Add end of token coord #527

Open 99991 opened 7 months ago

99991 commented 7 months ago

It would be nice to know where a node ends. The start of a node is already available as node.coord; it would be nice to have a matching node.endcoord as well. I want this feature because it would enable precisely underlined error messages like the ones GCC produces:

main.c: In function ‘main’:
main.c:2:13: warning: initialization of ‘int’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]
    2 |     int x = "asdf";
      |             ^~~~~~
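To illustrate why an end position matters, here is a minimal sketch of how such an underline could be rendered once both a start and an end column are known. The function name `underline` and the 1-based, inclusive `end_col` are assumptions made for this example; pycparser does not currently provide an end column.

```python
def underline(source_line, start_col, end_col):
    """Build a GCC-style marker line for 1-based, inclusive columns.

    start_col corresponds to the column stored in node.coord;
    end_col is hypothetical (what a node.endcoord could provide).
    """
    if not (1 <= start_col <= end_col):
        raise ValueError("expected 1 <= start_col <= end_col")
    # Copy tabs into the padding so the marker lines up with the source line.
    prefix = source_line[:start_col - 1]
    padding = "".join(ch if ch == "\t" else " " for ch in prefix)
    # A caret under the first character, tildes under the rest.
    return padding + "^" + "~" * (end_col - start_col)


line = '    int x = "asdf";'
print(line)
print(underline(line, 13, 18))
```

Without the end column, only the single caret at column 13 could be drawn.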

Here is where the coord parameter is retrieved from yacc:

https://github.com/eliben/pycparser/blob/f7409953060f1f4d0f8988f1e131a49f84c95eba/pycparser/plyparser.py#L55

yacc defines e.g. lineno here:

https://github.com/eliben/pycparser/blob/f7409953060f1f4d0f8988f1e131a49f84c95eba/pycparser/ply/yacc.py#L258

I am not sure whether the end position can be obtained from yacc. If it cannot, here is a hacky idea: find the end position by scanning backwards from the start of the next token until a non-whitespace character is found.
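The backward scan could look roughly like the sketch below. It works on the raw source string and a character offset, independently of pycparser; the helper names `end_offset_before` and `offset_to_line_col` are made up for this example. It assumes that only whitespace separates tokens (no comments left in the input), which should hold for preprocessed code.

```python
def end_offset_before(source, next_token_offset):
    """Return the offset of the last non-whitespace character that
    precedes next_token_offset, or -1 if there is none."""
    i = next_token_offset - 1
    while i >= 0 and source[i].isspace():
        i -= 1
    return i


def offset_to_line_col(source, offset):
    """Convert a 0-based character offset to a 1-based (line, column) pair."""
    line = source.count("\n", 0, offset) + 1
    line_start = source.rfind("\n", 0, offset) + 1  # 0 if on the first line
    return line, offset - line_start + 1


source = 'int main() {\n    int x = "asdf"  ;\n}\n'
next_token = source.index(";")           # start of the token after the string
end = end_offset_before(source, next_token)
print(offset_to_line_col(source, end))   # position of the closing quote
```

One open question with this approach is how to find the next token's offset for the last node in a file; the scan would then have to start from the end of the input instead.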

I do not have the time to implement this right now, but I thought I should start a discussion here in case someone has a better idea.