vickenty / lang-c

Lightweight C parser for Rust
Apache License 2.0

Mixed type specifiers #13

Closed tathanhdinh closed 4 years ago

tathanhdinh commented 4 years ago

Hello all,

Currently lang-c accepts the following C code:

struct S {}
int z;

even though this code is not valid (the semicolon after the struct declaration is missing). I believe this is because the grammar accepts the declaration specifiers as a list:

https://github.com/vickenty/lang-c/blob/a76a36c4bc282b04e7a0de89754b798cb46b158f/grammar.rustpeg#L447-L453

As a result, both struct S {} and int end up in that list as type specifiers, and the invalid code is parsed as if it were:

struct S {} int z;

Similarly, the invalid declaration char int z; is also accepted.
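To illustrate the suspected cause, here is a minimal sketch of a specifier rule that greedily collects a list. It is not the lang-c grammar (which is written in rustpeg); the function `collect_specifiers`, the `TYPE_KEYWORDS` table, and the pre-split token slice are all hypothetical, and struct bodies are skipped without handling nesting.

```rust
/// Keywords treated as simple type specifiers in this sketch (incomplete list).
const TYPE_KEYWORDS: [&str; 9] = [
    "void", "char", "short", "int", "long", "float", "double", "signed", "unsigned",
];

/// Greedily collects type specifiers from the front of `tokens`.
/// Returns the specifiers found and the index of the first token
/// that is not part of a specifier.
fn collect_specifiers(tokens: &[&str]) -> (Vec<String>, usize) {
    let mut specs = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        if tokens[i] == "struct" {
            let start = i;
            i += 1;
            // Optional tag name.
            if i < tokens.len() && tokens[i] != "{" {
                i += 1;
            }
            // Optional body: skip to the closing brace (nesting is ignored).
            if i < tokens.len() && tokens[i] == "{" {
                while i < tokens.len() && tokens[i] != "}" {
                    i += 1;
                }
                if i < tokens.len() {
                    i += 1; // consume "}"
                }
            }
            specs.push(tokens[start..i].join(" "));
        } else if TYPE_KEYWORDS.iter().any(|k| *k == tokens[i]) {
            specs.push(tokens[i].to_string());
            i += 1;
        } else {
            // First token that is not a specifier: the declarator starts here.
            break;
        }
    }
    (specs, i)
}
```

Because nothing forces a semicolon after the struct specifier, the tokens of struct S {} followed by int z; yield two specifiers, and the declarator z is reached only afterwards. The same loop accepts char int z; for the same reason.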

Many thanks for any feedback.

vickenty commented 4 years ago

Thank you for opening an issue.

This is the intended behavior. The language syntax as described in the C standard does not prohibit such combinations. Instead, the list of valid specifier combinations is given as a separate table later. Thus I thought it appropriate not to implement this in the parser.

That said, it might be possible to construct a parser that accepts only valid combinations, but I think you would find that the number of grammar rules required is too large. This is probably better left to the semantic analysis stage.

tathanhdinh commented 4 years ago

Thank you for the feedback.

Indeed, the grammar does not prohibit this declaration. As you've said, semantic analysis is the correct stage to check this.
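For reference, a minimal sketch of what such a semantic check could look like, run on the specifier list after parsing. The `TypeSpec` enum and `is_valid_combination` are hypothetical names for illustration and do not correspond to the lang-c AST. The rules follow the list of permitted specifier multisets in the C standard's section on type specifiers, with atomic type specifiers left out and struct/union, enum, and typedef names reduced to payload-free variants.

```rust
/// Simplified type specifiers (hypothetical, not the lang-c AST).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TypeSpec {
    Void, Char, Short, Int, Long, Float, Double,
    Signed, Unsigned, Bool, Complex,
    Record,      // struct or union specifier
    Enum,        // enum specifier
    TypedefName, // a typedef name
}

/// Returns true if `specs`, taken as a multiset (order is irrelevant),
/// is one of the combinations the C standard permits.
fn is_valid_combination(specs: &[TypeSpec]) -> bool {
    use TypeSpec::*;
    let count = |s: TypeSpec| specs.iter().filter(|&&x| x == s).count();
    let n = specs.len();
    if n == 0 {
        return false; // at least one type specifier is required
    }

    // These specifiers must appear alone.
    let standalone = [Void, Bool, Record, Enum, TypedefName];
    if standalone.iter().any(|s| specs.contains(s)) {
        return n == 1;
    }

    let sign = count(Signed) + count(Unsigned);
    let (short, int, long) = (count(Short), count(Int), count(Long));

    // char, signed char, unsigned char
    if count(Char) > 0 {
        return count(Char) == 1 && sign <= 1 && n == 1 + sign;
    }

    // float, double, long double, each optionally with _Complex
    let (float, double, complex) = (count(Float), count(Double), count(Complex));
    if float + double + complex > 0 {
        return float + double == 1
            && complex <= 1
            && long <= double // `long` may only accompany `double`
            && n == 1 + long + complex;
    }

    // Only short, int, long, signed, unsigned remain at this point.
    sign <= 1 && int <= 1 && short <= 1 && long <= 2 && !(short > 0 && long > 0)
}
```

With this check, the specifier lists from the two examples in this issue, a struct specifier followed by int, and char followed by int, are both rejected, while combinations such as unsigned long long int or long double _Complex pass.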