kaby76 / scrape-c-plus-plus-spec

A program to scrape the grammar from Annex A of one of the ISO 14882 specs
MIT License

opaque_enum_declaration doesn't look right #3

Open DonMathi opened 3 years ago

DonMathi commented 3 years ago

Hi Kaby76

I'm looking into some details of your g2/Cpp14Parser.g4 file, and it seems to me that the rule opaque_enum_declaration : enum_key attribute_specifier_seq ? | Identifier enum_base ? | Semi ; doesn't look right. To me there are too many | operators in it. The PDF looks like this:

opaque-enum-declaration:
    enum-key attribute-specifier-seq_opt identifier enum-base_opt ;

What am I missing?

kaby76 commented 3 years ago

You are correct. The rule shouldn't have the alt-operator, and there are other errors. I am still refining the scraper and the refactorings for the C++14 grammar. The new grammar is in https://github.com/kaby76/scrape-c-plus-plus-spec/blob/main/CPlusPlus14Parser.g4 but it is not compiling yet. The original scraped grammars are in https://github.com/kaby76/scrape-c-plus-plus-spec/tree/main/scraper and those should match exactly what you see in the specs. I have checked those multiple times, so I think they are correct, but they are not functioning grammars. The script https://github.com/kaby76/scrape-c-plus-plus-spec/blob/main/trash.sh takes the c++14.g4 grammar and tries to produce a working grammar, but it is not finished.
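For reference, a direct transcription of the spec production quoted above into ANTLR syntax would presumably read as follows. This is only a sketch that reuses the rule and token names from the rule quoted earlier in this thread (enum_key, attribute_specifier_seq, Identifier, enum_base, Semi); it is not output from the scraper or from trash.sh.

```antlr
// Sketch: one alternative with two optional elements, no '|' operators.
opaque_enum_declaration
    : enum_key attribute_specifier_seq? Identifier enum_base? Semi
    ;
```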

DonMathi commented 3 years ago

I am also looking into your CPlusPlus14Parser.g4 file, and there are some problems with the combination of attribute_specifier_seq ? and the rule attribute_specifier_seq : attribute_specifier* ; With both in place, you optionally (?) match something that can already be the empty string. But if you change the `*` to a `+`, then ANTLR doesn't complain. There are a few more places with the same issue.
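A minimal sketch of that change, using only the rule names mentioned in this comment. The `+` form also seems to agree with the spec, where attribute-specifier-seq is defined as one or more attribute-specifier, so the optionality stays at the call sites.

```antlr
// Before (can match the empty string, so callers' '?' is redundant):
// attribute_specifier_seq : attribute_specifier* ;

// After: the rule needs at least one attribute_specifier,
// and callers keep writing  attribute_specifier_seq?
attribute_specifier_seq
    : attribute_specifier+
    ;
```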

DonMathi commented 3 years ago

The last thing that ANTLR is complaining about is the use of a fragment in the parser grammar. I'm not sure how to fix that:

preprocessing_token : FHeader_name | Identifier | pp_number | Character_literal | User_defined_character_literal | String_literal | User_defined_string_literal | preprocessing_op_or_punc | ~Newline ;

fragment FHeader_name : '<' FH_char_sequence '>' | '"' FQ_char_sequence '"' ;
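One possible way around this, as a sketch only: ANTLR fragment rules can be referenced from other lexer rules but not from parser rules, so the header name would need to be an ordinary token. The name Header_name below is hypothetical, and the sketch assumes FH_char_sequence and FQ_char_sequence remain fragments in the lexer grammar.

```antlr
// Lexer grammar: ordinary token rule (hypothetical name Header_name),
// built from the existing fragments.
Header_name
    : '<' FH_char_sequence '>'
    | '"' FQ_char_sequence '"'
    ;

// Parser grammar: reference the token instead of the fragment.
preprocessing_token
    : Header_name
    | Identifier
    | pp_number
    | Character_literal
    | User_defined_character_literal
    | String_literal
    | User_defined_string_literal
    | preprocessing_op_or_punc
    | ~Newline
    ;
```

Note that such a token would compete with String_literal and with the `<` and `>` operators in the lexer. In the spec a header-name is only recognized in specific contexts such as an #include directive, so a lexer mode or predicate would probably be needed as well.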

kaby76 commented 3 years ago

Thank you for the info. I'll update trkleene to do the right thing for attribute_specifier_seq.

The plan is to continue to make changes to the Trash tool set. However, the main focus is still just the C++ Spec scraper, which reads one of the several dozen C++ Spec PDFs and outputs a non-functioning, but syntactically acceptable, ANTLR grammar as it appears in Annex A. I purchased the three official ISO specs a couple of weeks ago, and they are very different from the drafts and from each other. The requirement for the scraper is to produce a grammar in ANTLR syntax that is identical to that of a Spec. Trash is then used to turn that grammar into a functioning, optimized ANTLR grammar.

The grammars output by the Trash refactoring script do not compile yet and are mainly produced to find errors in scraping the Spec.

I have visually checked c++14.g4 against the Spec several times, but I will need to do that a couple more times while adding another layer of code to the scraper that self-corrects small differences in the Spec. I haven't checked by eye whether c++17.g4 and c++20.g4 are correct.

I plan to add a large (~25k files) test suite of Clang, GNU, and Windows C++ source code to test the parser.

So, I am still very far away from completion.

DonMathi commented 3 years ago

Great work you are doing

kaby76 commented 3 years ago

Thank you. Much appreciated! And, thanks for looking at this even if it's unstable still. I'll let you know here when things are a little more stable with the grammar. --Ken

On 11/22/2021 2:43 AM, Kjeld Mathias Petersen wrote:

Great work you are doing
