antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

[Ada] Add Ada 2022 grammar #4197

Open ethindp opened 2 months ago

ethindp commented 2 months ago

Has anyone considered adding a grammar for Ada 2022? I'm not sure exactly what all the differences are (there are some new keywords and definitely some new syntactic rules). I found this grammar, but it doesn't seem to lex/parse properly even with reasonably valid Ada programs (which is a bit of a problem).

kaby76 commented 2 months ago

Assuming we write a new grammar, do you have a test suite of some sources that we can add to make sure the new grammar will work?

ethindp commented 2 months ago

@kaby76, of course! All Ada 2012 code is valid Ada 2022 code, as is Ada 2005 and Ada 95 code, and the Ada Reference Manual has examples for pretty much all syntactic constructs, if I'm not mistaken. So even the old test cases will work.

ethindp commented 2 months ago

@kaby76 Do you want me to send in a PR of test cases? (I've got a massive folder of files of test code, some of which isn't meant to be standalone, and a lot of which is, from RosettaCode, and then there's the ACATS as well.) The ultimate goal would just be to provide a parse tree, not do semantic analysis or anything like that.

kaby76 commented 2 months ago

> @kaby76 Do you want me to send in a PR of test cases?

If you can, post here as a .zip (or a .tar.gz if GH will accept it), or create a PR for a new Ada 2022 grammar with the tests.

ethindp commented 2 months ago

@kaby76 Here you go: tests.tar.gz

This archive is divided into two directories. In alphabetical order:

Looking at the tests FAQ, it doesn't appear at a glance that trgen allows for specifying what is expected to pass and what is expected to fail or things like that. The simplest, I think, is to start with the ACATS, classes A, C and D, and make sure that they at least parse. Feel free to pick and choose what tests you'd like to keep and which ones you don't want. So, I think a good desc.xml might be something like:

<?xml version="1.0" encoding="UTF-8" ?>
<desc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../_scripts/desc.xsd">
   <targets>Antlr4ng;CSharp;Cpp;Dart;Go;Java;JavaScript;PHP;Python3;TypeScript</targets>
   <inputs>acats/A/*.*</inputs>
   <inputs>acats/B*/*.*</inputs>
   <inputs>acats/C*/*.*</inputs>
   <inputs>acats/D/*.*</inputs>
</desc>

(I don't know if you can specify <inputs> multiple times, so...) I hope this isn't too much! If I can help in any way, let me know! :)

Edit: It's worth noting that some of the tests in class B (if we include those) might pass. I'm not certain which ones would only fail in semantic analysis and which ones actually have syntactic errors. I know this complicates things :-( Like I said though, I'm happy to help in any way.
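To make the class split concrete, here is a rough Python sketch for sorting the ACATS files by class before deciding what goes into desc.xml. It is not part of trgen or of the grammars-v4 scripts. It assumes the archive was unpacked into a local `acats` directory and that each file name starts with its class letter (as ACATS file names do). The names `group_by_class`, `expectation` and `EXPECTED_TO_PARSE` are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: classes A, C and D contain only legal Ada, so every file in
# them should at least parse. Class B mixes syntactic and semantic errors,
# so no expectation is recorded for it here.
EXPECTED_TO_PARSE = {"A", "C", "D"}


def group_by_class(root):
    """Map each class letter to the sorted list of files found under root."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # The first character of the file name is taken as the class letter.
            groups[path.name[0].upper()].append(path)
    return dict(groups)


def expectation(class_letter):
    """Return 'parse' for classes assumed legal, 'unknown' for the rest."""
    return "parse" if class_letter in EXPECTED_TO_PARSE else "unknown"


if __name__ == "__main__":
    # "acats" is a hypothetical local path to the unpacked archive.
    for letter, files in sorted(group_by_class("acats").items()):
        print(f"class {letter}: {len(files)} files, expected: {expectation(letter)}")
```

Running it against the unpacked archive prints one line per class, which should be enough to decide which directories to list under `<inputs>` first. The class B files would still need to be checked by hand (or against a parser) to separate syntax errors from semantic ones.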

kaby76 commented 2 months ago

Thanks so much!