mike-lischke / antlr4ng

Next Generation TypeScript runtime for ANTLR4

Reusing grammars from antlr4 #44

Closed thekevinscott closed 3 months ago

thekevinscott commented 4 months ago

I'm new to ANTLR and trying to use antlr4ng in a TypeScript project. I've been able to successfully generate the Expression grammar from the README, as well as a Kotlin grammar (related to antlr4-c3).

I'm now trying to leverage the Python or JavaScript grammars defined in the antlr/grammars-v4 repo (e.g., JavaScript). However, it seems that making them work requires additional base components, like JavaScriptParserBase.

Are these base components compatible with antlr4ng, or do they need to be converted to work with antlr4ng? I'm seeing errors when I try, but I'm not sure if I'm doing something wrong, or if they're not intended to be compatible, or something else.

mike-lischke commented 4 months ago

They are compatible, but require some work, since a few of the public APIs have different names (e.g. inputStream instead of _input etc.). But that's just a matter of a few minutes to fix.
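
To illustrate the kind of rename mentioned above, here is a minimal sketch of a ported helper. It runs without the runtime installed: `TokenLike`, `TokenStreamLike`, `ParserStandIn`, and `ExampleParserBase` are hypothetical stand-ins, not antlr4ng classes. The only detail taken from this thread is the property name `inputStream` in place of `_input`; a real base class would extend the runtime's parser class and may need further changes.

```typescript
// Hypothetical stand-in types so the sketch runs without antlr4ng installed.
interface TokenLike {
    type: number;
    text: string;
}

interface TokenStreamLike {
    // 1-based lookahead, mirroring the usual LT(k) convention.
    LT(k: number): TokenLike | null;
}

// Stand-in for the runtime's parser class. Assumption from the thread:
// the token stream is reachable as `inputStream` rather than `_input`.
class ParserStandIn {
    public readonly inputStream: TokenStreamLike;

    public constructor(input: TokenStreamLike) {
        this.inputStream = input;
    }
}

// Hypothetical helper written in the style of a grammar base class.
class ExampleParserBase extends ParserStandIn {
    // Before the port, this line would have read `this._input.LT(1)`.
    public nextTokenTextIs(expected: string): boolean {
        const next = this.inputStream.LT(1);
        return next !== null && next.text === expected;
    }
}

// Toy token stream over a fixed list; it never advances.
const tokens: TokenLike[] = [
    { type: 1, text: "{" },
    { type: 2, text: "}" },
];
const stream: TokenStreamLike = {
    LT: (k: number) => tokens[k - 1] ?? null,
};

const parser = new ExampleParserBase(stream);
console.log(parser.nextTokenTextIs("{"));
```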

thekevinscott commented 4 months ago

Oh, awesome! Thank you @mike-lischke . Would it be helpful for me to set up an isolated repo? Let me know if I can help out in any way.

mike-lischke commented 3 months ago

That depends on how you want to proceed. If it's only for testing how things go, a separate repo might be useful. Otherwise just jump in and convert 🙂.

Regarding your offer: what are you aiming for? Creating support classes for these grammars?

kaby76 commented 3 months ago

@thekevinscott I would not fork the grammars-v4 repo for Antlr4ng until there is a clear divergence of the Antlr4ng meta grammar from Antlr4. (I.e., a grammar for JavaScript takes on a completely different syntax for the target.) Forking will create a maintenance issue.

trgen supports generating drivers for any target from grammars in grammars-v4. I have targets for CSharp, Cpp, Dart, Go, Java, JavaScript, PHP, Python3, and TypeScript. I even have an Antlr4cs target for the ancient Antlr4cs forked version of Antlr4. The only sharing per-se is the grammar. But, in reality, each "target" can use different tool chains.

But, more specifically, trgen already implements the "Antlr4ng" target. I haven't yet integrated it into the grammars-v4 GitHub Actions build, but I was always intending to do so because Antlr4ng is valuable.

However, I am not an "admin" for the repo. "I can lead the horse to water, but I cannot make him drink."

mike-lischke commented 3 months ago

Seems I misunderstood what @thekevinscott wants. To me it read like: "Should I convert my project in place or make a copy?". I wasn't aware the question is about creating a copy of the grammar repo. In this case I completely concur with @kaby76. There's no need to create a new repo for grammars. antlr4ng is just a runtime target like C++, Java, Python etc. They all work with the ANTLR4 grammars.

thekevinscott commented 3 months ago

Thanks for the discussion all, let me add a clarification.

I want to leverage the grammars in https://github.com/antlr/grammars-v4 with antlr4ng. It looks as if, to do so, I need to leverage helper files that are present in that repo, and afaik, these files are written by hand and not generated automatically. I'm talking specifically about these:

These specific implementations extend from antlr4, which doesn't appear to be directly compatible with antlr4ng.

If I want to use these grammars with antlr4ng, then these helper files need to be rewritten by hand to work with antlr4ng's conventions, correct? I assume it's not that hard to convert, given Mike's comment "a few of the public APIs have different names" above, but I'll need to familiarize myself with what those changes are to do so.

Is this all a valid understanding? If not, then I'm probably misunderstanding something about the ecosystem.

@kaby76 , I appreciate the link to trgen - I'll take a look!

mike-lischke commented 3 months ago

Ah ok. Right, these helper files are tailored towards a specific target. For antlr4ng you can reuse those written for the JS/TS antlr4 runtime, but you have to adjust them a bit (different imports and maybe a name change here and there). But nothing major to change.
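
As a rough sketch of that adjustment, the mechanical part could be scripted as a text rewrite over a helper file's source. The function name `portHelperSource` and the rename list are hypothetical; the list contains only the two differences named in this thread (the import source and `_input` becoming `inputStream`) and assumes the helper uses named imports from `"antlr4"`. The output would still need to be compiled against antlr4ng and reviewed by hand.

```typescript
// Rewrites taken from this thread only; extend the list as other
// differences surface while compiling against antlr4ng.
const renames: Array<[RegExp, string]> = [
    // Point imports at the antlr4ng package instead of antlr4.
    [/(from\s+["'])antlr4(["'])/g, "$1antlr4ng$2"],
    // The token/char stream is exposed as `inputStream` instead of `_input`.
    [/\bthis\._input\b/g, "this.inputStream"],
];

// Hypothetical helper: applies each rewrite in order to the source text.
function portHelperSource(source: string): string {
    return renames.reduce(
        (text, [pattern, replacement]) => text.replace(pattern, replacement),
        source,
    );
}

// Example input in the style of a hand-written base class.
const original = [
    'import { Parser } from "antlr4";',
    "const next = this._input.LT(1);",
].join("\n");

console.log(portHelperSource(original));
```

Keeping the rewrites in a plain list makes it easy to add further renames one at a time as the compiler reports them.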

thekevinscott commented 3 months ago

Gotcha. So then the next step for me to be able to leverage these would be to adjust these files to work with antlr4ng's imports / names.

I can take a stab at that. Should I post my inevitable questions here, or is there a better place / I should close this issue?

kaby76 commented 3 months ago

@thekevinscott

javascript/javascript does have a TypeScript port for Antlr 4.13.1, so you'd have to make a PR to the grammar to add an Antlr4ng/ subdirectory, then add the base class files there.

For python/python3, you should instead use python/python3_12_1. That is a new, well-maintained grammar. python/python3 will be removed to avoid confusion about which version "python/python3" is actually trying to support. However, python/python3_12_1 doesn't have an existing TypeScript port (there is no subdirectory TypeScript/), but python/python3/ does here and it does work. (Note, TypeScript testing is removed because there was a problem with the target with base class code. That was a rash decision on my part, and I'll add it back in.) You may be able to copy some of this code for an Antlr4ng port.

So, I would close this issue and reopen it under the issues for grammars-v4. You can then open some draft PRs to add Antlr4ng ports and see where it goes.

thekevinscott commented 3 months ago

Excellent, thanks both for your support. I'll do that.