Closed ericvergnaud closed 9 months ago
@KvanTTT would you be comfortable looking into this ?
Yes, I'd like to do that at the next weekend or later. But probably I need maintainer role to make possible to do commits into the repository (I'm not sure that GitHub correctly supports non-standard PR that contains merge from unrelated histories and I'd like to keep both histories).
@KvanTTT I've given you 'write' access, let me know if any issue
Please merge to 'dev' branch
Please merge to 'dev' branch
Ok.
I've investigated Strumenta's kotlin runtime a bit and concluded, that it's unlikely I will integrate the full repository with its history because of the following points:
expect/actual
declarations (MPP) functionality that is supposed to work with Gradle build system. Currently ANTLR relies on Maven. Also, supporting such functionalify needs additional configuration effort.So, I think the best way is about using J2K converter on the current codebase with copying some files from Strumenta's repository (mostly files related to code-generations, Kotlin.stg
and related). Also, we can do it file-by-file without breaking build and tests.
Addressing the points.
This repo is completely uncovered by ANTLR runtime unit-tests
We can add all those tests, by replicating the infrastructure.
The same kind of work was done by me for integration tests of some grammars (under the antlr-kotlin-tests
module), that is, I've replicated the antlr4test-maven-plugin
strategy, just using Gradle and the KMP kotlin.Test
functionalities.
It obviously is a task that will take time, and I'm not getting paid to do it, so I can definitely look into it - but I can't dedicate hours everyday.
that are not supposed to be in codebase
They had to be placed somewhere as we have no access to the ANTLR4 repo testing infra.
They're not shipped obviously, but they're nice to have to verify the generated code compiles correctly.
We've found two compiler (K1) bugs doing this.
It uses expect/actual declarations (MPP) functionality that is supposed to work with Gradle build system
That may be interpreted as a blocking issue, but my personal suggestion is to replace the Maven infra with Gradle, or try to call Gradle through Maven. Again it's not something that takes a day to solve, but trying to swipe away the Gradle foundation for Multiplatform seems like a major step back.
So, I think the best way is about using J2K converter on the current codebase with copying some files from Strumenta's repository
This is exactly what @ftomassetti did in the beginning (was it in 2017?).
In subsequent PRs starting December 2023 I've cleaned up the source code, but it took weeks and it's still not completely done. Doing the exact same thing again seems like wasting a lot of time.
It obviously is a task that will take time, and I'm not getting paid to do it, so I can definitely look into it - but I can't dedicate hours everyday.
Thanks for that. BTW, I have the same situation.
They had to be placed somewhere as we have no access to the ANTLR4 repo testing infra.
We can set up testing on the whole grammars-v4 repository instead of testing on several grammars. But we can do it later.
We've found two compiler (K1) bugs doing this.
Yes, it's possible. I can't say if we fixed them in K2 because, unfortunately, some behavior remains "buggy" in favor of back-compatibility. But I can take look at them if you share.
That may be interpreted as a blocking issue, but my personal suggestion is to replace the Maven infra with Gradle, or try to call Gradle through Maven.
We definitely can do it but later. But I prefer getting rid of incorporating several big changes into one big step. Instead, I suggest porting Kotlin runtime at first and making sure that all tests pass. Later we can switch to Gradle if it's needed.
This is exactly what @ftomassetti did in the beginning (was it in 2017?). In subsequent PRs starting December 2023 I've cleaned up the source code, but it took weeks and it's still not completely done. Doing the exact same thing again seems like wasting a lot of time.
Makes sense. But I suggest using the official repo codebase as a foundation, but not vise verse.
Thank you @KvanTTT for investigating this. Is great to have someone with your expertise to look into this
I've investigated Strumenta's kotlin runtime a bit and concluded, that it's unlikely I will integrate the full repository with its history because of the following points:
- This repo is completely uncovered by ANTLR runtime unit-tests (descriptors). It's unclear how many tests are really broken if this repository is ported.
Would it be difficult to check if there are broken tests and how many? How much effort do you think it would take, not to fix the tests but just to check how bad the situation would be?
From the other side, it contains kind of integrational tests on big grammars that are not supposed to be in codebase.
I think removing them after merging/importing/copying would be easy and I can do that
- It uses
expect/actual
declarations (MPP) functionality that is supposed to work with Gradle build system. Currently ANTLR relies on Maven. Also, supporting such functionalify needs additional configuration effort.
Now, that is the part that concern me the most. My understanding is that expect and actual are the only mechanism you can use to write platform specific code and accomodate for the very few specificities of the different platforms. In the whole ANTLR Kotlin project there are very few classes that use that mechanism. Exclusively the ones where there were platform specifities we needed to account for. For example, the Console works differently on Java and JS
I only used Kotlin with Gradle so far, so I did not know that such feature was connected to the build system. If I understand correctly you are saying that a) expect/actual work only when compiling with gradle and not when compiling with maven b) ANTLR relies on Maven so we should make compilation work with Maven. If my understanding is correct, then this means you will need to write code that can be compiled on all platforms with no possibilities to deal with the very limited but present quirks of the specific platforms. But I am not sure if that is possible. How would you handle that?
- IntelliJ's Java to Kotlin converter works quite good. It makes some mistakes which need further correction but not so many.
So, I think the best way is about using J2K converter on the current codebase with copying some files from Strumenta's repository (mostly files related to code-generations,
Kotlin.stg
and related). Also, we can do it file-by-file without breaking build and tests.
Yes, the J2K converted works great, otherwise the initial version of ANTLR Kotlin would have taken years to be developed :D It still requires some manual fixes and refactor all the usages of the JVM specific classes to support the multiple platforms. On this point I would agree with @lppedd that you may end up redoing some of the work already done (and verified over the years). @lppedd put quite some effort in polishing up the code (besides other contributions, like updating to the latest ANTLR version).
But I understand that a gradual approach may be more convenient. Perhaps you can copy the files from the ANTLR Kotlin project as needed, doing that incrementally and keeping the build running, at least until you compile only to the JVM.
It obviously is a task that will take time, and I'm not getting paid to do it, so I can definitely look into it - but I can't dedicate hours everyday.
Thanks for that. BTW, I have the same situation.
I think we are on the same boat on this, for this reason I think that whatever approach save us the most work is the only sustainable. You obviously know the ANTLR codebase and the Kotlin ecosystem better of anyone so you are in the best position to assess which approach will get us moving while requiring an effort we can sustain
Thanks for this, very interesting details. I guess my expectation was that we would simply bring in the tool target, the runtime itself, and fill the gap for the antlr test framework to ensure that the runtime behaves the same as the reference java target. The grammars-v4 repo is an extremely useful source of grammars, but it's not as accurate as the antlr test suite, and does not actually test anything, so I would not recommend going that route. The Kotlin-specific tests from the Kotlin repo are an important topic too, but if the Kotlin runtime doesn't even pass the antlr test, then maybe there isn't much point investing time now in having them run with Maven ? Since we'll need to switch to Gradle in order to generate WebAssembly, maybe that can be dealt with in a 2nd phase ? Personally, I'd rather lose the Git history and the Kotlin-specific tests than redo the J2K migration, but maybe I'm missing something ?
I think all depends on expectations, and on Kotlin's integration with Maven.
Multiplatform is Gradle only and I don't think that will change anytime soon being how complex the Kotlin Gradle Plugin is.
So my questions are:
Do you want to generate WebAssembly exclusively?
If the answer is yes, then probably you'll manage to do that with Maven, using the plugin's js
goal and compiler options such as -Xwasm
and -Xwasm-target=wasm-js/wasm-wasi
.
On a personal note, ditching the Multiplatform capabilities seems like the wrong direction, especially if it's taken only to avoid integrating Gradle.
I'll also add that the IDE usability should be taken into account. Will it work correctly with that Maven setup?
Are you open to restructure the (mono)repository?
Being that this repo is a fork of antlr4, there might be a lot of stuff that antlr5 doesn't actually need.
For my little experience what happens with large repositories is once they're actively maintained they just don't scale back: be it because there is a risk of breakage, or just because the once-probably-unnecessary components are in use.
So while it's always cool to jump into coding and getting a tangible result out, I'd start by identifying what can be removed, and remove as much as possible. At that point one can assess what each piece does, and if it's easy or not to switch to Gradle as the foundational build tool.
But I suggest using the official repo codebase as a foundation
To answer this point, the upgrade to ANTLR 4.13.1 was done manually. I did match, file by file and line by line, the Java runtime, with the obvious syntactical and type nullability differences, which were also taken into account by looking at the Swift runtime from time to time.
So to conclude, I'd check the repository design first, and then once there is an answer to what can or cannot be done (be it because it's not doable at all, or it's too costly in terms of time), I'd start integrating the Kotlin side by merging and deleting what is found to be redundant/unnecessary.
At that point Strumenta's repo might already have all the required runtime tests in place.
To expand on the questions, I'd add another couple just to better understand the vision.
there is an opportunity to have just 1 runtime, that will run faster with language hosts such as JavaScript or Python. ANTLR 5 is primarily about that: switching to WebAssembly
That implies languages that will want to run the built WASM artifact will have a standard or third party runtime, right? And also that Kotlin will be able to correctly export the API for those runtimes.
and the 1st version of ANTLR5 will only support TypeScript
What does this mean in relation to Kotlin?
It means we're using Kotlin to generate WebAssembly. As of writing, GraalVM doesn't yet support WasmGC so we won't be able to run the generated runtime in Kotlin/JVM, rather the only available runtime will be TypeScript, to be run on platforms that support WasmGC (Chrome, Deno, Node coming soon). We will expand gradually to more host languages as their support for WasmGC becomes available.
Thanks! That clarifies it.
So it's not really about TypeScript per-se, but more about the platforms that run TypeScript (and JavaScript).
I think that while the ecosystem is progressing towards WASM, having our Kotlin codebase compile to other targets would be beneficial. We don't really know how fast the other hosts will get support for WASM, and it's also a big bet in terms of adoption. WASM is the new cool kid in town, and it's going to be like that for quite some time.
If the codebase can grow without having a specific target in mind, it allows maintainers to pursue other strategies in the future, if required.
Sure, but we already have antlr4 for that, we really want antlr5 to focus on wasm.
Reasonable.
There is another point I'd like to clarify: currently WASM debugging is not supported in IntelliJ/Fleet, so how are you going to do it? And, will debug work correctly with Maven, in general?
Would it be difficult to check if there are broken tests and how many? How much effort do you think it would take, not to fix the tests but just to check how bad the situation would be?
I don't know how to check it without full integration of runtime.
If my understanding is correct, then this means you will need to write code that can be compiled on all platforms with no possibilities to deal with the very limited but present quirks of the specific platforms. But I am not sure if that is possible. How would you handle that?
I see target-specific code in your repository, fortunately, is not so big. I was thinking about porting runtime without expect/actual
(single module) and integrating them later.
Also, since Kotlin K2, expect
and actual
declarations should be declared in different modules.
Sorry, probably I was a little hasty with conclusions. Currently I'm working on repositories merging and I'm trying to reach compilation and tests running.
I'll demonstate the result later, stay tuned.
Reasonable.
There is another point I'd like to clarify: currently WASM debugging is not supported in IntelliJ/Fleet, so how are you going to do it? And, will debug work correctly with Maven, in general?
WebAssembly debugging is supported in Chrome, and in Node and Deno (using WebStorm), provided that the compiler generates map files. Is that not the case?
Also, since Kotlin K2, expect and actual declarations should be declared in different modules.
Is there a YT issue for this? Seems like a "devastating" breaking change to me.
WebAssembly debugging is supported in Chrome, and in Node and Deno
Ok this sounds good enough, although I was mostly referring to in-IDE debugging capatibilites to diagnose issues or debug test cases without switching context.
In my multiplatform applications/libraries I tend to debug using the faster targets, which is most often the JVM, and sometimes JS, since I consider a guarantee from the Kotlin language that common definitions will have the same behavior on all platforms.
Also, I'm not sure WebStorm supports the Kotlin plugin, since you mentioned it. Last I heard it wasn't planned from JB.
So you'd have an half-working environment.
@lppedd Let me explain how this works:
Runtime production and parser generation:
Parser execution:
Since we generate library-like code, there is no scenario where we need the ability to debug using the Kotlin plugin, apart maybe from unit testing. Users won't be writing Kotlin code, so they don't need the plugin.
The only critical requirement is to have generated map files for the Kotlin code that we have written or that the tool has generated.
(that said, it would be great if IntelliJ had the ability to debug WebAssembly running in Node or Deno)
Thanks!
apart maybe from unit testing
Yeah this is probably the part I find myself using a lot. I step through the code often while developing it, but if you think it's not going to be a common occurence, then it's something we don't have to think about (not now at least).
There are two issues to follow.
https://youtrack.jetbrains.com/issue/KTIJ-28387
https://youtrack.jetbrains.com/issue/KTIJ-28388
Is there a YT issue for this? Seems like a "devastating" breaking change to me.
To me it's quite logic change since expect/actual
concern is releavant only to multi-modules environment, see https://youtrack.jetbrains.com/issue/KT-55177
Ahhhh I recall seeing it.
But it shouldn't impact the current antlr-kotlin repository, as actualizations are one per compilation. I've compiled with K2 without issues yesterday.
Please take a look at my request with the current status: https://github.com/antlr/antlr5/pull/6 Also, you can try run it locally (configured kotlinc
is required for tests).
I hope you guys don't mind me asking here: For someone who's interested in generating Kotlin code from ANTLR grammar in the short term, would the recommendation be to use https://github.com/Strumenta/antlr-kotlin currently, or wait until https://github.com/antlr/antlr5/pull/6 is merged? Probably related question: What's the roadmap for ANTLR 5, I could not find any information about it. And is the release of the K2 compiler / Kotlin 2.0 somehow a requirement for Kotlin code generation?
I hope you guys don't mind me asking here:
Sure, don't hesitate.
For someone who's interested in generating Kotlin code from ANTLR grammar in the short term, would the recommendation be to use https://github.com/Strumenta/antlr-kotlin currently, or wait until https://github.com/antlr/antlr5/pull/6 is merged?
I think it's possible to make this (ANTLR 5) runtime production-ready in short term, but currently it has only JVM-target. Other targets (Native, WA) will be implemented later because it requires writing some expect/actual
functions and probably migrating to Gradle build system.
What's the roadmap for ANTLR 5, I could find any information about it.
The current version doesn't have any breaking change, but later they probably will be introduced since there are a lot of things to improve. Generally I like the idea of using this runtime ASAP. So, maybe it makes sense to release an intermediate Kotlin-runtime version without breaking changes ASAP in a separated branch and continue working on futher improvements in the main branch.
I could find any information about it. And is the release of the K2 compiler / Kotlin 2.0 somehow a requirement for Kotlin code generation?
Now it requires K2 only for code-generation. But it's possible to use generated code with a lower version (K1).
Let me add my 2p. Although we definitely want to have a working antlr5 asap, there is still a lot to do before we can release a version that actually adds something to antlr4 or to the Strumenta fork. And we don't yet know if that version will be WASM only or also JVM... You are more than welcome to play with antlr5 if you're comfortable with frequent breaking changes. If OTOH you are looking for a production version on the short term, you're much better going with the Strumenta fork.
Thanks for the comments. I'm not necessarily looking for a production ready solution as I'm considering to only do a one-time conversion of the ANTLR grammar to Kotlin code as the grammar barely changes. So I'm mostly interested in the best generated code I can currently get for Kotlin, and I'm totally fine to massage that code manually after the conversion.
Just to avoid any confusion for people looking at this for the first time: I would not call ANTLR Kotlin a fork but simply a target for ANTLR4. It does not change or replace anything in the original project and it instead relies on ANTLR4. Regarding the maturity of the ANTLR Kotlin target, I personally do not use it in production but several companies (behind several contributors) do, so it has some level of stability/maturity
I'm considering to only do a one-time conversion of the ANTLR grammar to Kotlin code
Which brings me to a related quesrtion: Is there currently a stand-alone tool available for this, ideally even a hosted web tool that doesn't require any installation? I.e. I paste ANTLR grammar into a form, choose the target language, press a button, and get the generate code for download in a ZIP or so?
@sschuberth you need a runtime tho. So there is not much point in doing half the work through an online tool, if you have to set up the runtime dependency anyway.
you need a runtime tho.
Which runtime? ANTLR runtime, or Kotlin runtime? If the former, then I probably was under the wrong assumption that the generated code would be self-contained.
The point is not really the generated code, but the ANTLR runtime capabilities.
The antlr-kotlin repository hosts a true multiplatform project, which is what a Kotlin consumer will want to depend on for his own multiplatform project.
you need a runtime tho.
Which runtime? ANTLR runtime, or Kotlin runtime? If the former, then I probably was under the wrong assumption that the generated code would be self-contained.
The generated code is not self-contained. It relies on a target-specific runtime.
I'm considering to only do a one-time conversion of the ANTLR grammar to Kotlin code
Which brings me to a related quesrtion: Is there currently a stand-alone tool available for this, ideally even a hosted web tool that doesn't require any installation? I.e. I paste ANTLR grammar into a form, choose the target language, press a button, and get the generate code for download in a ZIP or so?
antlr5 is precisely about this i.e. web first, bit it's very early days....
Which brings me to a related quesrtion: Is there currently a stand-alone tool available for this, ideally even a hosted web tool that doesn't require any installation? I.e. I paste ANTLR grammar into a form, choose the target language, press a button, and get the generate code for download in a ZIP or so?
There isn't a requirement specification document for Antlr5. So, in lieu of that, and to not loose the issue in this closed thread, you should probably add an issue somewhere (maybe antlr5-lab, or antlr5, or antlr5-specs) that describes your specific use case. It should state that you want to input a grammar via a website, click a "generate" button, and downloading a zip of the generated recognizer.
We will be using Kotlin to generate the WebAssembly runtime. There is a need to integrate it from Strumenta's repo. There is also a need to ensure it supports the same level of testing as the existing Java runtime. (we don't want to embark on WebAssembly without checking that the runtime is 100% compatible)