antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License
17.21k stars 3.29k forks source link

Can't import from TypeScript runtime with `"moduleResolution": "nodenext"` #4218

Open jedwards1211 opened 1 year ago

jedwards1211 commented 1 year ago

I'm honestly not sure what needs to change in the package.json because I thought "type": "module", "main" and "types" would be enough, but apparently not. The ESM situation is a mess. When i turn on "moduleResolution": "nodenext", I get errors like

test.ts:1:22 - error TS2305: Module '"antlr4"' has no exported member 'CommonTokenStream'.

1 import { CharStream, CommonTokenStream }  from 'antlr4';

It works with "moduleResolution": "node", but according to TS docs that's really designed for CJS and doesn't support packages with "export" maps, so I wouldn't be able to use antlr4 and packages with export maps in the same project.

iSuslov commented 1 year ago

Probably related https://github.com/antlr/grammars-v4/issues/3314

ericvergnaud commented 1 year ago

Hi, agreed that js/ts packaging world is a mess.

tsconfig.json: module = esnext moduleResolution = node

package.json type = module

this builds an esm library that is then successfully used by another esm ts module

jedwards1211 commented 1 year ago

I filed an issue with TS asking how a package should be configured so that it can be consumed by a project using nodenext...hopefully they'll have a good answer, because I don't see any clear guidance in the docs.

jedwards1211 commented 1 year ago

Okay I think the .d.ts should have explicit file extensions in import/export from declarations. I enabled "traceResolution": true and saw this in the tsc output:

======== Resolving module './CharStreams' from '/Users/andy/antlr/node_modules/antlr4/src/antlr4/index.d.ts'. ========
Explicitly specified module resolution kind: 'NodeNext'.
Resolving in ESM mode with conditions 'import', 'types', 'node'.
Loading module as file / folder, candidate module location '/Users/andy/antlr/node_modules/antlr4/src/antlr4/CharStreams', target file types: TypeScript, JavaScript, Declaration.
Directory '/Users/andy/antlr/node_modules/antlr4/src/antlr4/CharStreams' does not exist, skipping all lookups in it.
======== Module name './CharStreams' was not resolved. ========
jedwards1211 commented 1 year ago

Oh and also if tsc --init hadn't set "skipLibCheck": true I would have seen this error:

node_modules/antlr4/src/antlr4/index.d.ts:19:15 - error TS2834: Relative import paths need explicit file extensions in EcmaScript imports when '--moduleResolution' is 'node16' or 'nodenext'. Consider adding an extension to the import path.

19 export * from './error';
                 ~~~~~~~~~
matthew-dean commented 1 year ago

I had a ton of issues with importing Antlr, as it's published as an ESM module. For Mocha tests, it failed entirely, including with any options that supposedly support ESM. I finally had to switch to Jest, and transpile the published package before it was consumable. Publishing with "type": "module" breaks a lot of tools in the Node ecosystem.

ericvergnaud commented 1 year ago

Hi, see PR #4217.

I use Mocha with ts-node, and it works perfectly: --loader=ts-node/esm --require ts-node/register

matthew-dean commented 1 year ago

@ericvergnaud Yeah, I tried that, I tried a billion different ways, but it sounds like 4.12.1 will solve it regardless.

danielzhe commented 1 year ago

Can you show me an example about how did you managed to run jest with ANTLR4, @matthew-dean ?

matthew-dean commented 1 year ago

@danielzhe The Antlr4 distribution of TypeScript is fundamentally broken. The honest answer is that I went back to using Chevrotain.

jonaskello commented 1 year ago

From my experiments with this the problem is here:

https://github.com/antlr/antlr4/blob/dev/runtime/JavaScript/src/antlr4/index.d.ts

This code

export * from "./InputStream";
export * from "./FileStream";
export * from "./CharStream";
export * from "./CharStreams";
export * from "./TokenStream";
export * from "./BufferedTokenStream";
export * from "./CommonToken";
export * from "./CommonTokenStream";
export * from "./Recognizer";
export * from "./Lexer";
export * from "./Parser";
export * from './Token';
export * from "./atn";
export * from "./dfa";
export * from "./context";
export * from './misc';
export * from './tree';
export * from './state';
export * from './error';
export * from './utils';
export * from './TokenStreamRewriter';

needs to be changed to full explicit file paths to adhere to the new rules for packages that have {"type": "module"} set in package.json as the antlr4 package has. So it should be:

export * from "./InputStream.js";
export * from "./FileStream.js";
export * from "./CharStream.js";
export * from "./CharStreams.js";
export * from "./TokenStream.js";
export * from "./BufferedTokenStream.js";
export * from "./CommonToken.js";
export * from "./CommonTokenStream.js";
export * from "./Recognizer.js";
export * from "./Lexer.js";
export * from "./Parser.js";
export * from './Token.js';
export * from "./atn/index.js";
export * from "./dfa/index.js";
export * from "./context/index.js";
export * from './misc/index.js';
export * from './tree/index.js';
export * from './state/index.js';
export * from './error/index.js';
export * from './utils/index.js';
export * from './TokenStreamRewriter.js';

And then for each of the above that imports index.js the same changes need to happen in those index.d.ts files.

Note that, typescript should point to .js rather than .ts files. This is correct and there is a lot of discussion on this on the typescript repo.

jonaskello commented 1 year ago

Actually there are many places where .js needs to be added. It seems the antlr4 package is written in javascript and then the typescript types are manually handled by hand-written .d.ts files? In that case I could do a PR to update them all. Would that be accepted?

seb314 commented 9 months ago

based on https://www.typescriptlang.org/docs/handbook/modules/theory.html, I stumbled on https://arethetypeswrong.github.io/?p=antlr4%404.13.1-patch-1 which seems to be a promising tool to analyze this issue.

I've a patch that satisfies the attw checker in all except the "bundler" categories and the looks plausible to me based on the theory from the docs... should I create a PR? (I'm not a ts expert, so I'm not sure if my approach is the most correct one)

ericvergnaud commented 9 months ago

tbh I suspect that it might be impossible to satisfy all scenarios given the mess created over time by ts vs js and esm vs common js. Basically, we need:

jonaskello commented 9 months ago

If I have understood correctly the current approach is hand writing the JS and then putting the types on from the outside and trying to have this cover every possible target. I think this makes it really hard to handle.

I would recommend writing the code in typescript and then compiling multiple times for the different targets (using different tsconfig settings for module target commonjs, esm etc).

Languages that have a binary target often do cross compile for different CPU architectures (x86, x64, arm). You can think of the JS output from a typescript compile as output for different targets (commonjs, esm, browser).

You can then distribute multiple sets of files and have the package.json specify the location of each set. See these docs for how that works: https://nodejs.org/api/packages.html#conditional-exports

ericvergnaud commented 8 months ago

We are already "compiling" (with webpack) for different targets, so not sure migrating to TS would improve things.

Re conditional exports, we have them in our current package.json, but they follow a different style than the style in the provided link. Comments are welcome.

But what would really help is a test infrastructure for trying out all these targets. If someone is willing to create that, it would be great, because then we can make decisions based on facts.

jonaskello commented 8 months ago

We are already "compiling" (with webpack) for different targets, so not sure migrating to TS would improve things.

I can think of a couple of things here:

I'm not saying it is not possible to handwrite the types separate from the JS and fiddle with a bundler (eg. webpack) to get it transformed to different targets. I'm just saying IMO it is a lot more work and tool churn to do it that way.

Btw, ESM is natively supported in node since version 12 so if there are no special reasons to support earlier versions of node it would be possible to drop the commonjs distribution.

ericvergnaud commented 8 months ago

Thanks. I use the proposed approach in another project. But the JS runtime was written 12 years ago, long before ES6 (classes), ESM or TypeScript. There’s a lot of code out there relying on now legacy stuff, and we can’t break backwards compatibility.

jonaskello commented 8 months ago

Not sure what you are referring to but using typescript would not break any combability with any JS runtimes. If we turn it around, what are the arguments to not simply write this package in typescript?

ericvergnaud commented 8 months ago

The exports wouldn't be backwards compatible. It requires a very significant effort which introduces risk when we have a very stable runtime. Currently our efforts are geared towards antlr5 which will be TS only.

jonaskello commented 8 months ago

The exports wouldn't be backwards compatible.

I don't follow, could you be more specific? Like the exports of X would not be compatbile with Y?

If you are thinking about the output of tsc in regards towards module systems, it can output for any system, it is just a compiler option. Typescript has been around for 14 years and AFAIK it can be compatible with anything within that timeframe or even before that.

Perhaps my comment above about native support for ESM in node 12+ was confusing, this comment is totally unrelated to the use of typescript.

seb314 commented 8 months ago

tbh I suspect that it might be impossible to satisfy all scenarios given the mess created over time by ts vs js and esm vs common js. Basically, we need:

* the ability to run tests locally (without packaging) using node and ts-node

* the ability to package for node commonjs

* the ability to package for node esm

* the ability to package for browser esm
  I will only consider PRs that provide evidence (i.e. successful tests) that all the above work. Otherwise it's likely that fixing an issue for 1 scenario is actually breaking another scenario...

@ericvergnaud do you have any pointers about how such tests should look like?

Currently I imagine something like: A set of small projects (one for esm vs commonjs, for node vs targeting browsers via webpack) that each depend on the js runtime, generate a parser and parse some string. We could use typescript in each of them and configure it to output for the corresponding target. That way the tsc compilation step of each test project would check that tsc can resolve the typings, and the subsequent invocationg of the parser should serve as a basic thest that the js works at runtime too. (I think we'd need different projects rather than just test cases within ./runtime/JavaScript, in order to make sure that the packaged version works correctly, not just the local sources).

(If there is some tool that can automatically perform these tests, without the need to explicitly setup the test projects, that would be nice of course.)

jonaskello commented 8 months ago

I agree that would be the way to test it. Have tsc consume each scenario and see if it works.

I would add that if tsc is used to create the output for a certain scenario the chance that tsc is able to successfully consume that scenario would be very high, even to the point where tests are not needed. However it is always nice to have tests anyway.

I'm sorry for pushing for tsc to produce the output, I'm just afraid that without allowing appropritate tooling to easily solve this issue it will require such an amount of manual work and bundler config fiddeling that it will not ever get solved.

seb314 commented 8 months ago

@jonaskello I also believe that it would in principle be preferable to have a single source of truth in .ts files. But 1) I believe the resulting js would not be 1:1 identical to the current hand-written js, so it might be hard to know that this refactoring doesn't break some important use case that works right now and 2) there are >100 .js files in the current runtime, so in any case the migration would be a considerable effort*

=> I think it is a more manageable first step to add some tests and then try to fix just the type declarations (mainly add the .js extensions for the imports). That change would help projects that depend on antlr and use typescript, and because we don't even touch the js files, there shouldn't be much risk of breaking any behavior at runtime.

If someone with more js/ts experience than me would like to recommend a better approach, go ahead :-)

KurtGokhan commented 8 months ago

I just run into this problem in a Vite project. I read the whole thread and unless I am missing something, no big changes are needed for this to work.

The issue is only a configuration error. Some fields needs to be added to package.json and that's it. Also there are some extra files in the bundle (like src files, tsconfig, babel, webpack config, tests etc) so I guess it should be ok if they are removed too.

I will open a PR as soon as I can. But I am not sure how I should write tests for it.

KurtGokhan commented 8 months ago

My bad, I just realized that my change fixes the issues with Bundler module resolution but not for NodeNext. File extensions are required after all.

KurtGokhan commented 8 months ago

Sorry for flooding. But there was a simple solution after all. Created PR at #4540

In summary:

The problem was, since the package.json has type: module, Typescript is treating regular d.ts files as module. But the d.ts files are generated manually and they don't comply with how a module file should look like (e.g. missing file extensions on imports). So they are essentially more like CommonJS files.

Luckily, there is a way to override the module detection. If the file has cts or cjs extension, Typescript will treat that file as CommonJS. In this case, adding cts extension only to the index file was enough.

jonaskello commented 8 months ago

@seb Since typescript is a strict superset of javascript, the migration could be as easy as just renaming the files from .js to .ts. This way the output will also be the same as previous js at least for one set of compiler options. The typescript team put a lot of effort into keeping this kind of scenario possible. In addition other compiler options could be used to create different output that is transformed to commonjs etc.

I think not many types from the existing .d.ts files would need to be added to the .ts files since this package heavily uses classes to represent types and therefore will have nominal typing of classes automatically applied by tsc without any additional type annotations needed (as opposed to the structural typing offered by interfaces or type alias).

So migration to ts would mean exactly same js output but the .d.ts would be different since it would be automatically generated by tsc and therefore 100% correct instead of the current broken one.

ericvergnaud commented 8 months ago

Currently I imagine something like: A set of small projects (one for esm vs commonjs, for node vs targeting browsers via webpack) that each depend on the js runtime, generate a parser and parse some string. We could use typescript in each of them and configure it to output for the corresponding target.

Yes we need exactly that. To be more accurate it would have to rely on the webpacked runtime. That would definitely help validate #4540

JesusTheHun commented 8 months ago

@ericvergnaud the doc about community conditions simply says they have to come first. No mention is made about the possibility to nest them.

This simple patch fixes everything :

"exports": {
    ".": {
++    "types": "./src/antlr4/index.d.ts",
      "node": {
--      "types": "./src/antlr4/index.d.ts",
        "import": "./dist/antlr4.node.mjs",
        "require": "./dist/antlr4.node.cjs",
        "default": "./dist/antlr4.node.mjs"
      },
      "browser": {
--      "types": "./src/antlr4/index.d.ts",
        "import": "./dist/antlr4.web.mjs",
        "require": "./dist/antlr4.web.cjs",
        "default": "./dist/antlr4.web.mjs"
      }
    }
  }
ericvergnaud commented 8 months ago

@JesusTheHun thanks. This is yet another proposal... (see #4540 for a different one) Hence why I insist that we have a test suite for ensuring the web packed runtime can be loaded using all the environments we want to support.

Phlosioneer commented 8 months ago

I'd like to suggest another option, if eric thinks it's maintainable: The published project could be split into a commonjs and an emcascript version, each with corresponding typescript. One project should be generated from the other to minimize maintenance/overhead, but splitting it would address some of the mutually exclusive configuration issues. It would also address the backward compatibility concerns. Then, if a common configuration is possible in the future, the split can be re-merged, or the newly created version could be deprecated.

I'm not well versed enough in webpack to know how this would be done. I also don't know if the maintenance burden would be too high for such a solution.


For my ES6 project with Jest, I got everything working with these files. Jest runs, typescript compiles, and the html works. I've removed any lines not critical to getting it working. I'm not using babel or any other frameworks.

jest.config.mjs:

export default {
  testRegex: "(/__tests__/.*|(\\.|/)(test|spec))\\.m?[jt]sx?$",
  transform: {}
};

package.json:

{
  "type": "module",
  "scripts": {
    "test": "node --experimental-vm-modules node_modules/jest/bin/jest.js",
    "antlr": "java.exe -jar antlr-4.13.1-complete.jar -Dlanguage=TypeScript src/grammar/verilog.g4 -o src/grammar"
  },
  "devDependencies": {
    "antlr4": "^4.13.1",
    "jest": "^29.7.0"
  }
}

tsconfig.json:

{
  "compilerOptions": {
    "target": "es2022",
    "module": "ES2022",
    "moduleResolution": "Node",
    "esModuleInterop": true,
    // Can't set to `true`, antlr 4.13.1 outputs ts files that are missing `override` qualifiers on inherited token definitions, like `EOF`.
    "noImplicitOverride": false
}

index.html

    <script type="importmap">
        {
            "imports": {
                "antlr4": "/antlr4.web.mjs"
            }
        }
    </script>
    <script type="module" src="index.js"></script>

(with node_modules/antlr4/build/antlr4.web.mjs copied into the root of the script dir of my website)

A note on Jest: I'm using mjs for test files, and I'm importing the js files that typescript builds in the tests. I'm not using typescript for jest and it has no interaction with tsc.

ericvergnaud commented 8 months ago

@Phlosioneer I don't see much value in this proposal, since it's already achievable: anyone can package the runtime in a way that works best for them. The discussion here is really about how we can get a single package in npm that works for most if not all usage scenarios.

JesusTheHun commented 7 months ago

Hence why I insist that we have a test suite for ensuring the web packed runtime can be loaded using all the environments we want to support.

@ericvergnaud do we have a list of the environments we want to support ?

From the top of my head we have :

Do you have a precise idea on how you want to test this or are you open to suggestions ?

ericvergnaud commented 7 months ago

The above list of environments seems a great starting point. I have some ideas such as using Docker to ensure isolation, but I'm totally open to suggestions

JesusTheHun commented 7 months ago

Docker seems heavy machinery for such small tests. I suggest we create a folder for each environment and run a test that call a small function tied to antlr4. For node tests we don't have to use a test framework, a bash script will do, but for browser tests we'll have to use something like Playwright.

That's how vitest does it for example. Given they test dozens of setups with success for years, I would say it is a reliable approach.

jimidle commented 7 months ago

In my opinion docker is perfect for this

On Mar 26, 2024, at 09:17, Jesus The Hun @.***> wrote:

Docker seems heavy machinery for such small tests. I suggest we create a folder for each environment and run a test that call a small function tied to antlr4. For node tests we don't have to use a test framework, a bash script will do, but for browser tests we'll have to use something like Playwright https://github.com/microsoft/playwright.

That's how vitest https://github.com/vitest-dev/vitest/tree/main/test does it for example. Given they test dozens of setups with success for years, I would say it is a reliable approach.

— Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/4218#issuecomment-2020717147, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ7TMC2IEJXIGN5W5GZAZ3Y2F7RBAVCNFSM6AAAAAAWQXAVNKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMRQG4YTOMJUG4. You are receiving this because you are subscribed to this thread.

ericvergnaud commented 7 months ago

The value of using docker is we can also test various node and tsc versions, which would be very difficult using just folders. But we don't need the full machinery to start with, so if you're more comfortable using folders, let's start with that ? (that will make migrating to Docker later a smaller task)

Indeed we just want to check that the packed runtime can be loaded from the environment being tests, imho a simple call to a function would suffice.

Not sure we need more than the above test, just against a faceless browser ? Antlr4 has no UI so this test is only about code loading. Again if you like vitest, why not ? Can it run in Docker ?

JesusTheHun commented 7 months ago

Docker could eventually be used for local testing instead of installing nvm, but that seems overkill to me. For CI, GitHub Action already offers matrix so we can run the tests with several configurations. Docker or not, we still need a folder per setup.

Browers Now that I think about it, we already have runtime tests for JavaScript, so really what we want is to test that imports work properly. The thing is :

Legacy browsers don't support CommonJS nor ESM, they use third-party library to mimic the behaviour, so not our concern. Modern browsers only support ESM, which has official specs shared with Node.

So really there is nothing to do 🤷‍♂️

ericvergnaud commented 7 months ago

I’m not a big fan of CI matrix approach for this. Here’s why:- it can’t be reproduced locally- each CI build requires pulling all the code, building the tool etc… that’s an enormous overhead for just checking that the webpacked runtime can be loaded and called.Indeed we need folders, so let’s start by making that happen ?Envoyé de mon iPhoneLe 26 mars 2024 à 18:10, Jesus The Hun @.***> a écrit : Docker could eventually be used for local testing instead of installing nvm, but that seems overkill to me. For CI, GitHub Action already offers matrix so we can run the tests with several configurations. Docker or not, we still need a folder per setup. Browers Now that I think about it, we already have runtime tests for JavaScript, so really what we want is to test that imports work properly. The thing is : Legacy browsers don't support CommonJS nor ESM, they use third-party library to mimic the behaviour, so not our concern. Modern browsers only support ESM, which has official specs shared with Node. So really there is nothing to do 🤷‍♂️

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

Codex- commented 7 months ago

it can’t be reproduced locally

https://github.com/nektos/act runs workflows locally, thought just having sub-projects to test the various supported envs is an easier approach to digest

JesusTheHun commented 6 months ago

@ericvergnaud I've submitted a PR. I didn't add browser nor vanilla JS projects, since this is really about TypeScript. WDYT ?

ericvergnaud commented 6 months ago

@ericvergnaud I've submitted a PR. I didn't add browser nor vanilla JS projects, since this is really about TypeScript. WDYT ?

Thanks very much for this. That's a start. If people encounter issues with JS (or any other TS config), then it becomes easy to enhance the test suite accordingly and fix the issue whilst ensuring it doesn't break anything

KurtGokhan commented 4 months ago

Are you guys still looking for the perfect solution?

tpluscode commented 2 weeks ago

This issue should be reopened as PR #4540 did not actually fix the problem.

I think that without adding .js extensions to generated imports and import in the NPM package antlr4 as suggested by @jonaskello, the code still cannot be used in projects with moduleResolution=NodeNext

JesusTheHun commented 2 weeks ago

Well well well... 😂

JesusTheHun commented 2 weeks ago

@parrt @ericvergnaud do you want me to dig out the changes I had offered in #4597 ?