Acorn-based parsing - Githubissues

adams85 commented 7 months ago

It was unavoidable, I had to see how my new acorn-based JS parser works with Jint... 😄

It seems that pretty well: it resolves most of the parser-related issues in test262 and brings a tiny perf. improvement in overall (see benchmark results below).

So, I'm curious to know what do you think about an acorn-based Jint? Of course, absolutely no pressure, plus this would mean some potentially annoying breaking changes, so if you'd rather stick with Esprima, I'm totally fine with that. (Even then this work won't have been done completely in vain as this is a great integration test for my parser, which uncovered some minor bugs and showed me where the AST has shortcomings currently.)

However, if you'd like to do the switch, that would make my very proud and gladly do the minor adjustments which are needed to finalize this PR!

adams85 commented 7 months ago

BenchmarkDotNet v0.13.12, Windows 10 (10.0.19045.4170/22H2/2022Update) AMD Ryzen 7 2700X, 1 CPU, 16 logical and 8 physical cores .NET SDK 8.0.202 [Host] : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2 DefaultJob : .NET 8.0.3 (8.0.324.11423), X64 RyuJIT AVX2

Before (Esprima)

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	array-stress	7,515.224 μs	42.4682 μs	1	7017.84 KB
Jint_ParsedScript	array-stress	8,157.611 μs	115.4234 μs	2	6996.66 KB

Jint	dromaeo-3d-cube	28,390.069 μs	42.0761 μs	4	6241.11 KB
Jint_ParsedScript	dromaeo-3d-cube	27,196.405 μs	34.6054 μs	3	5983.8 KB

Jint	dromaeo-core-eval	4,727.442 μs	13.2962 μs	3	355.79 KB
Jint_ParsedScript	dromaeo-core-eval	4,475.208 μs	13.5812 μs	2	338.75 KB

Jint	dromaeo-object-array	53,183.217 μs	784.0281 μs	1	100410.77 KB
Jint_ParsedScript	dromaeo-object-array	56,760.258 μs	930.0847 μs	2	100370.95 KB

Jint	droma(...)egexp [21]	229,861.280 μs	2,785.7601 μs	2	163402.02 KB
Jint_ParsedScript	droma(...)egexp [21]	222,161.400 μs	2,183.8414 μs	1	169953.97 KB

Jint	droma(...)tring [21]	404,473.279 μs	3,376.5817 μs	1	1321472.77 KB
Jint_ParsedScript	droma(...)tring [21]	398,929.113 μs	5,418.5646 μs	1	1321261.68 KB

Jint	droma(...)ase64 [21]	60,223.052 μs	63.6666 μs	3	6124.7 KB
Jint_ParsedScript	droma(...)ase64 [21]	59,242.062 μs	120.8232 μs	2	6039.83 KB

Jint	evaluation	36.788 μs	0.6763 μs	2	40.85 KB
Jint_ParsedScript	evaluation	15.226 μs	0.1624 μs	1	32.28 KB

Jint	linq-js	2,483.586 μs	6.9467 μs	3	1273.64 KB
Jint_ParsedScript	linq-js	141.609 μs	1.8270 μs	1	224.88 KB

Jint	minimal	5.802 μs	0.2169 μs	3	14.6 KB
Jint_ParsedScript	minimal	3.164 μs	0.0050 μs	1	13.22 KB

Jint	stopwatch	389,799.457 μs	2,555.6312 μs	4	53042.25 KB
Jint_ParsedScript	stopwatch	385,872.907 μs	2,871.8493 μs	4	53017.58 KB	Before (Esprima):

After (Acornima)

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	array-stress	7,493.907 μs	55.6531 μs	1	7016.32 KB
Jint_ParsedScript	array-stress	8,521.961 μs	111.3640 μs	3	6994.32 KB

Jint	dromaeo-3d-cube	28,166.099 μs	27.7357 μs	4	6234.68 KB
Jint_ParsedScript	dromaeo-3d-cube	26,540.061 μs	88.2389 μs	3	5981.46 KB

Jint	dromaeo-core-eval	4,642.193 μs	22.5442 μs	3	357.3 KB
Jint_ParsedScript	dromaeo-core-eval	4,494.251 μs	7.5824 μs	2	339.72 KB

Jint	dromaeo-object-array	52,620.869 μs	1,219.8760 μs	1	100411 KB
Jint_ParsedScript	dromaeo-object-array	53,944.009 μs	1,143.7667 μs	2	100368.6 KB

Jint	droma(...)egexp [21]	233,301.670 μs	3,384.7220 μs	2	164630.45 KB
Jint_ParsedScript	droma(...)egexp [21]	225,059.003 μs	2,321.0998 μs	1	165143.63 KB

Jint	droma(...)tring [21]	403,190.871 μs	5,494.1274 μs	1	1321527.96 KB
Jint_ParsedScript	droma(...)tring [21]	397,381.593 μs	5,916.8257 μs	1	1321348.63 KB

Jint	droma(...)ase64 [21]	59,188.533 μs	142.2596 μs	2	6119.13 KB
Jint_ParsedScript	droma(...)ase64 [21]	59,294.962 μs	126.4828 μs	2	6036.86 KB

Jint	evaluation	32.682 μs	0.0295 μs	2	38.71 KB
Jint_ParsedScript	evaluation	14.002 μs	0.0705 μs	1	29.95 KB

Jint	linq-js	2,405.574 μs	2.3482 μs	3	1327.49 KB
Jint_ParsedScript	linq-js	135.846 μs	2.2389 μs	1	222.54 KB

Jint	minimal	4.376 μs	0.0120 μs	2	13.45 KB
Jint_ParsedScript	minimal	3.029 μs	0.0046 μs	1	12.09 KB

Jint	stopwatch	382,592.787 μs	3,267.6964 μs	4	53041.01 KB
Jint_ParsedScript	stopwatch	377,164.633 μs	1,199.8201 μs	4	53015.24 KB

ejsmith commented 7 months ago

Would this give Jint better error messages with line and column information?

adams85 commented 7 months ago

The current parser, Esprima already provides this information (at least for syntax errors). So, if it's not reported, then it's because of Jint...

As for the new parser, I paid careful attention to error reporting. It pretty closely replicates V8's error reporting behavior, plus it detects a lot of syntax errors which Esprima is currently unable to, so it would be nice if Jint were propagate those errors properly. I may look into this in a future PR - if this one gets accepted, of course. :)

lahma commented 7 months ago

Awesome work @adams85 as always! We are very interested in this option and would offer better long time support as acorn-based solution is better maintained w.r.t. features. Another question indeed is when we could make the switch, probably would require v4 and v3 only took seven years to "complete" 😉

lahma commented 7 months ago

Would this give Jint better error messages with line and column information?

Jint has extensive test cases for error reporting on line/column level. There are still a lot of type errors thrown with lacking exception messages, but we take pull requests from people who need them to be improved.

lahma commented 7 months ago

To reward you hard work, I've created a big conflicting change for you in main 😉 With #1824 you can probably reduce the size of this PR quite a lot by just updating the the global usings/aliases in one file.

adams85 commented 7 months ago

Awesome work @adams85 as always! We are very interested in this option and would offer better long time support as acorn-based solution is better maintained w.r.t. features.

❤️

Another question indeed is when we could make the switch, probably would require v4 and v3 only took seven years to "complete" 😉

Even seven years sounds better than never... 😄 It would be nice to merge this PR, so V4 features can be created based on this, but it's nearly irrelevant to me when it gets released. So, absolutely no hurry. I'd just like to avoid constantly rebasing this one...

So it would definitely be nice to find a clever solution to reduce friction with other open PRs. Two options come to my mind:

We create a v4 branch and merge this one into that and develop v4 features on top of that. Maybe retarget some open PRs which we want to release in v4.
Do some implicit using / using alias magic to reduce friction and merge this one into master.

Option 1 (or some combination of the two) seems more flexible to me. But this is up to you.

Just a though, what if there was a pre-PR that would introduce global usings for Esprima and it would then make this one smaller and maybe less against v3 development before merge?

AFAICS, this is done deal :D I like the idea very much. Probably we can improve this even further.

What about (temporary) keeping the original names (Nodes, Location, etc.) with a using alias instead of changing them? This way we wouldn't introduce a lot of conflicts to other PRs.

Also, we can add further using aliases:

using JavaScriptParser = Acornima.Parser
using TokenType = Acornima.TokenKind
using BindingPattern = Acornima.Ast.DestructuringPattern
using ParserException = Acornima.ParseErrorException

lahma commented 7 months ago

I think the new names are fine as these are parts where nobody usually contributes anyway 😉 I can add those extra ones you are proposing.

lahma commented 7 months ago

main now contains the proposed type aliases, but reversed in the sense that Acornima type names are being used.

adams85 commented 7 months ago

Phew, this was no fun but finally I managed to rebase the PR. It's much much cleaner this way, no doubt about it. So it was worth it in the end :)

However, while doing this, I spotted something pretty problematic.

Acornima's tolerant parsing mode differs from Esprima's, plus the latter has tolerant parsing turned on while the former has it off by default. Jint used the defaults of Esprima, so I enabled tolerant mode for internal parsing (which I don't think is a good default - but that's another story). The main problem here is that obviously I can't do anything about pre-parsed scripts - which is a pretty sneaky BC...

On top of this, things are really starting to fall apart when Jint needs to parse some code dynamically (e.g. some eval code). At that point we don't have the information whether the containing script was parsed in tolerant or non-tolerant mode. So far Jint's just used the default parser settings. Because of the differences in the tolerant mode behavior I could make tests pass only if I disabled tolerant mode. So, it kinda works but meh...

This is, of course, not something Acornima-specific. The current Esprima version has this problem as well. (For example, end user passes a pre-parsed script to Jint which has been parsed in non-tolerant mode, but Jint will parse dynamic parts in tolerant mode...) With Acornima this issue gets even worse, as it has support for e.g. setting the language version compatibility (EcmaVersion) and fine tuning which experimental features are enabled (ExperimentalESFeatures).

So the current situation is pretty messy. It could be improved easily if Jint handled the parsing internally in all circumstances. However, since it allows end user to pass pre-parsed scripts, this can't be solved within Jint.

The only correct solution to this issue I can think of (and doesn't involve big BCs) is to modify the parser to include the ParserOptions used for parsing in the returned AST (or some struct return value). This way, Jint would know what parser options (tolerant mode, language version, etc.) to use to stay consistent when it needs to parse eval code, import a module, etc.

What do you think? It wouldn't be a big deal to add this to Acornima, I just doesn't feel like re-introducing SyntaxElement... :) I'd be more ok with something like ParseResult ParseScript(string input, string? sourceFile = null, bool strict = false), where ParseResult is a readonly struct which includes the root node of the AST and the ParserOptions used for parsing. Then, Engine.Evaluate would expect this ParseResult type instead of Script.

adams85 commented 7 months ago

After familiarizing myself a bit more with Jint's public API, I may see another way that doesn't involve changes to the parser:

The root cause of the problem is that end users are allowed to pass pre-parsed code to Jint via Evaluate(Script)/Execute(Script)/AddModule(Module), which means that Jint can't make any assumptions on what parser settings were used to produce those ASTs.

However, from what I understood, the way of using pre-parsed scripts with Jint is to call PrepareScript/PrepareModule, then to feed the return value to the methods mentioned above.

So, if my understanding is correct, we could get away with much smaller changes:

+ public readonly record struct PrepareResult<T>(T Program, ParserOptions ParserOptions) where T : Program;

- public static Script PrepareScript(string script, string? source = null, bool strict = false)
+ public static PrepareResult<Script> PrepareScript(string script, string? source = null, bool strict = false, bool tolerant = true)

- public JsValue Evaluate(Script script)
+ public JsValue Evaluate(PrepareResult<Script> script)

etc.

This way Jint could retain full control over what parser settings are used for parsing the code to execute and it can store the settings for later use when dynamic code, code of referenced modules, etc. need to be parsed.

Can this work in your opinion?

lahma commented 7 months ago

This was exactly my initial thought that we could enforce input to be something that PrepareScript produces 👍🏻

Traveling today so mostly short replies and tinkering as background process 🙂

adams85 commented 7 months ago

Okay, that's enough for now, I can start working with this.

Just one more question: how about changing tolerant mode to off by default? I can only speak for myself, but I'd be surprised if I fed some script to Jint which is invalid per spec and get some results instead of SyntaxError. I mean I think relaxing the rules should be a deliberate choice, not the default. Or am I missing here something that justifies the opposite?

Of course, this would be another potentially impactful BC, but we'll have a bunch apart from this anyway...

lahma commented 7 months ago

Probably would be OK to make rules stricter. Maybe return outside of functions is something needed for REPL and environments where people want script like if they were inside function context.

adams85 commented 7 months ago

Maybe return outside of functions is something needed for REPL and environments where people want script like if they were inside function context.

No problem with that as it's controlled by another switch than tolerant mode.

adams85 commented 7 months ago

Had to dive much deeper into Jint's internals than I had hoped, but I think finally I was able to clean up the issues around parser options and dynamic code parsing.

But there are still a few questions I'm unsure about:

It seems I ran into a bug: I wrote a test where a shadow realm executed a nested eval and Jint crashed with NullReferenceException at this line. I could make it work with this change but I'm not sure at all that it's correct. Could you confirm it's ok?
There are settings in ParserOptions which should not be tweaked by the end user, just by Jint, internally. Without a massive BC, I could come up with this solution. It works, however I'm not sure whether to silently ignore the nonsensical settings (like it's done now) or to throw an exception instead. What do you think?

One more thing worth mentioning: I included this workaround in the regexp parser, so it's not needed here any more.

lahma commented 7 months ago

Had to dive much deeper into Jint's internals than I had hoped, but I think finally I was able to clean up the issues around parser options and dynamic code parsing.

It's always good to get a new fresh pair of eyes to look at years of hacks and horror 🙂

It seems I ran into a bug: I wrote a test where a shadow realm executed a nested eval and Jint crashed with NullReferenceException at this line. I could make it work with this change but I'm not sure at all that it's correct. Could you confirm it's ok?

Well you're plain evil. This is the thing with JS, there's so many ways to get things in unusable state, great work on spotting this! This whole evaluation context going missing has bitten so many times already that I'm already questioning to whole idea of having it. I think the fix is fine and the regular band-aid that is being used elsewhere too (other places don't have try-finally IIRC).

There are settings in ParserOptions which should not be tweaked by the end user, just by Jint, internally. Without a massive BC, I could come up with this solution. It works, however I'm not sure whether to silently ignore the nonsensical settings (like it's done now) or to throw an exception instead. What do you think?

Just to avoid answering the actual question... Should Jint actually take JintParserOptions which would have only couple sensical properties which would then be transferred to the implementation dependent richer configuration? By default I think Jint should have always all the bells and whistles enabled for parsing and trying to cope with the AST. Both Esprima and Acornima settings objects might be too rich for end user. It could be very small subset of switches which could be expanded when/if there's a need. We could already break this with 3.1 or something and narrow the configuration. I don't see a mention about semantic versioning in the readme 😉

One more thing worth mentioning: I included this workaround in the regexp parser, so it's not needed here any more.

Ah so basically a flag for Jint whether is can even try the compilation. I see that Acornima limits to < 7, but I think it's safe to use again with > 8 too. Jin does the aggressive compilation even if there's no JS regex flag for compiled like you probably know.

adams85 commented 7 months ago

Well you're plain evil. This is the thing with JS, there's so many ways to get things in unusable state, great work on spotting this!

Yep, JS is the language of surprises, so minor glitches like this are indeed hard to spot. They align so smoothly with the JS philosophy :D

I think the fix is fine and the regular band-aid that is being used elsewhere too (other places don't have try-finally IIRC).

The reason why I asked is that in other cases I've only seen this pattern:

var ownsContext = _engine._activeEvaluationContext is null;
_engine. _activeEvaluationContext ??= new EvaluationContext(_engine);
try
{
    /* ... */
}
finally
{
    if (ownsContext)
    {
        _engine._activeEvaluationContext = null!;
    }
}

This is a bit different from what I did because it kind of "joins" the evaluation context if it's present already. So, don't we actually need this also in the case in question? (Unfortunately, I can't really tell because of my limited understanding of the concept.)

Should Jint actually take JintParserOptions which would have only couple sensical properties which would then be transferred to the implementation dependent richer configuration?

Yeah, this is the other solution that I referred to as a "massive BC". At the same time this is the clean solution, so I'd prefer it too if BCs are not a concern. However, I think we should introduce this change in a subsequent PR because this one is starting to lose focus... I'll be happy to do the change after merging this.

I see that Acornima limits to < 7, but I think it's safe to use again with > 8 too. Jin does the aggressive compilation even if there's no JS regex flag for compiled like you probably know.

Indeed, I missed that the .NET runtime folks fixed this here. So we can assume that the fix will be included in .NET 9. I'll update the regexp parser accordingly.

adams85 commented 6 months ago

As far as I'm concerned, the PR is complete. Also, v1.0.0 of the parser has been released. So what do you think about merging this into some v4 branch so further development can be done in that? (Probably other ongoing PRs which are to be released in v4 should be rebased on that branch too.)

For example, I plan to address some minor todos and issues and I'll also want to look into error reporting because syntax errors reported by the parser are not wrapped in SyntaxError JS objects in some cases (e.g. here). However, I think it would be better to put those changes into seperate PRs.

lahma commented 6 months ago

My attack would probably be as follows

rename EsprimaExtensions to AstExtensions (rarely used by consumer code), this branch should rebase cleanly just by ignoring it and overwriting with PR's changes - parity with upcoming changes, we already are breaking things
bump version to 3.1
release 3.1
branch main to 3.x, maintain fixes if needed there
merge this to main and make main a Duke Nukem forever project again
profit

adams85 commented 6 months ago

That can certainly work as well! :)

rename EsprimaExtensions to AstExtensions (rarely used by consumer code), this branch should rebase cleanly just by ignoring it and overwriting with PR's changes - parity with upcoming changes, we already are breaking things

I'll do this soon, then leave the rest up to you. ;)

adams85 commented 6 months ago

Latest benchmark result with main branch and this PR updated to Acornima v1.0.0:

Before (Esprima)

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	array-stress	7,575.246 μs	14.4506 μs	1	7016.28 KB
Jint_ParsedScript	array-stress	8,020.207 μs	118.9083 μs	2	6982.47 KB

Jint	dromaeo-3d-cube	28,279.776 μs	57.0615 μs	1	6240.53 KB
Jint_ParsedScript	dromaeo-3d-cube	28,020.723 μs	46.3408 μs	1	5885.77 KB

Jint	dromaeo-core-eval	4,584.428 μs	11.0669 μs	1	355.94 KB
Jint_ParsedScript	dromaeo-core-eval	4,769.521 μs	10.6490 μs	2	333.21 KB

Jint	dromaeo-object-array	50,683.045 μs	413.7656 μs	1	100409.92 KB
Jint_ParsedScript	dromaeo-object-array	52,291.036 μs	316.6022 μs	2	100356.51 KB

Jint	droma(...)egexp [21]	225,931.847 μs	2,904.9614 μs	2	165651.12 KB
Jint_ParsedScript	droma(...)egexp [21]	174,394.089 μs	2,339.3612 μs	1	167267.43 KB

Jint	droma(...)tring [21]	399,448.584 μs	8,581.2445 μs	1	1321367.52 KB
Jint_ParsedScript	droma(...)tring [21]	398,350.529 μs	4,534.8665 μs	1	1321354.86 KB

Jint	droma(...)ase64 [21]	59,212.169 μs	308.0397 μs	1	6123.97 KB
Jint_ParsedScript	droma(...)ase64 [21]	60,402.459 μs	177.9859 μs	2	6012.02 KB

Jint	evaluation	35.746 μs	0.0466 μs	2	39.39 KB
Jint_ParsedScript	evaluation	12.476 μs	0.4245 μs	1	28.01 KB

Jint	linq-js	2,561.289 μs	16.2517 μs	2	1272.12 KB
Jint_ParsedScript	linq-js	126.997 μs	1.3896 μs	1	192.3 KB

Jint	minimal	5.361 μs	0.1053 μs	2	14.66 KB
Jint_ParsedScript	minimal	2.475 μs	0.0021 μs	1	12.59 KB

Jint	stopwatch	378,419.050 μs	1,603.6123 μs	1	53040.73 KB
Jint_ParsedScript	stopwatch	390,796.180 μs	1,204.0319 μs	2	53005.7 KB

After (Acornima)

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	array-stress	7,377.878 μs	38.4145 μs	1	7015.58 KB
Jint_ParsedScript	array-stress	8,453.743 μs	123.7822 μs	2	6981.33 KB

Jint	dromaeo-3d-cube	28,350.037 μs	24.6356 μs	2	6233.65 KB
Jint_ParsedScript	dromaeo-3d-cube	27,428.668 μs	47.7358 μs	1	5884.62 KB

Jint	dromaeo-core-eval	4,686.884 μs	6.6578 μs	2	355.73 KB
Jint_ParsedScript	dromaeo-core-eval	4,544.569 μs	12.6899 μs	1	332.57 KB

Jint	dromaeo-object-array	50,250.130 μs	391.0685 μs	1	100409.84 KB
Jint_ParsedScript	dromaeo-object-array	62,619.985 μs	321.5383 μs	2	100355.15 KB

Jint	droma(...)egexp [21]	227,611.223 μs	3,594.9438 μs	2	168624.43 KB
Jint_ParsedScript	droma(...)egexp [21]	177,574.193 μs	2,830.8412 μs	1	165976.58 KB

Jint	droma(...)tring [21]	400,274.943 μs	9,084.8877 μs	1	1321360.98 KB
Jint_ParsedScript	droma(...)tring [21]	404,169.764 μs	6,067.2517 μs	1	1321322.06 KB

Jint	droma(...)ase64 [21]	57,840.141 μs	335.3168 μs	1	6119.98 KB
Jint_ParsedScript	droma(...)ase64 [21]	59,283.333 μs	217.2524 μs	2	6010.57 KB

Jint	evaluation	32.130 μs	0.0499 μs	2	38.45 KB
Jint_ParsedScript	evaluation	10.904 μs	0.0861 μs	1	26.86 KB

Jint	linq-js	2,352.298 μs	4.5143 μs	2	1307.63 KB
Jint_ParsedScript	linq-js	121.317 μs	1.7267 μs	1	191.15 KB

Jint	minimal	4.339 μs	0.0079 μs	2	13.49 KB
Jint_ParsedScript	minimal	2.328 μs	0.0037 μs	1	11.45 KB

Jint	stopwatch	381,197.513 μs	2,802.2481 μs	1	53040.95 KB
Jint_ParsedScript	stopwatch	394,633.258 μs	1,135.1463 μs	2	53004.55 KB

adams85 commented 6 months ago

I made a few other runs and the regression in the case of Jint_ParsedScript + droma(...)ase64 [21] looks like just the inaccuracy of the benchmarks:

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	dromaeo-object-array	49.78 ms	0.179 ms	1	98.06 MB
Jint_ParsedScript	dromaeo-object-array	51.20 ms	0.357 ms	2	98 MB

Method	FileName	Mean	StdDev	Rank	Allocated
Jint	dromaeo-object-array	50.25 ms	0.204 ms	1	98.06 MB
Jint_ParsedScript	dromaeo-object-array	51.72 ms	0.392 ms	2	98 MB

Probably there's some GC activity behind the differences...

adams85 commented 6 months ago

Compared to where we started from, the perf. characteristics virtually haven't changed. Looks like the new parser may bring a tiny perf. improvement in cases where parsing the input takes up a relatively larger part of the execution.

lahma commented 6 months ago

Compared to where we started from, the perf. characteristics virtually haven't changed. Looks like the new parser may bring a tiny perf. improvement in cases where parsing the input takes up a relatively larger part of the execution.

This is quite expected. While Acornima is faster, the majority of time in Jint's use case is always spent in actual script execution/interpretation (performance PRs welcome! 😉). The major improvement is to have a clear(er) path to new ECMAScript features with robust parser infrastructure.

Version 3.1 is being out for a moment and some have already auto-updated like docfx without problems. I've tested against other code bases and only small tweaks needed for consumers like RavenDB. We've broken advanced APIs mostly so people don't notice..

Thanks for championing this!

sebastienros commented 6 months ago

😢

lahma commented 6 months ago

😢

First tears of joy given!

sebastienros commented 6 months ago

I have memories of the summer I spent migrating the Esprima codebase to dotnet. Now the question is when to flag the repository as not maintained, because I assume nobody would, and suggest people to also move to Acornima, which will also help driving support.

lahma commented 6 months ago

I have memories of the summer I spent migrating the Esprima codebase to dotnet.

I also have memories of lost summer vacations building/improving these things. "Unfortunately" it seems that Acorn won the battle maintenance-wise and is a much safer bet. Maybe we can be happy about the things we learned along the way, whether it's spec, performance or other things 🙂

Now the question is when to flag the repository as not maintained, because I assume nobody would, and suggest people to also move to Acornima, which will also help driving support.

It's an option. Esprima is a good solution which can handle many things - it's been great for Jint, it's just harder to get new features and compliance with newer features without an upstream to follow.

adams85 commented 6 months ago

No reason to be sad, we can now waste many more summer vacations on keeping Acornima up to date :D Every Esprima maintainers are invited. @lahma is already a collaborator ;) If anyone else wants to join, just drop me a message!

Besides that, a big part of Esprima will live on in Acornima. Many of the awesome performance tweaks have been applied to translated Acorn implementation. A lot of add-on features, source gen, tests also come from Esprima. So best to think about Acornima as the love child of two great parents :D

lahma commented 6 months ago

Nicely summed up, thank you!

As this has been merged, main is again "API unstable", even though I don't see much required changed there (I can also be blind at this point looking at the APIs too long). Breaking changes mean It's just that the new version will require 4.0 stamp due to dependency change.

adams85 commented 6 months ago

FWIW, I put together a list of the most impactful breaking changes: https://github.com/adams85/acornima?tab=readme-ov-file#migration-from-esprimanet

Most surprises are to be expected when users are feeding ASTs which were built manually, not by the parser. Not sure how typical a use case this is, just saying that some extra attention is required in such cases.

Apart from that, parsing has become stricter because it's become more compliant with the spec, plus because tolerant mode has been changed to off by default. (Acornima's tolerant mode is different and not as permissive as Esprima's, so it wouldn't have made much sense to keep that default. (Users who relied on tolerant mode would mostly be broken anyway. Top level returns in scripts are still allowed. That behavior hasn't been changed.)

Finally, there are some changes to be aware of in error reporting too. For one, the format of syntax errors has changed significantly. (There are some issues around converting parser syntax errors to Jint SyntaxError objects. Which might be not a new issue, but I plan to look into it because it's not very nice this way...)

lahma commented 6 months ago

Because adding notes to a closed PR is definitely a best practice, some things:

docfx: ParserException missing -> 25>PreprocessorWithResourcePool.cs(5,7): Error CS0246 : The type or namespace name 'Esprima' could not be found (are you missing a using directive or an assembly reference?)
EventStore: Assert.IsInstanceOf<Esprima.ParserException>(ex);
JavaScriptEngineSwitcher: probably most about changed namespaces/names
RavenDB, some manual creation of AST, requires changes to instantiated types

sebastienros / jint

Acorn-based parsing #1820