Chevrotain / chevrotain

Parser Building Toolkit for JavaScript
https://chevrotain.io
Apache License 2.0
2.51k stars 206 forks source link

Remove (or Replace) lodash dependency to reduce bundle size? #1870

Closed joliss closed 1 year ago

joliss commented 2 years ago

I wonder what you thought about the idea of using native JavaScript functions instead of lodash?

In my minified bundle, lodash takes up 41 KB in addition to Chevrotain's 136 KB, so it seems that this would probably be the biggest opportunity for reducing bundle size (#1697).

JavaScript has gained a lot of the utility functions provided by lodash in the past 10 years, so this change might be a lot more doable than it was a few years ago.

It might also be worth bumping the minimum JavaScript version to something like ES2020 in the same go, so that we get access to most of the new functions. ES2020 is currently compatible with ~95% of global web users' browsers, down from 98% for ES2015.

mattbishop commented 2 years ago

Great points! Do you think you would be able to create a PR with these replacements?

joliss commented 2 years ago

Sure, I'd volunteer a PR. I think having a minimum ES version to target (preferably larger than 2015) would be the main thing I need to get going.

bd82 commented 2 years ago

Lodash was recently intentionally introduced as a replacement for most of the utility functions.

At the time I saw a 15-20% increase in bundle size which I considered an acceptable tradeoff as reducing the project's code size / complexity is of a higher priority for me in order to reduce the maintenance overhead...

bd82 commented 2 years ago

In regards to the compilation target. I recently started working on a switch to ESM modules.

That turned out a lot bigger than initially expected and is currently shelved until I have more time to invest in it.

mattbishop commented 2 years ago

Do we need to switch to ESM to support ES2015? I don't think so?

@joliss what lodash functions in use could be replaced by native 2015 methods? That would help understand the size of the change. It might be that only one or two utility methods would need to be re-introduced.

@bd82 do you have a minimum ES language target? 2015 is 7 years ago, surprisingly!

bd82 commented 2 years ago

do you have a minimum ES language target? 2015 is 7 years ago, surprisingly!

Whatever works on the oldest still supported nodejs version and supported in major greenfield browsers.

joliss commented 2 years ago

Lodash was recently intentionally introduced as a replacement for most of the utility functions.

I think the switch to lodash is actually really useful groundwork for this change, as now I can just grep for lodash and change everything systematically.

I recently started working on a switch to ESM modules.

I wasn't looking to change the compile target, but rather just to bump the advertised compatible ES version.

Whatever works on the oldest still supported nodejs version and supported in major greenfield browsers.

Ok, Node 14 (the oldest supported LTS) seems to broadly support ES2020. (If there's anything in ES2020 that isn't supported, the CI on Node 14 should catch it.) All the greenfield browsers seem to support ES2020 as well. So if you're happy with ES2020, I'll use that as the target and prepare a PR.

@mattbishop wrote:

what lodash functions in use could be replaced by native 2015 methods? That would help understand the size of the change. It might be that only one or two utility methods would need to be re-introduced.

I think most all of them can be replaced by a single native function call or a one-liner. So "one or two utility methods" is what I'd expect as well.

bd82 commented 2 years ago

I wasn't looking to change the compile target, but rather just to bump the advertised compatible ES version.

What is the value proposition in that? this would be "officially" a breaking change...

Hmm, which built-in functions do you need? what versions were they added? It seems ES2020 would be safe to use once nodejs 14 reaches EOL (April 2023).

For such a PR to be accepted the new code (including tests) would have to be minimal and so would the overall complexity.

sidharthv96 commented 1 year ago

@bd82 seems like @joliss's PR was merged into https://github.com/Chevrotain/chevrotain/tree/remove_lodash, but not released. We're trying to switch MermaidJS' parser to Langium, which uses chevrotain.

Bundle size is a big concern for us and would really appreciate it if @joliss' changes make it to a release.

Perhaps you were waiting for NodeJS 14 EOL (April 2023)?

bd82 commented 1 year ago

Hello @sidharthv96

I am a-lot less active on Chevrotain recently... I will try taking a peek at this in the next couple of weeks, but I can't make any promises...

sidharthv96 commented 1 year ago

@bd82 thanks for creating such a wonderful tool and also for the prompt response. I've created a PR https://github.com/Chevrotain/chevrotain/pull/1942 which you can review.


I tried out almost all available grammar tools and none checked all the boxes like Chevrotain did. We started work on the migration last year with https://github.com/mermaid-js/mermaid/pull/3432, which is superseded by https://github.com/mermaid-js/mermaid/pull/4450 using langium (chevrotain under the hood).

I genuinely understand shifting priorities and that we don't have any obligations for maintaining previous work. Moreover, things like CJS vs ESM and export maps are very frustrating things to do, outside our core product (I've personally dealt with it in Mermaid).

But as a potential user of chevrotain, we were wondering what the redundancy plans are for chevrotain, do you have other trusted maintainers who could help with the maintenance? If not, could we discuss something for this please?

bd82 commented 1 year ago

@sidharthv96 you are welcome 😄

But as a potential user of chevrotain, we were wondering what the redundancy plans are for chevrotain, do you have other trusted maintainers who could help with the maintenance? If not, could we discuss something for this please?

msujew from langium has write permissions to this repo and helps mainly by answering questions on the discussions tab and maintaining the chevrotain-allstar plugin used in langium.

I would definitively not mind having additional maintainers, but I am not sure how to achieve this... The project is in the "maintenance" phase of the life-cycle with most of the remaining work not being particularly glamorous, e.g:

I have a little more free time recently and will try to push some of these items...

bd82 commented 1 year ago

All-right @sidharthv96 @joliss I have status update:

Removing lodash with ECMAScript built-ins #1949:

I'm not going to merge this.

I've invested a few hours into reviewing this and beginning code review fixes. However, I've encountered several issues:

  1. Lexer performance reduced by 5% in JSON lexer benchmark.

    • Strange because I thought such utils were avoided in hot-spots.
  2. Readability decreased imho,

  3. I ended up re-implementing some library utils for readability which is what I wanted to avoid in the first place when I introduced lodash.

  4. indentation changes (unavoidable) are causing larger than optimal diffs to review.

Replacing lodash with just-* libraries #1950

Just-* is a minimalist and tiny implementation of lodash like utils.

This approach is temporarily shelved until I can evaluate what I suspect to be a better approach see below.

Basically the idea is to replace only the from clause of import statements, e.g:

However, I've encountered several issues:

I am also a little wary of performance implications, as performance is not one of the goals of just-* And while these library utils are not supposed to be used in runtime(parsing) hot-spots, they are definitively used in "design time" analysis (construction) of the parser instance.

Considering the above points and Looking into just-* tradeoffs it seems as though just-* may not be the "best tool for job"

bd82 commented 1 year ago

Replacing lodash with Remda

Remda is:

So it seems to match all the requirements and may reduce bundle size by ~15% (I hope...) while keeping the benefits of a "standard library".

The one blocker is that it requires ES6 (or polyfills). So I will first need to do a breaking-change and change the compilation target.

bd82 commented 1 year ago

In #1952 there is a POC to replace lodash with remeda This seems to reduce bundle size (.min) by ~20Kb. However, it also causes a performance regression in parser/lexer initialization time.

Perhaps the best approach would be to re-implement these utilities inside chevrotain

Normally I would not consider it worth-while just to save 20-30Kb of bundle size. However, if native implementation can also provide a significant init speed boost, then it may be worth exploring

bd82 commented 1 year ago

The following steps reduced the *.min bundle size to ~136K-140K: (umd/esm)

These will be released in version 11.0.0 as they are breaking changes.

mattbishop commented 11 months ago

I know this is closed, but still persists as an issue folks would like to solve.

What about replacing lodash functions individually rather than a large all-in-one PR? For instance, replace _.isArray() with Array.isArray()? This would focus the remaining replacements that are difficult to replace (like _.uniq()) down to a more manageable size.

bd82 commented 11 months ago

Hello @mattbishop

For me this is past the point of diminishing returns. I am not planning on investing additional effort into this topic. Nor am I interested in accepting the compromises / inconsistencies of hybrid solutions or implementing the relevant utils directly.