Open walterra opened 1 year ago
:wave: Hi there! :wave:
And thank you for opening your first issue! We will get back to you shortly. :runner: :dash:
Hey, @walterra! Thanks for reaching out! To help us recommend the best approach, how much of stdlib
are you wanting to use? Many packages or a few, and what is your intended use case?
Hey, thanks for getting back so quickly! For now we are looking into using ndarray
and chi2test
. If we can make it to bring down bundle sizes and make the library consumable via esm we might pick up more in the future.
I noticed that each individual package has an esm
branch but unfortunately those are not published to npm
but as the docs mention intended to be consumed via script
tags and will fetch internal dependencies on demand, so it looks like they are not bundles containing all the code for a package.
We're evaluating if we're able to pick up the library to be used in parts of Kibana (https://github.com/elastic/kibana).
Thanks for the info! Let me discuss internally with the team and circle back!
@walterra To help provide us with a bit more context, how are you currently handling CommonJS packages in your production workflow? Are you wanting esm variants because that's what your tooling caters to, or are there specific issues you encountered with CommonJS stdlib
standalone packages?
We can handle CommonJS packages and got it working with stdlibjs
, the problem was that just adding @stdlib/ndarray
and @stdlib/stats-chi2test
as is to our package.json
added optimized 500KB+ to our client side bundle sizes which unfortunately is not feasible for us.
We then had a go at your custom bundler which resulted in a 1.3MB package for those 2 libraries. Visualized here with source-map-explorer
:
Next we tried with a webpack setup which was able to bring that down to ~173KB. I think the following is just for the stats-chi2test
as an entry point but you can see in the treemap that it includes ndarray
too:
Then we found your esm repo and using rollup with the esm version allowed us to bring down the chi2test
package to ~125KB:
We saw that you seem to make use of rollup
in the esm
branch too.
If you'd be able to pull off releasing individual minified/optimized packages like this on npm that would certainly hit a sweet spot! But I'm not sure it's necessary. If the current esm repo was updated and published to npm we could use that and do deep imports from within our code with the same effect. If I'm not mistaken that's also lodash's approach: Individual npm packages for CommonJS and a single esm package. Not sure if hybrid package releases are an option at this stage, I'm a bit out of the loop regarding the current ecosystem 😅 .
Let me know if you need more information or if we can be of any help.
Thanks for the detailed write-up!
@walterra Did you try just using the ndarray/*
packages that you needed (e.g., @stdlib/ndarray/array
or @stdlib/ndarray/ctor
)? Loading @stdlib/ndarray
loads the entire namespace, which, without knowing your use case, seems likely not necessary.
Ah, okay. I see from the source map explorer images that you don't seem to be loading the @stdlib/ndarray
namespace.
I did some more testing with bundling. The initial bundle increases (1MB+) were part of a quite large PR that included a lot of other code additions. I created some dummy draft PRs that just add stats-chi2test
and run the example code from the README to isolate the increases.
@stdlib/stats-chi2tests
# package.json
"@stdlib/stats-chi2test": "^0.0.8",
# import
import chi2test from '@stdlib/stats-chi2test';
This adds 596.1KB
to our bundle size. (https://github.com/elastic/kibana/pull/162294)
@stdlib/esm
# package.json
"@stdlib/esm": "^0.0.3",
# import
import chi2test from '@stdlib/esm/stats/chi2test';
This adds 120.9KB
to our bundle size. (https://github.com/elastic/kibana/pull/162295)
This is in line with a custom bundle I created using rollup which ends up with ~128KB. (https://github.com/walterra/stdlibjs-esm/pull/1/files)
So, for us it looks acceptable with the esm
package regarding the bundle size. The only problem with that is that this repo/npm package is based off older code which uses eval()
which we must not use. In more recent releases this has been refactored away.
Edit/Addendum: One thing we can't tell yet from the testing: Is there a chance that code was refactored in newer versions in a way that's causing this increase in bundle size?
@walterra Yes, the code has been refactored, particularly among dependencies, such as ndarray
.
Just as a heads up, we are working on this.
A few comments:
stdlib
bundler is largely due to no minification. I reproduced the bundle locally and inspected its contents. Without minification, all license headers and internal source code documentation is present. This causes the bundle to balloon to 1.3MB. We will update the README and other docs to demonstrate how to generate a "production" bundle for vendoring using stdlib
tooling.browserify
plugin browser-pack-flat
, you can achieve much the same things as rollup and co and using ES Modules. I was able to generate a bundle for @stdlib/stats/chi2test
which is around 137 KB (including a Buffer
browser polyfill).To reproduce the build, you can do the following:
Clone the stdlib
repository.
Navigate to the cloned directory.
Install development dependencies.
$ make install-node-modules
Make a small modification to ./lib/node_modules/@stdlib/_tools/bundle/pkg-list/lib/main.js
starting a L124
bopts = {
'externalRequireName': opts.requireName,
'require': pkgs,
'paths': ( opts.paths ) ? opts.paths : browserifyPaths(),
'debug': true
}
add debug: true
in order to generate an inline source map.
Create an output directory.
$ mkdir ./tmp
Generate a "flat" bundle.
$ NODE_PATH=./lib/node_modules \
node ./lib/node_modules/@stdlib/_tools/bundle/pkg-list/bin/cli @stdlib/stats/chi2test \
--namespace none \
--plugin browser-pack-flat \
| ./node_modules/.bin/exorcist ./tmp/bundle.js.map > ./tmp/bundle.js
To support vendoring, add an export statement so that the package can be consumed by downstream packages.
$ sed 's/\/\/# sourceMappingURL=/module.exports=require("@stdlib\/stats\/chi2test");\n\/\/# sourceMappingURL=/' ./tmp/bundle.js > ./tmp/bundle.js.tmp
$ rm -f ./tmp/bundle.js
$ mv ./tmp/bundle.js.tmp ./tmp/bundle.js
Generate a minified bundle.
$ ./node_modules/.bin/uglifyjs -c -m \
--in-source-map ./tmp/bundle.js.map \
--source-map "content='./tmp/bundle.js.map',url='bundle.min.js.map'" \
-o ./tmp/bundle.min.js \
./tmp/bundle.js
Inspect the bundle with source-map-explorer
.
$ ./node_modules/.bin/source-map-explorer ./tmp/bundle.min.js \
--exclude-source-map \
--no-border-checks
This should generate something like the following:
Note, as you may already be aware, there are some nuances when attempting to bundle an already bundled file, but at a minimum you can verify that the above bundle works in Node.js by adding the following script to the ./tmp
folder containing generated bundles
import chi2test from './bundle.min.js';
var x = [
[ 20, 30 ],
[ 30, 20 ]
];
var out = chi2test( x );
console.log( out.print() );
and running via the command-line.
In short, I think there is a path forward here which doesn't require updating the esm
repository as a shorter term workaround.
"tree-shaking" is not really necessary when using individual stdlib
packages, by design, as functionality is localized to individual packages and consumed on an as-needed basis. And the use of ES Modules doesn't really have much of an impact on bundle sizes because of how the codebase is organized (to create an ES Module from a source file is largely a matter of rewriting require
statements to import
statements).
What does matter is how bundling is done. In particular, making sure license headers are minified, etc. Hopefully, the above sequence demonstrates that.
Longer term, we do need to provide an ES Module offering in order to accommodate bundlers such as rollup which have poor support for CommonJS. We've thus far avoided moving in this direction, as dual packages are still a headache to maintain in Node.js-land, we maintain strong backward compatibility guarantees, and custom loaders are still experimental.
Thanks for being on top of this so quickly, this looks promising! Completely agree that the bundle size isn't really related to whether it's CommonJS or esm, it was just what's brought us further in our experiments quicker without knowing too much about stdlib's whole setup. Looking forward to see further improvements, thanks!
Quick update: did some refactoring over the weekend, and we were able to reduce the bundle size a bit further (~102KB).
As a consequence of refactoring, there should also be a noticeable performance boost.
Below is an updated source map based on a bundle generated using the sequence documented above.
@walterra We have released a new patch release (v0.0.9
) for the standalone stats-chi2test package, which incorporates the refactoring.
Thank you so much for the update, we'll try to pick this up!
@walterra Let us know if there are other stdlib packages (and/or feature requests) that we can help prioritize for your use case.
Thanks for the ping! So here's where we're at now with the different packages we evaluated on our builds:
package | version | size | PR | Notes |
---|---|---|---|---|
@stdlib/stats-chi2test | 0.0.8 | +596.1KB | #162294 | |
@stdlib/stats-chi2test | 0.0.9 | +391.6KB | #162925 | |
custom bundle @stdlib/esm | 0.0.3 | +135.8KB | #162298 | old version using eval() |
@stdlib/esm | 0.0.3 | +120.9KB | #162295 | old version using eval() |
Version 0.0.9
is a great improvement but unfortunately still quite large compared to the earlier experiments with the esm package.
Here's our yarn.lock
file after we just add "@stdlib/stats-chi2test": "^0.0.9",
to our package.json
: https://github.com/elastic/kibana/pull/162925/files#diff-51e4f558fae534656963876761c95b83b6ef5da5103c4adef6768219ed76c2de
I know too little about stdlibjs
to be able to tell if all these dependencies are really necessary to get chi2test
to work or if its direct dependencies may need the same treatment you gave the package itself to reduce the bundle size.
We're a bit caught up with other stuff for the moment, but will continue to experiment with stdlibjs
and it would be great if we can pick it up at some point!
Are there any plans to update the
stdlib-js/esm
repo/package? The repo (https://github.com/stdlib-js/esm) and npm package (https://www.npmjs.com/package/@stdlib/esm) are atv0.0.3
and use some older code inndarray
relying oneval()
which blocks us from using it.We're evaluating using
stdlibjs
and it looks really promising. Unfortunately the standard package sizes are quite large for client side code. Some experiments with theesm
variant androllup
showed there's quite some potential to bring the bundle sizes down using that approach. Ifstdlib-js/esm
was up to date and published tonpm
we could make direct use of it without the need of a fork and/or custom bundling.Is that something we could help out/contribute potentially?