browserslist / caniuse-lite

A smaller version of caniuse-db, with only the essentials!
Creative Commons Attribution 4.0 International

Inefficient Storage #6

Open bdkjones opened 7 years ago

bdkjones commented 7 years ago

You've indicated that one of the goals for caniuse-lite is reduced disk space usage. I'm strongly in favor of that because I bundle many node modules into an app that I ship.

However, because caniuse-lite is broken up into TONS of tiny files, the module is actually incredibly inefficient when stored on disk. As the screenshot below shows, there's 944KB of actual data in the caniuse-lite module, but that's broken up across hundreds of files that are just a few bytes each. The trouble is that the smallest block the operating system can assign on disk is 4KB, so each of these tiny files actually consumes 4,096 bytes on disk.
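The rounding described above can be sketched in a few lines. This is only an illustration: the 4 KiB block size is the figure quoted in this comment (real filesystems vary, and some store very small files inline), and the names `allocatedBytes` and `wasteReport`, as well as the sample file sizes, are hypothetical and not measured from caniuse-lite.

```javascript
// Sketch: estimate on-disk size when every file is rounded up to whole blocks.
// ASSUMED_BLOCK_SIZE is an assumption (4 KiB, as quoted in the issue).
const ASSUMED_BLOCK_SIZE = 4096;

// Bytes actually allocated for one file: its size rounded up to a whole
// number of blocks. An empty file needs no data blocks.
function allocatedBytes(fileBytes, blockSize = ASSUMED_BLOCK_SIZE) {
  return Math.ceil(fileBytes / blockSize) * blockSize;
}

// Totals for a list of file sizes (in bytes).
function wasteReport(fileSizes, blockSize = ASSUMED_BLOCK_SIZE) {
  const data = fileSizes.reduce((sum, n) => sum + n, 0);
  const onDisk = fileSizes.reduce(
    (sum, n) => sum + allocatedBytes(n, blockSize),
    0
  );
  return { data, onDisk, wasted: onDisk - data };
}

// Hypothetical file sizes, for illustration only.
const sampleReport = wasteReport([120, 300, 4096, 5000]);
console.log(sampleReport);
```

Under these assumptions, the two files of a few hundred bytes each occupy a full block, which is where most of the overhead comes from.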

In short, 67% of the space that caniuse-lite consumes on disk is simply waste because the operating system can't assign blocks small enough for the tiny files you've created. It would be nice to consolidate this. You're hardly the only offender; this is a super common problem with JavaScript in the node community. Everyone's sort of taken modularization to INCREDIBLE extremes.

[Screenshot: screen shot 2017-06-11 at 15 51 40]

ai commented 7 years ago

I like open source because of issues like this :). Great investigation.

bdkjones commented 7 years ago

You've definitely improved on caniuse-db, which is 7.8MB of data occupying 9.2MB on disk. But the percentage of wasted space in that project is just 15% (1 - 7.8/9.2). It would be cool to see more node developers optimize this figure.
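For reference, the arithmetic behind these percentages can be written out as below. The helper names `wastedFraction` and `impliedDiskSize` are hypothetical. The caniuse-db numbers are the ones quoted in this comment; the caniuse-lite on-disk figure is only what the earlier "944KB of data, 67% waste" numbers would imply, not a measurement.

```javascript
// Fraction of the on-disk footprint that holds no actual data.
function wastedFraction(dataSize, diskSize) {
  if (diskSize <= 0) throw new RangeError("diskSize must be positive");
  return 1 - dataSize / diskSize;
}

// Inverse: the on-disk size implied by a data size and a waste fraction.
function impliedDiskSize(dataSize, wasteFraction) {
  if (wasteFraction < 0 || wasteFraction >= 1) {
    throw new RangeError("wasteFraction must be in [0, 1)");
  }
  return dataSize / (1 - wasteFraction);
}

// caniuse-db, figures in MB as quoted in the thread.
const dbWaste = wastedFraction(7.8, 9.2);
console.log(`caniuse-db waste: ${(dbWaste * 100).toFixed(1)}%`);

// caniuse-lite, figure in KB: derived from the quoted 944KB and 67%.
const liteOnDiskKB = impliedDiskSize(944, 0.67);
console.log(`caniuse-lite implied on-disk size: ~${Math.round(liteOnDiskKB)}KB`);
```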

ben-eb commented 7 years ago

Do you have any ideas on how we could do this?

bdkjones commented 7 years ago

Well, it would involve refactoring the project so that there are fewer files. Looks like you're using ES6, so one approach would be named exports. It would then be possible to put multiple definitions into fewer files rather than have one file for each.

bdkjones commented 7 years ago

It also looks like each of these modules just returns a JSON object. A simple approach would be to have, say, a single "regions" module that returns a larger JSON object where the top-level keys are the current filenames of the separate modules (AD, AE, AF, etc). That would consolidate hundreds of files into one, etc.
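A minimal sketch of that consolidation step, assuming each per-region module is a plain object and the files are named after region codes. The `perFileModules` contents and the `consolidateRegions` helper are made up for illustration and do not reflect caniuse-lite's actual data format or build script.

```javascript
// Hypothetical stand-ins for the separate per-region modules (AD.js, AE.js, ...).
// The payloads here are placeholders, not real caniuse-lite data.
const perFileModules = {
  "AD.js": { example: 1 },
  "AE.js": { example: 2 },
  "AF.js": { example: 3 },
};

// Merge many small modules into one object whose top-level keys are the
// original filenames without the extension.
function consolidateRegions(modules) {
  const regions = {};
  for (const [filename, data] of Object.entries(modules)) {
    const key = filename.replace(/\.js$/, "");
    regions[key] = data;
  }
  return regions;
}

const regions = consolidateRegions(perFileModules);

// A single file could then hold the serialized result.
const serialized = JSON.stringify(regions);
console.log(Object.keys(regions), serialized.length);
```

The tradeoff is that a consumer needing a single region would load the whole object, so this saves disk blocks at the cost of coarser loading.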

ai commented 7 years ago

Since ES6 exports and tree shaking aren't widely supported yet, it will increase the webpack build size for Autoprefixer. I think our goals:

So we need to find some compromise.

ai commented 7 years ago

But the idea of big bundles looks better :)

piranna commented 3 years ago

A better approach would be to just use JSON files to store the info.

ai commented 3 years ago

A PR without breaking changes is welcome.

jdfm commented 1 year ago

Is this really something that caniuse-lite should be tackling?

I could understand if the difference was something like MBs to GBs, but, KBs to MBs isn't such an unacceptable difference in an age of cheap storage with TBs of capacity, IMHO. Further, it's not this project's fault that node has no way of including archived/compressed modules (closed discussion in nodejs about it) that would be able to bundle together code into a single file to limit the file system overheads that are being pointed to in this issue.

As an alternative to npm, there's pnpm, which is meant to help deal with some file system overheads by deduplicating the files of packages in node_modules.

Personally speaking I look at this issue as simply an acceptable tradeoff to using caniuse-lite as a development tool.