indexeddbshim / IndexedDBShim

A polyfill for IndexedDB using WebSql
Other
968 stars 191 forks source link

Inline unicode regex and remove unicode-10.0.0 package (saves ~146MB from the project size) #323

Closed aral closed 6 years ago

aral commented 6 years ago

I was testing Dexie.js in Node this week (background/use case) when I noticed the size of my working directory shoot up by ~180MB following the installation of IndexedDBShim. It looks like ~146MB of that size is the unicode-10.0.0 module.

As far as I can see, the only bit of that module that is being used is a single regular expression in the file https://github.com/mathiasbynens/unicode-10.0.0/edit/master/Binary_Property/Expands_On_NFD/regex.js, which is 2KB.

In this pull request, I remove the unicode-10.0.0 dependency and inline the regular expression in util.js to shave ~146MB from the dependencies (not to mention lots of bandwidth saved for developers everywhere) ;)

All Node tests are passing but I cannot verify that this does not have side effects in browsers as I get ~23 errors on master with the mocha tests in browser (testing in latest Safari).

As far as I can see, there should be no side effects.

brettz9 commented 6 years ago

Thanks--that'd be a helpful change, and you have some good questions which I hope to get to shortly after finishing up on some other work.

I think I'd prefer here though, that we keep the dependency, but change it to a dev dependency, and have a copy routine which grabs the file we need (and then import that file copy). I feel that process more clearly outlines where we are getting the data, and allows us to update if there are fixes, and to swap the version when Unicode updates.

aral commented 6 years ago

@brettz9 Changing it to a dev dependency would remove the bandwidth burden from those using the library (for the vast majority of the use case), so I’d be happy with that for the reasons you mentioned.

Contributors would still have ~146MB more to download than before, however.

I hear your point about having the full package helping to (a) better reflect intent (b) make updates easier.

I wonder if those two goals might not also be met by:

  1. A good comment in the source:
// This regular expression is from the Unicode-10.0.0 package
// and used for XXX purpose.
  1. Manual updates of the line based on Unicode version updates (since 2012, this has been one update a year).

There’s just something about including a 146MB module for 2KB of code that I can’t seem to stomach :)

Your choice, of course. And, for my specific use case, I’m happy with the Unicode package being made a dev dependency (although I’d be happier if we shaved that 146MB off for everyone, including contributors) :)

Hope you’ve had a great week and here’s wishing you a lovely weekend :)

brettz9 commented 6 years ago

Hi Aral,

Apologies for the delayed reply, and I should say it is particularly encouraging when someone is looking into the testing side.

While "We're already burdening users with XYZ" is not usually a good excuse, if someone is recursively cloning web-platform-tests--as all testers really should because those are the only tests that are still failing, there is indeed already a lot for testers to download.

Besides this, I went ahead with the devDep inclusion for the additional reason that it minimizes the verification/trust needed. If someone trusts npm and has verified or has trust in the Unicode package (and runs the new copy script), there is no need for them to trust whether our regular expression is a faithful copy.

Btw, off topic... With the changes I made by having an interim include file, we are now no longer hard-coding the version of Unicode that we are including, so in the future, we should only need to update package.json and the copy script rather than messing with the source.

Thanks for the suggestions, the helpful reports, and look forward to any further contributions!

aral commented 6 years ago

@brettz9 Hey Brett, thanks for the update. I understand your reasons – as I mentioned earlier, as long as end-users do not have to download the extra 146MB, I’m happy. Your solution and reason are logical.

Is this change in a local branch at the moment? I can’t seem to find it here.

brettz9 commented 6 years ago

I had been missing a 3.7.0 tag now added, but it had already been published to master and to npm.

The regex file is copied into src (see https://github.com/axemclion/IndexedDBShim/blob/master/src/unicode-regex.js ) via the Grunt "copy" task at https://github.com/axemclion/IndexedDBShim/blob/master/Gruntfile.js#L414-L418 . That copy task can be invoked from npm via the "copy" script: https://github.com/axemclion/IndexedDBShim/blob/master/package.json#L27 .