alpheios-project / alpheios-core

Alpheios Core Javascript Packages and Libraries

Optimize the bundle size #48

Open kirlat opened 5 years ago

kirlat commented 5 years ago

The inflection tables library is the largest library bundled into the Alpheios components. It is over 1 MB minified, which is about half of the overall size of the components library itself. Reducing the size of the inflection tables may significantly reduce the distribution size of the components.

Most of the space in the inflection tables library is occupied by language data files in CSV and JSON formats. Another large contribution comes from Papaparse and its dependencies, such as readable-stream and node-lib-browser. Please see the image below for the structural composition of the inflection tables library as it is now. [image]

In my opinion, the possible ways to reduce the size of the inflection tables library are the following (we can combine the approaches listed below):

  1. Replace Papaparse with something more compact. The problem with Papaparse is that it is a very powerful library. It can not only decode files but also encode them, which we don't need for our task. It can also read and write files to the file system, and we do not need that either. For the latter functionality, it relies on several Node.js-related packages such as readable-stream and node-lib-browser. Since we do not operate on CSV files at all (webpack imports them into the bundle as strings), the dependencies that Papaparse includes for reading and writing occupy valuable space within the bundle. All we need for inflection tables is parsing of CSV strings, so something more lightweight than Papaparse may be a better choice for us.
  2. Save data in more space-efficient formats. While CSV is pretty good in that respect, JSON may not be (especially if there are repeating field names). What if we try to load data in formats that take less space? Maybe something like Protocol Buffers or BSON will do? The ubiquity of CSV and JSON makes them easier to edit, as so many tools support those formats. But what if we keep the source data in CSV and JSON and add a transpiler that turns the source files into a more compact format? After the source data is edited, the transpiler would be run to convert it into something more compact. Those smaller files would be the ones bundled with the library.
  3. Use dynamic loading of data. Some users may need data for a specific language only (Latin or Greek). For them, bundling language data for all languages is a waste of space, and the Views for the unused languages are not needed either. What if we put the data files on a CDN and load them dynamically when the end client (webextension or embed-lib) initializes? We can add an initialization option to the UIController that specifies which language data the user needs. With this information, the UIController can load (or instruct inflection tables to load) the necessary language data in the background during its initialization. If done right, this can be almost invisible to the user.
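
To give a feel for how small a string-only CSV parser can be (approach 1), here is a minimal sketch. `parseCsv` is a hypothetical helper, not part of Alpheios or Papaparse. It assumes RFC 4180 style input: comma delimiters, double-quoted fields, and doubled quotes as escapes. It has not been checked against our actual data files.

```javascript
// Hypothetical helper: parses a CSV string into an array of rows,
// where each row is an array of string fields.
function parseCsv (text, delimiter = ',') {
  const rows = []
  let row = []
  let field = ''
  let inQuotes = false
  for (let i = 0; i < text.length; i++) {
    const ch = text[i]
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"' // a doubled quote inside a quoted field is a literal quote
          i++
        } else {
          inQuotes = false // closing quote
        }
      } else {
        field += ch // delimiters and newlines are plain text inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true
    } else if (ch === delimiter) {
      row.push(field)
      field = ''
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') i++ // treat CRLF as one line break
      row.push(field)
      field = ''
      rows.push(row)
      row = []
    } else {
      field += ch
    }
  }
  // Flush the last record when the input does not end with a line break
  if (field !== '' || row.length > 0) {
    row.push(field)
    rows.push(row)
  }
  return rows
}
```

A parser like this has no dependencies at all, but it gives up the things Papaparse handles for us, such as delimiter detection, type conversion, and error reporting, so we would have to confirm that our data does not rely on any of those.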
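
For approach 2, the transpiler does not have to target a binary format to save space. As a rough sketch, it could store the repeating JSON field names once and keep only the values per record. The `pack` and `unpack` functions and the record fields used in the usage note are made up for illustration and do not reflect the real structure of our data files.

```javascript
// Build-time step (hypothetical): turn an array of flat records into
// one shared list of field names plus one array of values per record.
function pack (records) {
  const fields = []
  const seen = new Set()
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (!seen.has(key)) {
        seen.add(key)
        fields.push(key)
      }
    }
  }
  // A field missing from a record is stored as null
  const rows = records.map(record =>
    fields.map(name => (name in record ? record[name] : null))
  )
  return { fields, rows }
}

// Run-time step (hypothetical): restore the original array of records.
// Limitation of this sketch: values that were null in the source are dropped.
function unpack (packed) {
  return packed.rows.map(row => {
    const record = {}
    packed.fields.forEach((name, i) => {
      if (row[i] !== null) record[name] = row[i]
    })
    return record
  })
}
```

For example, `pack([{ ending: 'a', number: 'singular' }, { ending: 'ae', number: 'plural' }])` keeps the names `ending` and `number` once instead of once per record. How much this saves in practice depends on how the bundle is compressed when served, since gzip already removes much of this repetition, so the gain would need to be measured.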
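
Approach 3 could look roughly like the sketch below. Everything in it is an assumption: the `LanguageDataLoader` class, the one-JSON-file-per-language URL layout, and the `fetchText` parameter are hypothetical, and nothing like this exists in UIController today. The fetch function is passed in so that the loader can be tested without a network.

```javascript
// Hypothetical loader: fetches the data file of each language at most once
// and shares the pending request between callers.
class LanguageDataLoader {
  constructor ({ baseUrl, fetchText }) {
    this.baseUrl = baseUrl // assumed layout: `${baseUrl}/${languageCode}.json`
    this.fetchText = fetchText // function (url) => Promise<string>
    this.cache = new Map()
  }

  load (languageCode) {
    if (!this.cache.has(languageCode)) {
      const url = `${this.baseUrl}/${languageCode}.json`
      const pending = this.fetchText(url).then(text => JSON.parse(text))
      // Forget failed requests so that a later call can retry
      pending.catch(() => this.cache.delete(languageCode))
      this.cache.set(languageCode, pending)
    }
    return this.cache.get(languageCode)
  }

  // Start loading several languages in the background, e.g. during initialization
  preload (languageCodes) {
    return Promise.all(languageCodes.map(code => this.load(code)))
  }
}

// In a browser, fetchText could wrap the standard fetch API
const browserFetchText = url => fetch(url).then(response => {
  if (!response.ok) throw new Error(`Failed to load ${url}: ${response.status}`)
  return response.text()
})
```

A client that only needs Latin would create the loader with its CDN base URL and call `loader.preload(['lat'])` during initialization. The language codes and the URL are placeholders here. Code that needs the data later calls `loader.load('lat')` and gets the same promise back without a second request.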

What do you think about this issue and suggested ways to solve it? Do you have any other ideas about how to reduce the bundle size of the inflection tables?

balmas commented 5 years ago

Thanks for making these suggestions. I agree we should consider implementing one or more of these, but probably not until the next release.