AlaSQL / alasql

AlaSQL.js - JavaScript SQL database for browser and Node.js. Handles both traditional relational tables and nested JSON data (NoSQL). Export, store, and import data from localStorage, IndexedDB, or Excel.
http://alasql.org
MIT License

Minify library size #157

Open agershun opened 9 years ago

agershun commented 9 years ago

The size of the library is slowly growing as we add new features. It is probably time to compress the code with more radical tools than UglifyJS:

agershun commented 9 years ago

Latest research:

So the library size can be reduced to as little as 310 KB.

agershun commented 9 years ago

Done:

Starting point (after UglifyJS): 422 KB

Current size of the library: 322 KB

mathiasrw commented 9 years ago

Anything new on getting deeper into the Closure flags?

It would be good for the size of the library if we could use the power of ADVANCED_OPTIMIZATIONS: https://developers.google.com/closure/compiler/docs/compilation_levels?csw=1

Are we using Closure today? When I run it on 0.1.7 I get the following errors:

JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 6997 character 25 in alasql.js
            srcwherefn: returnTrue,
                         ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 7948 character 92 in alasql.js
...ypeof '+colexp+'!="undefined" && (!g[\'$$_VALUES_'+colas+'\']['+colexp+'])) \
                                                                          ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8151 character 26 in alasql.js
                            dbenum: tcol.dbenum,
                          ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8158 character 38 in alasql.js
                            columnid:col.as || col.columnid, 
                                      ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8172 character 38 in alasql.js
                            columnid:col.as || col.columnid, 
                                      ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8205 character 56 in alasql.js
                            columnid:col.as || col.columnid || col.toString(), 
                                                        ^
JSC_TRAILING_COMMA: Parse error. IE8 (and below) will parse trailing commas in array and object literals incorrectly. If you are targeting newer versions of JS, set the appropriate language_in option. at line 8228 character 56 in alasql.js
                            columnid:col.as || col.columnid || col.toString(), 
                                                        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 13754 character 10 in alasql.js
        var s = '<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="u...
          ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 13760 character 53 in alasql.js
        s+=' <x:ExcelWorksheet><x:Name>' + sheet.sheetid + '</x:Name><x:WorksheetOp...
                                                     ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14059 character 11 in alasql.js
        var s1 = '<?xml version="1.0"?> \
           ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14165 character 40 in alasql.js
            s3 +='<Worksheet ss:Name="'+sheetid+'"> \
                                        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 14168 character 8 in alasql.js
                    +'" x:FullColumns="1" \
        ^
JSC_PARSE_ERROR: Parse error. String continuations are not supported in this language mode. at line 16679 character 7 in alasql.js
            js+="');\
       ^
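For reference, here is a minimal sketch of the two patterns behind these messages, with rewrites that avoid them. The names and string values are illustrative stand-ins and are not copied from alasql.js. The messages themselves also point at the other route, which is setting the language_in option to a newer language level.

```javascript
// Hypothetical stand-in for the callback named in the log above.
function returnTrue() { return true; }

// Pattern 1: trailing comma in an object literal (JSC_TRAILING_COMMA).
var withTrailingComma = {
    srcwherefn: returnTrue,
};
// Rewrite: drop the comma after the last property.
var withoutTrailingComma = {
    srcwherefn: returnTrue
};

// Pattern 2: backslash line continuation inside a string literal.
var continued = '<?xml version="1.0"?> \
<Workbook>';
// Rewrite: close the string on each line and concatenate the pieces.
var concatenated = '<?xml version="1.0"?> ' +
    '<Workbook>';

// Both forms produce the same string.
console.log(continued === concatenated);
```

The rewrites keep the source valid for older parsers, at the cost of slightly noisier template strings.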
agershun commented 9 years ago

Due to some problems, I have temporarily turned off Closure Compiler until we finish with the Codex and the basic documentation.

Let's spend this week on manuals and documentation (I think we need to transfer at least all the test examples into the documentation), and then turn to the minification and Cordova goals.

agershun commented 9 years ago

All of them can be solved; it simply requires time.

Unfortunately, we currently cannot use ADVANCED_OPTIMIZATIONS, because Closure Compiler skips one of the parser functions, so we need to investigate this.

mathiasrw commented 9 years ago

Sure - it's not on the roadmap for now - I just want to keep the fire going :)

Input for 300K

If we get more extreme about size, we can look into the following - but we must run tests to make sure it does not affect speed


(I am wondering whether it would be a good idea to have a production version and a dev version, where the dev version gives nice errors with the correct line and the production version only provides an error code)

(also, having Excel as a module would remove the template strings from the core)

agershun commented 9 years ago

Cool!

You can look at the utils\ directory. It already has some prototype size-optimization utilities for the parser and other words. We can come back to them soon.

mathiasrw commented 9 years ago

If we rename toJavaScript to toJS we gain 1kb in the min version.
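A quick back-of-the-envelope check of that figure: each occurrence of toJavaScript replaced by toJS saves 8 characters, so a gain of about 1 KB would correspond to roughly 128 occurrences, assuming the saving comes from the name alone. Minifiers such as UglifyJS do not shorten property names by default, which is why the name length shows up in the minified size. The helper below is a hypothetical sketch for estimating such savings and is not part of AlaSQL.

```javascript
// Rough estimate of the bytes saved by shortening a name that the minifier
// leaves untouched. Illustrative helper, not part of AlaSQL.
function estimateRenameSavings(source, oldName, newName) {
    // split() yields one more piece than there are occurrences of oldName.
    var occurrences = source.split(oldName).length - 1;
    return occurrences * (oldName.length - newName.length);
}

var sample = 'a.toJavaScript() + b.toJavaScript()';
console.log(estimateRenameSavings(sample, 'toJavaScript', 'toJS')); // 16
```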

agershun commented 9 years ago

Good! Can we call it .JS?

mathiasrw commented 9 years ago

Sure. I'll rename .toJavaScript to .JS in the src files.

agershun commented 9 years ago

Ok

mathiasrw commented 9 years ago

Renaming to .JS kept giving errors and I could not identify where the issue came from, so it's .toJS until we look at it again. Please pull the changes from develop.

mathiasrw commented 9 years ago

With minor changes in the code and the language set to ECMASCRIPT5, I got Closure to compile in advanced mode. It's 307.75 kB (the .min version is currently 437.271 kB).

I will run the tests on the code to verify that things still work.

I have a feeling we need more work before tests will be OK.

If we start using Closure, it is important to document the functions with correct comments, as Closure uses them for type checking: https://developers.google.com/closure/compiler/docs/js-for-compiler?csw=1#types
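As an illustration of that comment style, here is a small sketch using Closure-style JSDoc type annotations. The function is a hypothetical helper written only for this example and does not exist in AlaSQL.

```javascript
/**
 * Returns the values of one column from an array of row objects.
 * Hypothetical helper, used only to illustrate the annotation style.
 *
 * @param {!Array<!Object>} rows Rows to read from.
 * @param {string} columnid Name of the column to extract.
 * @return {!Array<*>} One value per row, in the original order.
 */
function pluckColumn(rows, columnid) {
    return rows.map(function (row) {
        return row[columnid];
    });
}

console.log(pluckColumn([{ a: 1 }, { a: 2 }], 'a'));
```

With annotations like these, the compiler can report a call such as pluckColumn(rows, 42) as a type mismatch when type checking is enabled.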

agershun commented 9 years ago

Unfortunately, this flag kills the parser. We could change something in the parser code to prevent this over-optimization.

mathiasrw commented 9 years ago

We could also keep the parser external while Closure compiles the rest and bundle it in afterwards, so we have a kind of stub in the code and then replace it afterwards.
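A minimal sketch of what such a stub-and-replace step could look like, assuming the core marks the parser with a string literal that survives minification. The stub name __PARSER_STUB__ and the function injectParser are hypothetical and are not part of the AlaSQL build.

```javascript
// Replaces a quoted stub in the minified core with the separately built
// parser source. Names here are hypothetical, for illustration only.
function injectParser(minifiedCore, parserSource) {
    // Match the stub with either quote style, since minifiers may change quotes.
    var stub = /(["'])__PARSER_STUB__\1/;
    if (!stub.test(minifiedCore)) {
        throw new Error('Parser stub not found in minified core');
    }
    // A replacer function is used so that "$" sequences in the parser source
    // are inserted literally instead of being read as replacement patterns.
    return minifiedCore.replace(stub, function () {
        return '(' + parserSource + ')';
    });
}

var core = 'var p="__PARSER_STUB__";p.parse("SELECT 1");';
var parser = '{parse:function(s){return "$$ parsed: "+s}}';
console.log(injectParser(core, parser));
```

One caveat with this approach: in advanced mode the core's calls into the parser would probably need quoted property names or externs, otherwise the compiler may rename them so they no longer match the untouched parser.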

noid2 commented 4 years ago

Is there a way to make the full library tree-shakable? Example: I am using the library only to analyse complex JSON data in the client, and I have to load the full 440 KB of script. It would be awesome if I could extract only the code needed for this task. This makes sense, since I will not import or export files or use any other "db engine" in the background.

mathiasrw commented 4 years ago

@noid2 Nope. Tree shaking won't work, because one third of the code is the SQL parser and another third builds custom functions that are eval'ed and make use of the last third of the code.

It's a bit of a bummer...

mathiasrw commented 4 years ago

@agershun Have you ever considered building the functions into strings and exporting the strings as executable functions, so people could use only the compiled version of the functions they need?
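To make the idea concrete, here is a minimal sketch of generating a function as a string and then either evaluating it at runtime or exporting the text at build time. The helper buildFilterSource is hypothetical and far simpler than AlaSQL's real generator.

```javascript
// Builds the source text of a filter function for a simple equality test.
// Hypothetical helper; it only handles values that JSON can represent.
function buildFilterSource(columnid, value) {
    // JSON.stringify produces quoted, escaped JavaScript literals.
    return 'function (row) { return row[' + JSON.stringify(columnid) +
        '] === ' + JSON.stringify(value) + '; }';
}

var source = buildFilterSource('city', 'Oslo');

// Runtime path: turn the string into a function (needs eval-like access).
var filter = new Function('return ' + source)();

// Build-time path: the same string can be written out as a module, so that
// neither the generator nor the parser has to ship to production.
var moduleText = 'module.exports = ' + source + ';';

var rows = [{ city: 'Oslo' }, { city: 'Rome' }];
console.log(rows.filter(filter).length); // 1
```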

mathiasrw commented 4 years ago

The parser is about half the library.

It could be worth looking into using https://sap.github.io/chevrotain/

The grammar is in another format, so it might be a risky job to swap over. It's about 8x as fast as the JISON parser. It would be interesting to see whether this is actually worth it, as parsing the SQL is often not the thing that takes time.

Nice sandpit: https://sap.github.io/chevrotain/playground/

noid2 commented 4 years ago

I spent some time studying the source code, and my conclusion is that it's about time to start building a new major version from scratch. If you are willing to take this as an option, I will be more than happy to share my findings and suggestions for this approach. Obviously this means that I will be an active contributor to the development of the next version.

P.S. I come from data science (10+ years) and I am not a professional developer, but I have fallen in love with JavaScript over the past 4 years :)

mathiasrw commented 4 years ago

Hi @noid2 You are welcome to come and join.

Taking the lead on the next version would be very helpful. Some work has been put into converting the current code base to modules, which is the first step towards modernizing the code. If you feel an approach from scratch is helpful we are open to that, but it's a lot of work.

As I see it, the best approach would be to make a tiny version with only SELECT and simple functions, which can have more functionality added to it. The nasty part is the parser. It is almost half of the size, and most people don't use much of it any more.

Another approach would be to handle alasql functions at build time, a bit like Svelte, where the actual code that does the magic is generated, so no parser or alasql is needed in production, letting alasql basically become a code generator. (It is a code generator now under the hood.)

What approach have you been considering?

noid2 commented 4 years ago

Hi @mathiasrw

It's nice that you agree on a new version. The current version is really awesome and packed with features. Of course there are some bugs, but it gets the job done. What I think is truly unique is the ability to query JSON data with SQL and to use SQL, AlaSQL and JS functions while selecting. That's the best of the SQL and JS worlds combined. PostgreSQL has similar functionality, but it's not as natural as AlaSQL.

My idea is to have a high-performance core library that solves the core problem of querying data using SQL in the JS world. This can be achieved by first defining one "standard" and following it. This standard must define:

  1. Data structure: e.g. data input/output only as JSON (an array of objects). If the user needs to query other data structures, they should first be transformed into the standard structure using other functions that are not part of the core library. The same goes for output.
  2. SQL features: as you mentioned in your approach, the core library should only select data with simple functions, and more functionality can be added when needed.
  3. Adapters: connections to data sources and data exports, each a single file that can be imported separately.
  4. JS ecosystem: includes programming paradigms, ECMAScript support, dependencies, compatibility, naming, etc.
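A rough sketch of how points 1 to 3 could fit together, assuming a core that only accepts and returns arrays of objects and an adapter that lives outside it. Both functions are hypothetical, and the CSV adapter is deliberately naive (no quoting or escaping rules).

```javascript
// Adapter (outside the core): converts CSV text into the standard structure,
// an array of plain objects. All cell values stay strings.
function fromCsv(text) {
    var lines = text.trim().split(/\r?\n/);
    var header = lines[0].split(',');
    return lines.slice(1).map(function (line) {
        var cells = line.split(',');
        var row = {};
        header.forEach(function (name, i) {
            row[name] = cells[i];
        });
        return row;
    });
}

// Core: only ever sees arrays of objects and returns arrays of objects.
function select(rows, columns, where) {
    return rows.filter(where).map(function (row) {
        var out = {};
        columns.forEach(function (c) { out[c] = row[c]; });
        return out;
    });
}

var data = fromCsv('name,age\nAnn,31\nBob,25');
var result = select(data, ['name'], function (r) { return Number(r.age) > 30; });
console.log(JSON.stringify(result)); // [{"name":"Ann"}]
```

Keeping the adapter in its own file means a client that only queries in-memory JSON never has to load it.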

I know this looks overwhelming, but it's not so much compared to the effort put into maintaining the current version. Also, when I say "from scratch", it does not mean rewriting every bit of code; it means reusing as much of the existing code as possible.

I suggest that the starting point would be to strip down the current version and remove everything except the select feature. I have already tried this, but the parser (it's big, obviously) is very confusing to me. I do not understand much of it.

Of course there are many other details to discuss.

mathiasrw commented 4 years ago

@noid2 If you feel like copying your comment (or making a new one) into #1240 we can continue the dialogue there...