AlaSQL / alasql

AlaSQL.js - JavaScript SQL database for browser and Node.js. Handles both traditional relational tables and nested JSON data (NoSQL). Export, store, and import data from localStorage, IndexedDB, or Excel.
http://alasql.org
MIT License
7.01k stars 651 forks source link

MAX() on strings #609

Open mathiasrw opened 8 years ago

mathiasrw commented 8 years ago

I have 3 questions.

Background: At the moment MAX (and MIN but to make it easy to read I only say MAX) only works on numbers (its implemented with Math.max)

Been playing a bit with it and expanded stdlib.MAX from 55functions.js only to realise that this only works when you hardcode the values. Ups. The check needs to be in the generated code and not here - so would have to replace the existing

return 'Math.max('+Array.prototype.join.call(arguments, ',')+')'

with

return 'someAlasqlUtil.awesomeMaxFunctionThatCanHandleStrings('+Array.prototype.join.call(arguments, ',')+')'

where awesomeMaxFunctionThatCanHandleStrings would look something like

    // If all arguments are not numbers treat all as string. 
    for (var i = 0; i < arguments.length; i++) {
        if(arguments[i] != +arguments[i]){
            return Array.prototype.sort.call(arguments)[arguments.length-1] // I think I read that pop() or switch() on arguments is bad. Todo: theck if .sort() is also bad. 
        }
    };

    return Math.max.apply(null, arguments);

Questions

  1. We loose some speed if the function needs to spend time finding out if it will be comparing numbres or strings. Any inputs to weather its best to expand the existing MAX function or to make a new function falled (for example) MAX_STRING?
  2. To try to understand the code: if the MAX code was to be expanded where would the best place be to have the code from someAlasqlUtil.awesomeMaxFunctionThatCanHandleStrings ?
  3. To try to understand the code: The solution is not hard to implement with a user defined aggregation function - but if the MAX_STRING code was to be implemented in the core - where would the correct place be?
alasql.aggr.MAX_STRING = function(value,accumulator,stage){
    if(1===stage){
        return value;
    }

    if(2===stage && accumulator<value){
        return value;
    }

    return accumulator;
};
agershun commented 8 years ago

We can change internal aggregator implementation in 423groupby.js`` to:

                } else if(col.aggregatorid === 'MIN') { 
//                  return pre+'g[\''+colas+'\']=Math.min(g[\''+colas+'\'],'+colexp+');'+post; 
                    return pre+'g[\''+colas+'\']='+und(colexp,'(g[\''+colas+'\']<y?g[\''+colas+'\'],y)'+)+';'+post; 

                } else if(col.aggregatorid === 'MAX') { 
//                  return pre+'g[\''+colas+'\']=Math.max(g[\''+colas+'\'],'+colexp+');'+post; 
                    return pre+'g[\''+colas+'\']='+und(colexp,'(g[\''+colas+'\']>y?g[\''+colas+'\'],y)'+)+';'+post; 
agershun commented 8 years ago

In the above example I changed internal aggregator `MAX(). We also can define this aggregator somewhere at 55functions.ja:

alasql.aggr.MAX_STRING = function(value,accumulator,stage){
    if(1===stage){
        return value;
    }
    if(2===stage && accumulator<value){
        return value;
    }
    return accumulator;
};

... thinking...

The important note: we have MAX() as aggregator and MAX() as function. Now aggregator is hardcoded and the function is a part of alasql.stdlib (compiled functions).

If we want to change the aggregator we need to remove all MAX lines at 423groups.js and add custom aggregator alasql.aggr.MAX,

To change the function we can remove it from alasql.stdlib.MAX and simply add the normal function without compilation to alasql.stdfn.MAX. In this case we do not need to dance with compilation, and AlaSQL will use just simple regular JS function.

mathiasrw commented 8 years ago

we have MAX() as aggregator and MAX() as function

Interesting. So MAX(hight) ... group by xyz is a different MAX than MAX(test1, test2, test3)

Do you have any inputs regarding peformance regarding to make a new name or expand the existing functionality?