uwescience / myria

Myria is a scalable Analytics-as-a-Service platform based on relational algebra.
myria.cs.washington.edu

Allowed types for registering Python functions are a different set from those allowed by RACO #903

Open orzikhd opened 7 years ago

orzikhd commented 7 years ago

I ran into this issue while trying to make a UDA work in MyriaL in a Jupyter notebook. RACO expects either LONG_TYPE or DOUBLE_TYPE for numerical function results, but a function can also be registered with FLOAT_TYPE. RACO then fails to understand the type, and the query fails with TypeError: 'NoneType' object has no attribute '__getitem__'

It seems like there should be either a warning at registration time that the type is not valid, or a clearer error message when type checking fails on the call to the UDF/UDA.
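To illustrate the kind of registration-time warning I mean, here is a minimal sketch. It is not Myria or RACO code: `check_output_type` and `SUGGESTED_REPLACEMENTS` are hypothetical names, and the type constants are stand-in strings so the snippet runs without either library installed (real code would presumably import them from raco.types). The only mapping it encodes is the one observed in this report, FLOAT_TYPE failing where DOUBLE_TYPE works.

```python
import warnings

# Stand-in string constants so the sketch runs without raco installed
# (assumption: real code would import these from raco.types).
LONG_TYPE = "LONG_TYPE"
DOUBLE_TYPE = "DOUBLE_TYPE"
FLOAT_TYPE = "FLOAT_TYPE"

# Per this report: registration accepts FLOAT_TYPE, but RACO only
# understands DOUBLE_TYPE for floating-point results.
SUGGESTED_REPLACEMENTS = {FLOAT_TYPE: DOUBLE_TYPE}


def check_output_type(out_type):
    """Hypothetical pre-registration check.

    Returns out_type unchanged when no problem is known; otherwise
    emits a warning and returns the suggested replacement type.
    """
    replacement = SUGGESTED_REPLACEMENTS.get(out_type)
    if replacement is None:
        return out_type
    warnings.warn(
        "%s is accepted at registration but not understood by RACO; "
        "consider %s instead" % (out_type, replacement)
    )
    return replacement
```

With something like this, registering maxValue with FLOAT_TYPE would produce a warning pointing at DOUBLE_TYPE up front, instead of the NoneType error surfacing later at query time.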

Here is an example based on the argMax TwitterK example from the Myria docs:

from raco.types import LONG_TYPE
def pickBasedOnValue(tuplList):
    for tupl in tuplList:
        value1 = tupl[0]
        arg1 = tupl[1]
        value2 = tupl[2]
        arg2 = tupl[3]
        if (value1 >= value2):
            return arg1
        else:
            return arg2

MyriaPythonFunction(pickBasedOnValue, LONG_TYPE).register()

from raco.types import FLOAT_TYPE
def maxValue(tuplList):
    for tupl in tuplList:
        value1 = tupl[0]
        value2 = tupl[1]
        if (value1 >= value2):
            return value1
        else:
            return value2

MyriaPythonFunction(maxValue, FLOAT_TYPE).register()

%%query
uda argMaxAndMax(arg, val) {
    [-1 as argAcc, -1.0 as valAcc];

    [pickBasedOnValue(val, arg, valAcc, argAcc),
     maxValue(val, valAcc)];

    [argAcc, valAcc];
};
t = scan(cube300);
s = select argMaxAndMax(iOrder, vx) from t;
store(s, maxVX);

This query fails with the NoneType TypeError quoted above, but registering maxValue with DOUBLE_TYPE instead of FLOAT_TYPE works.

senderista commented 7 years ago

This looks like a RACO issue to me.