KxSystems / pyq

PyQ — Python for kdb+
http://code.kx.com/q/interfaces
Apache License 2.0

Send data to python from q function scope? #7

Closed trias702 closed 7 years ago

trias702 commented 7 years ago

Hello,

Is it at all possible to set/get a variable in python from q when running inside a q function, meaning the data has no global q scope? Perhaps using e0 inside p.k? This would be the equivalent of using Rset/Rget in the Kx rserver package to communicate with R from KDB. So what I want to do is this:

q)myFunc:{[x;y]
         set_variable_in_python[x;"py_var_x"];
         set_variable_in_python[y;"py_var_y"];
         // check vars are in Python
        .p.e["print(py_var_x)"];.p.e["print(py_var_y)"];
        // run some crazy python func
        .p.e["res = some_py_func(py_var_x,py_var_y)"];
        // fetch res from python
        kdb_res:get_var_from_python["res"];
        :kdb_res;
};

Is anything like this possible using PyQ? I know that if I use global scope, it can work by using pyq.q(), even if called from q, like this:

q)myFunc:{[x]
         global_x::x;
         .p.e["q.global_res = some_py_func(q('global_x'))"];
         :global_res;
};

However this is not efficient for a number of reasons, chiefly in that it prohibits using myFunc with peach.

Basically, is there something like the equivalent of pyq.q() but from the KDB side, in order to get and set variables into python? I want to do everything which pyq allows one to do from Python->KDB, but going the other way, KDB->python. Not sure if pyq can do this or not?

abalkin commented 7 years ago

While you cannot change binding of local python variables from q, you can modify mutable local variables as shown in the following snippet:

from pyq import q

def f():
    x = {}
    def set(name, val):
        x[str(name)] = val
        return q('::')
    q('{x(`a;42)}', set)
    print(x['a'])

f()

If you execute this code, it should print "42".
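
The difference between rebinding a name and mutating an object can be seen without q at all. The following is a minimal sketch in plain Python; `call_back` is a hypothetical stand-in for q invoking a Python function it was handed, and is not part of PyQ.

```python
def call_back(func, *args):
    # Hypothetical stand-in for q calling a Python function it was handed.
    return func(*args)


def rebind_attempt():
    x = None

    def set_value(val):
        x = val  # binds a new name local to set_value; the outer x is untouched

    call_back(set_value, 42)
    return x


def mutate_container():
    x = {}

    def set_value(name, val):
        x[str(name)] = val  # mutates the dict object shared with the enclosing scope

    call_back(set_value, "a", 42)
    return x["a"]


print(rebind_attempt())    # None
print(mutate_container())  # 42
```

The callee can only reach the enclosing function's state through an object both sides share, which is why the snippet above uses a dict rather than a plain local variable.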

However, I would advise against using tricks like this in peach. Since CPython has a global interpreter lock, you cannot achieve true parallelism, and pyq currently cannot deal properly with q objects created in threads other than the main thread.

trias702 commented 7 years ago

Thank you kindly for your help with this question. Unfortunately, I'm afraid I don't understand your reply. My original question involved writing only q code, from a running q instance, utilising PyQ's p.k, yet your answer example is from Python talking to q via pyq.q(). I need example code going the opposite way, e.g. a function executed in q which sets variables in python.

My goal is to create python variables from inside q functions, using only a running q process, meaning no call to the pyq executable or "q python.q". Basically something like this:

<start vanilla q from unix command line, no parameters passed>
q)p)import pyq
q){[] somehow set a variable in python from inside q function }[]
q)p)print(variable you created above)
q)\\

Is this even possible to do in the current pyq?

abalkin commented 7 years ago

Would this do what you want?

First, create a file set.p with the following code:

from pyq import q
def set(name, value):
    globals()[str(name)] = value
    return q('::')
q.pyset = set

Now, in q

q)\l set.p
q){pyset(`a;42)}[]
q)p)print(a)
42

abalkin commented 7 years ago

Going back to your original example

q)myFunc:{[x;y]
         set_variable_in_python[x;"py_var_x"];
         set_variable_in_python[y;"py_var_y"];
         // check vars are in Python
        .p.e["print(py_var_x)"];.p.e["print(py_var_y)"];
        // run some crazy python func
        .p.e["res = some_py_func(py_var_x,py_var_y)"];
        // fetch res from python
        kdb_res:get_var_from_python["res"];
        :kdb_res;}

You can define set_variable_in_python in terms of pyset:

q)set_variable_in_python:{pyset(y;x)}

Also, you can add

from pyq import K
def get(name):
    return K(globals()[str(name)])
q.pyget = get

to set.p and define get_var_from_python as

q)get_var_from_python:{pyget enlist x}
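
The set/get pair amounts to a name-keyed store. The sketch below models it in plain Python so the flow of the original myFunc can be followed without PyQ; the `namespace` dict is a hypothetical stand-in for the module globals that set.p writes to, and `py_set`/`py_get` are illustrative names only.

```python
namespace = {}  # hypothetical stand-in for the globals() of set.p


def py_set(name, value):
    # str() mirrors the conversion applied to the incoming name in set.p
    namespace[str(name)] = value


def py_get(name):
    # A missing name raises KeyError, just as globals()[...] would
    return namespace[str(name)]


py_set("py_var_x", 3)
py_set("py_var_y", 4)
namespace["res"] = py_get("py_var_x") + py_get("py_var_y")
print(py_get("res"))  # 7
```

Because every value lives in one shared namespace, two callers using the same variable names would overwrite each other, which is one reason passing arguments directly to an exported function is the cleaner design.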

However, if all you want is to pass x and y to some_py_func, I would recommend the following approach:

q)p)from some_module import some_py_func
q)p)q.some_py_func = some_py_func
q){res:some_py_func(x;y);:res}[]

Please be aware of the somewhat unusual calling convention for the python functions exported to q: they take a single argument that is a list of arguments to be passed to python.
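
As a rough model of this calling convention, the sketch below uses plain Python: `export` is a hypothetical helper (not a PyQ function) that mimics how the q-side caller supplies one list whose items become the positional arguments of the Python function.

```python
import math


def export(func):
    # Hypothetical model of an exported function: the caller supplies ONE
    # list, and its items become the positional arguments of func.
    def exported(arg_list):
        if not isinstance(arg_list, (list, tuple)):
            raise TypeError("type error: expected a list of arguments")
        return func(*arg_list)
    return exported


f = export(math.exp)
print(f([1]))      # corresponds to `f enlist 1` in q

g = export(pow)
print(g([2, 10]))  # corresponds to `g(2;10)` in q; prints 1024
```

In this model a one-argument function still needs a one-item list, so `f(1)` fails with a type error while `f([1])` succeeds.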

trias702 commented 7 years ago

Apologies in advance for a long post.

Thank you most kindly for the wonderful examples, I have done everything as you have stated, and it works quite well, although there seems to be a bug with your last example (number 3 below). I also have two questions (1 and 2):

1) When working in python with "from pyq import q", you can set variables inside q using:

q.some_variable = 4

and this is the same as doing

q)some_variable:4

But what is the syntax (from python) if you want to set a namespace variable inside q? So, how would I duplicate the following q code from python:

q).test.myvar:768768
q)show .test.myvar
768768

Is there a way to do this from python using pyq.q()? So basically, I want to do what you originally said:

q)p)q.some_py_func = some_py_func

But rather than have it called "some_py_func" in `. in q, i want to call it ".test.some_py_func"

2) In your "def get(name)" function, I notice you wrap the returned value from the globals() dictionary inside K(), which converts the python type to a KDB type. Does this mean that any python function which you want to call from q needs to have its return value wrapped in K()? Because if so, that means you cannot call some_py_func directly from q, since its return value is not wrapped in K()? Additionally, any variables you send to python from q using pyset becomes a pyq.K type, and not python native:

q)\l set.p
q)pyset[(`aa;7777)];
q)p)print(aa)
7777
q)p)print(type(aa))
<class 'pyq.K'>
q)p)print(type(aa+1000))
<class 'pyq.K'>
q)p)print(aa+1000)
8777
q)p)xx = 8777
q)p)print(type(xx))
<class 'int'>

3) Finally, I tried doing the following:

q)p)from some_module import some_py_func
q)p)q.some_py_func = some_py_func
q){res:some_py_func(x;y);:res}[]

Into set.p, I added the following python code:

import math
def myexp(x):
    return math.exp(x)

Then I tried the following from q:

q)\l set.p
q)p)print(myexp(2))
7.38905609893065
q)p)q.test = myexp
q)test
code[130026840;132072176]
q)test[2]
'type error
q)test[2i]
'type error
q)test[2f]
'type error

Even if I did the following, it doesn't work:

import math
def myexp(x):
    return K(math.exp(x))

This gives the same 'type error

What am I doing wrong?

My ultimate goal is to be able to call python NumPy/SciPy/Keras functions from q directly, so being able to bind python funcs to q funcs using "q)p)q.py_func = py_func" seems to be an amazing possibility, but it doesn't appear to work, is this because the arguments being passed are not converted from pyq.K types to native python types?

abalkin commented 7 years ago

What am I doing wrong?

Did you see my warning about an unusual calling convention? It is particularly noticeable when you export a one-argument function from python to q.

q)p)import math
q)p)def f(x): return math.exp(x)
q)p)q.f = f
q)f enlist 1
2.718282

The f function in q expects a list, so to pass 1 above we had to enlist it. If you don't – you get the type error:

q)f 1
'type error

abalkin commented 7 years ago

But what is the syntax (from python) if you want to set a namespace variable inside q?

There is no special syntax. You need to use the q.set function:

>>> q.set('.test.x', 42)
k('`.test.x')
>>> q()
q).test.x
42
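
A small helper can build the qualified name before calling `set`. This is a sketch under assumptions: `export_to_namespace` and `FakeQ` are hypothetical names, `FakeQ` exists only so the example runs without PyQ, and passing a Python function (rather than a value such as 42) to `q.set` is assumed rather than confirmed above.

```python
def export_to_namespace(q_handle, namespace, name, value):
    # Build a fully qualified q name such as ".test.some_py_func"
    qualified = "." + namespace.strip(".") + "." + name
    # q.set(name, value) is the call shown above; passing a function
    # as the value is an assumption
    q_handle.set(qualified, value)
    return qualified


class FakeQ:
    # Hypothetical recorder used only so the sketch runs without PyQ
    def __init__(self):
        self.assigned = {}

    def set(self, name, value):
        self.assigned[name] = value


fake = FakeQ()
print(export_to_namespace(fake, "test", "x", 42))  # .test.x
```

With PyQ installed, the `q` object imported from pyq would presumably be passed in place of `fake`.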

abalkin commented 7 years ago

Does this mean that any python function which you want to call from q needs to have its return value wrapped in K()?

In the upcoming release we are relaxing this restriction. However, the most effective use of python on kdb+ data is to use numpy functions and redirect output to q vectors.

q)p)import numpy
q)data:([]x:5?1f;y:5#0n)
q)data
x         y
-----------
0.3927524
0.5170911
0.5159796
0.4066642
0.1780839
q)p)numpy.log10(q.data.x, out=numpy.asarray(q.data.y))
q)data
x         y
--------------------
0.3927524 -0.4058812
0.5170911 -0.2864329
0.5159796 -0.2873674
0.4066642 -0.3907641
0.1780839 -0.7493754

trias702 commented 7 years ago

Ah yes, very sorry about the enlist mistake, especially since you just wrote to me about it, and I even used it in my own get_var_from_python function. Thank you very much for taking the time to help me with these issues, I greatly appreciate it.

So I just tried to replicate your work, and still got the type error, until I wrapped in K():

q)\l set.p
q)p)import math
q)p)def f(x): return math.exp(x)
q)p)q.f = f
q)f enlist 2
'py-type error     // weird??
q)p)def f(x): return K(math.exp(x))   # aha! I need to wrap in K()
q)p)q.f = f
q)f enlist 2
7.389056

I'm still not entirely clear on this K() requirement, it seems to me that I cannot just bind my favourite numpy/scipy functions to q functions using this system:

q)p)q.some_numpy_func = numpy.some_func

And then call the numpy funcs direct from q:

q)result:some_numpy_func[enlist some_val] // will always type error

The only way I can do this, is by creating a python wrapper which casts the return to K():

q)p)def wrap_some_func(x): return K(numpy.some_func(x))
q)p)q.some_numpy_func = wrap_some_func
q)result:some_numpy_func[enlist some_val]      // this will work

Now, if I use my original example, of pyset and pyget, I seem to sidestep this problem:

q)\l set.p
q)pyset(`aa;2);
q)p)print(aa)
2
q)p)import math
q)p)xx = math.exp(aa)
q)pyget enlist `xx
7.389056

The above works because the pyget function includes a K() cast:

def get(name):
    return K(globals()[str(name)])    # this makes all the difference
q.pyget = get

Am I correct that I cannot just bind every useful numpy/scipy function to q without somehow casting the python result in K() before returning the result to q?

Finally, just to make sure I understand your last example, "out=numpy.asarray(q.data.y)", this will only work if you have a global variable already defined in q, before you call the python func, correct? Which means, you cannot do it from inside a q function:

q)p)import numpy
q){[x] x:0n;
    .p.e "numpy.log10(0.5, out=numpy.asarray(q.x))";  // ERROR: variable x does not exist in q global scope
    :x;
};

However, if my understanding is correct, this will work:

q)p)import numpy
q){[x] x::0n;   // make x global scope
    .p.e "numpy.log10(0.5, out=numpy.asarray(q.x))";  // works now
    :x;
};

Am I correct in my understanding of everything?

abalkin commented 7 years ago

I just tried to replicate your work, and still got the type error, until I wrapped in K()

It will work without K() in the new version that we are about to release.

I cannot just bind my favourite numpy/scipy functions

We will consider adding this functionality in a future release.

"out=numpy.asarray(q.data.y)", this will only work if you have a global variable already defined in q, before you call the python func, correct?

If you want to use .p.e or p) syntax, then you are restricted to communication via global variables. This makes it only useful in the interactive session or in small scripts. For any non-trivial program you should wrap numpy methods in a function that can be exported to q. For example, with the following code in log.p,

import numpy
from pyq import q
def log(x, y):
    # write the result directly into the buffer behind y
    numpy.log(x, out=numpy.asarray(y))
    return q('::')
q.numpy_log = log

try

q)\l log.p
q)f:{r:count[x]#0n;numpy_log(x;r);r}
q)f 5?1f
-0.9345759 -0.6595362 -0.661688 -0.8997675 -1.725501

As you can see, numpy_log can work with local variables.
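
The underlying pattern is that the caller allocates the result buffer and the callee fills it in place. The sketch below shows that pattern with the standard library only; `array` and `math.log` are stand-ins for numpy and its `out=` parameter, and `log_into` is a hypothetical name.

```python
import math
from array import array


def log_into(src, out):
    # Write results into the caller's buffer instead of allocating a new one
    if len(src) != len(out):
        raise ValueError("src and out must have the same length")
    for i, value in enumerate(src):
        out[i] = math.log(value)
    return None  # the result lives in `out`, mirroring the q('::') return above


def f(values):
    result = array("d", [math.nan] * len(values))  # like r:count[x]#0n in q
    log_into(values, result)
    return result


print(list(f(array("d", [1.0, math.e]))))
```

Since the buffer is local to `f`, nothing is shared through global variables, which is the property that lets numpy_log work with q locals.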

vincentleung58 commented 7 years ago

Hi abalkin,

I followed your code

>>> from pyq import q
>>> def f():
...     x = {}
...     def set(name, val):
...         x[str(name)] = val
...     q('{x(`a;42)}', set)
...     print(x['a'])
...
>>> f()
error: py-type error

this doesn't work either

>>> def set(name, value): globals()[str(name)] = value
>>> q.pyset = set
>>> q()
q) {pyset(`a;42)}[]
'py-type error

this, however, works perfectly

>>> import math
>>> def myexp(x): return K(math.exp(x))
>>> q.myexp = myexp
>>> q()
q) myexp enlist 5
148.4132

Just curious if these features belong to a newer version. My version information is

PyQ 3.8.4
Numpy 1.10.4
KDB+ 3.4 (2016.06.14)
Python 2.7.11 | Anaconda 2.5.0 (32-bit)

abalkin commented 7 years ago

@vincentleung58 – in the released version python functions exported to q must return K objects. You should add return q('::') to the functions that implicitly return None. I'll edit my examples above to make sure they work with PyQ 3.8.
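
One way to avoid repeating this in every function is a decorator that converts return values on the way out. This is a sketch, not part of PyQ: `returns_k` is a hypothetical name, and the tuple-producing lambdas are stand-ins so the example runs without PyQ. With PyQ 3.8 one would presumably pass `convert=K` and `null=lambda: q('::')`.

```python
import functools


def returns_k(convert, null):
    # convert: callable turning a Python value into a q object (K in PyQ)
    # null:    callable producing the q generic null (lambda: q('::') in PyQ)
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            result = func(*args)
            return null() if result is None else convert(result)
        return wrapper
    return decorate


# Stand-ins so the sketch runs without PyQ installed
@returns_k(convert=lambda v: ("K", v), null=lambda: ("K", None))
def double(x):
    return 2 * x


@returns_k(convert=lambda v: ("K", v), null=lambda: ("K", None))
def store(x):
    pass  # implicitly returns None


print(double(21))  # ('K', 42)
print(store(1))    # ('K', None)
```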

trias702 commented 7 years ago

Hi Abalkin,

Thank you very much for taking the time to explain everything in a very clear and concise manner, I very much appreciate it! I understand now how to use Python from KDB, from best practices and design patterns to limitations. Thank you also for your very hard work in developing PyQ, as I can see now it is incredibly powerful in what it allows you to do with Python from KDB and vice versa. If only Kx's RServer worked more like PyQ then KDB would be unrivalled in power.

Do you know which upcoming version number of PyQ will drop the requirement that all Python functions which return to KDB need to wrap the return in K()?

trias702 commented 7 years ago

Sorry, one last question, regarding using peach in q with python bound functions, is it safe? I know you commented earlier on this, that the GIL would restrict it, and PyQ can only read from the q main thread, but that was before you introduced your last example to me. Specifically, with your final example to me:

import numpy
def log(x, y):
    numpy.log(x, out=numpy.asarray(y))
    return q('::')
q.numpy_log = log

q)\l log.p
q)f:{r:count[x]#0n;numpy_log(x;r);r}
q)f 5?1f
-0.9345759 -0.6595362 -0.661688 -0.8997675 -1.725501

Is it safe to extend this as follows with peach:

q)res:f peach (5?1f;5?1f;5?1f)

vincentleung58 commented 7 years ago

@abalkin thanks!

abalkin commented 7 years ago

@trias702 - it is currently not safe to use peach with functions that call python. However, multithreading and asynchronous programming are areas of active research and we hope to provide a better answer in future releases. Stay tuned!

abalkin commented 7 years ago

NB: See internal issue 889.

abalkin commented 7 years ago

With the improvements that went into PyQ 4.1.0, all the issues raised in this thread have been addressed.