exxeleron / qPython

interprocess communication between Python and kdb+
http://www.devnet.de
Apache License 2.0
152 stars 89 forks source link

Needs utility for casting from qtemporals (or raw temporals) into datetimes #6

Closed buckie closed 9 years ago

buckie commented 9 years ago

Not sure if this is even possible given the IPC protocol used by kdb, but is there a way to transparently cast from q's temporals into numpy datetimes? Preferably, a flag in QConnection.query or a global config that can be set that would automatically cast a query's results which are temporal to numpy datetimes.

I've made a flask-kdb extension using qpython, which for the record was a dream to work with (qpython and flask). To test out it's design, I made an in browser REPL link.

In doing so, the biggest issue I've had is the rendering of tables with datetimes in them:

  1. For qtemporal lists, there's no clean way to convert them. The current solution I've found for temporal lists link to code is little more than a hack.
  2. For tables with temporal value columns in them, there doesn't seem to be any way to know that the column is a temporal. I'm hoping to avoid a naming convention for column names hack to sneak the data in. Even if there were, I'd still be stuck with my temporal list hack.

I'd be welcome to some comments on what to do to resolve this. My current thinking is, if it's a kdb thing, the utility to convert the column could be as simple as running a \t <whatever> to get the metadata I need.

Feel free to close this issue if this is a purely kdb thing that you can't fix.

maciejlach commented 9 years ago

As a solution, I would propose to extend a QTemporalList class with a method which will create new numpy.array with representation via numpy datetime64/timedelta64.

It's possible to introspect the column types from qPython. Instances of QTable provides a meta attribute which contains q type codes per column in a dictionary:

>>> q.sync('table:: ([] pos:`d1`d2`d3;dates:(2001.01.01;2000.05.01;0Nd))')
>>> t = q.sync('table')
>>> print t.meta

metadata(dates=-14, qtype=98, pos=-11)

You can check whether q type code correspond to a temporal value, by testing against the constants defined in the qtype module, e.g.:

is_temporal = t.meta.dates in [QMONTH, QDATE, QDATETIME, QMINUTE, QSECOND, QTIME, QTIMESTAMP, QTIMESPAN]
buckie commented 9 years ago

Thanks for the prompt reply. I stumbled onto meta when I was stepping through with the debugger after sending this. Seems to be what I was looking for. I'm looking into making a conversion utility now and will send over a pull when it's done. In the meantime I have an honest question.

Is there a particular reason for not, in the future, just defaulting all temporals into proper numpy datetimes? I know that numpy's storage of datetimes in inefficient compared to kdb's, but the current state of temporals is cumbersome to use. My bet is that either yall have a specific reason (which I'm curious to hear) or yall just haven't had the time. Please don't misinterpret this question as me being judgmental or something, I'm just genuinely curious.

maciejlach commented 9 years ago

I've included an utility function in qtemporal for conversion between raw temporal array and numpy datetime64/timedelta64.

  >>> raw = numpy.array([366, 121, qnull(QDATE)])
  >>> print array_from_raw_qtemporal(raw, qtype = QDATE)
  ['2001-01-01' '2000-05-01' 'NaT']

Main reason for the representation in raw format was efficiency in memory representation of temporal vectors. Also we plan to allow user to configure mapping in final 1.0 release and choose between representation of temporal lists via either:

sugarmoose commented 9 years ago

thanks for this one - will come in handy over here as well, maybe even the pandas dataframes.

cheers, David.

On 29 September 2014 11:30, Maciej Lach notifications@github.com wrote:

Closed #6 https://github.com/exxeleron/qPython/issues/6 via 885af36 https://github.com/exxeleron/qPython/commit/885af36af3cad5058c0dd0184dc7289f8f471ec5 .

— Reply to this email directly or view it on GitHub https://github.com/exxeleron/qPython/issues/6#event-171401991.

maciejlach commented 9 years ago

You might also want to checkout the SHA: b0ba797b2c57b4d754cb9089eabc4451f69b2775 commit.

qPython can be instrumented to automatically represent q temporal vectors as numpy datetime64/timedelta64 arrays.

>>> v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = True)
>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)
<class 'qpython.qcollection.QList'> dtype: datetime64[D] qtype: -14: ['2001-01-01' '2000-05-01' 'NaT']

>>> v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = True)
>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)
<class 'qpython.qcollection.QList'> dtype: datetime64[ns] qtype: -12: ['2000-01-04T05:36:57.600000000+0100' 'NaT']

>>> v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = False)
>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)
<class 'qpython.qcollection.QTemporalList'> dtype: int32 qtype: -14: [2001-01-01 [metadata(qtype=-14)] 2000-05-01 [metadata(qtype=-14)]
NaT [metadata(qtype=-14)]]

>>> v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = False)
>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)
<class 'qpython.qcollection.QTemporalList'> dtype: int64 qtype: -12: [2000-01-04T05:36:57.600000000+0100 [metadata(qtype=-12)]
NaT [metadata(qtype=-12)]]

For details please refer to documentation:

sugarmoose commented 9 years ago

Excellent, about to do a bit of timeseries analysis with qPython for the first time, I'll flip that switch in the calls.

cheers, David.

On 14 October 2014 08:33, Maciej Lach notifications@github.com wrote:

You might also want to checkout the SHA: b0ba797 https://github.com/exxeleron/qPython/commit/b0ba797b2c57b4d754cb9089eabc4451f69b2775 commit.

qPython can be instrumented to automatically represent q temporal vectors as numpy datetime64/timedelta64 arrays.

v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = True)>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)<class 'qpython.qcollection.QList'> dtype: datetime64[D] qtype: -14: ['2001-01-01' '2000-05-01' 'NaT'] v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = True)>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)<class 'qpython.qcollection.QList'> dtype: datetime64[ns] qtype: -12: ['2000-01-04T05:36:57.600000000+0100' 'NaT'] v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = False)>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)<class 'qpython.qcollection.QTemporalList'> dtype: int32 qtype: -14: [2001-01-01 [metadata(qtype=-14)] 2000-05-01 [metadata(qtype=-14)]NaT [metadata(qtype=-14)]] v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = False)>>> print '%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v)<class 'qpython.qcollection.QTemporalList'> dtype: int64 qtype: -12: [2000-01-04T05:36:57.600000000+0100 [metadata(qtype=-12)]NaT [metadata(qtype=-12)]]

— Reply to this email directly or view it on GitHub https://github.com/exxeleron/qPython/issues/6#issuecomment-59000724.