exxeleron / qPython

interprocess communication between Python and kdb+
http://www.devnet.de
Apache License 2.0

Infinite Date Fails With Pandas version 0.20 and above #56

Open derekwisong opened 6 years ago

derekwisong commented 6 years ago

I also tried the latest release, 0.21.0, which just came out, and got the same result:

In [8]: pd.__version__
Out[8]: u'0.21.0'

Demonstration of failure:

In [1]: import qpython.qconnection
In [2]: qc = qpython.qconnection.QConnection(host="****", port=32423)
In [3]: qc.open()
In [4]: qc.sync('([]date:2017.01.01 0N 0Wd)', pandas=True)
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: -5877611-06-21 00:00:00

Demonstration of proper functionality when infinity is removed:

In [5]: qc.sync('([]date:2017.01.01 0Nd)', pandas=True)
Out[5]: 
        date
0 2017-01-01
1        NaT

According to the release notes (http://pandas.pydata.org/pandas-docs/version/0.20.3/whatsnew.html), pandas has added bounds checking to pd.Timestamp().

Demonstration of it working, after a fashion, in 0.19. It does not crash, but the infinite date silently overflows to an unrelated date in the past, which is probably undesirable:

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: u'0.19.2'
In [3]: import qpython.qconnection
In [4]: qc = qpython.qconnection.QConnection(host="*****", port=32423)
In [5]: qc.open()
In [6]: qc.sync('([]date:2017.01.01 0N 0Wd)', pandas=True)
Out[6]: 
                           date
0 2017-01-01 00:00:00.000000000
1                           NaT
2 1834-02-05 07:42:50.670153728
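
The thread does not say where the overflow in row 2 comes from. One reading that reproduces the value exactly is two's-complement wraparound: first in 32 bits when the q date (days since 2000.01.01) is shifted to the 1970 epoch, then in 64 bits when days are scaled to nanoseconds. The sketch below is plain-Python arithmetic under that assumption, not qPython's or pandas' actual code; `wrap_signed` is a hypothetical helper.

```python
from datetime import datetime, timedelta

Q_POS_INF = 2**31 - 1        # raw int32 value behind 0Wd
EPOCH_OFFSET = 10957         # days from 1970-01-01 to 2000-01-01
NS_PER_DAY = 86_400 * 10**9  # nanoseconds in one day


def wrap_signed(value, bits):
    """Reduce an integer to the two's-complement signed range of `bits` bits."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


# Assumption: the epoch shift happens in int32 and wraps around.
unix_days = wrap_signed(Q_POS_INF + EPOCH_OFFSET, 32)

# Assumption: scaling to nanoseconds happens in int64 and wraps again.
unix_ns = wrap_signed(unix_days * NS_PER_DAY, 64)

# datetime only stores microseconds, so keep the nanosecond part separately.
seconds, nanos = divmod(unix_ns, 10**9)
stamp = datetime(1970, 1, 1) + timedelta(seconds=seconds)
rendered = f"{stamp.isoformat(sep=' ')}.{nanos:09d}"

print(unix_days)  # -2147472692
print(rendered)   # 1834-02-05 07:42:50.670153728
```

The wrapped day count of -2147472692 lands roughly 5.88 million years before 1970, which is consistent with the year shown in the `OutOfBoundsDatetime` message on 0.20 and above.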

The same failure occurs with negative infinity, and I presume it affects the other temporal types too.

For pandas, dates must fall within the following range; otherwise this exception is thrown.

In [7]: pd.Timestamp.min
Out[7]: Timestamp('1677-09-21 00:12:43.145225')

In [8]: pd.Timestamp.max
Out[8]: Timestamp('2262-04-11 23:47:16.854775807')
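
One possible workaround is to clamp the raw q date integers to that range before they are turned into timestamps. The sketch below uses only the standard library and assumes the usual kdb+ encoding of dates as int32 days since 2000.01.01, with 0Nd as the int32 minimum and 0Wd / -0Wd as plus or minus the int32 maximum. `clamp_q_date` is a hypothetical helper, not part of qPython, and clamping is only one choice; mapping infinities to NaT would be another.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
Q_EPOCH = date(2000, 1, 1)                  # q dates count days from here
EPOCH_OFFSET = (Q_EPOCH - UNIX_EPOCH).days  # 10957

Q_NULL = -2**31           # 0Nd
Q_POS_INF = 2**31 - 1     # 0Wd
Q_NEG_INF = -(2**31 - 1)  # -0Wd

# Whole days (at midnight) that fit in signed 64-bit nanoseconds since 1970.
NS_PER_DAY = 86_400 * 10**9
MAX_UNIX_DAY = (2**63 - 1) // NS_PER_DAY
MIN_UNIX_DAY = -MAX_UNIX_DAY


def clamp_q_date(q_days):
    """Convert a raw q date integer to a date.

    Returns None for the q null, and clamps infinities (or any other
    out-of-range value) to the nearest day a nanosecond timestamp can hold.
    """
    if q_days == Q_NULL:
        return None
    unix_days = q_days + EPOCH_OFFSET
    unix_days = max(MIN_UNIX_DAY, min(MAX_UNIX_DAY, unix_days))
    return UNIX_EPOCH + timedelta(days=unix_days)


print(clamp_q_date(6210))       # 2017-01-01
print(clamp_q_date(Q_NULL))     # None
print(clamp_q_date(Q_POS_INF))  # 2262-04-11
print(clamp_q_date(Q_NEG_INF))  # 1677-09-22
```

The lower bound comes out one day later than `pd.Timestamp.min` because midnight on 1677-09-21 is itself out of range; only the part of that day after 00:12:43 is representable.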