man-group / arctic

High performance datastore for time series and tick data
https://arctic.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

VersionStore min and max dates #905

Closed by pxlogpx 1 year ago

pxlogpx commented 3 years ago

Is there a way to get the minimum and maximum dates of a dataframe stored in VersionStore? The dataframe has a datetime index. Something similar to lib.min_date and lib.max_date in TickStore?

shashank88 commented 3 years ago

https://github.com/man-group/arctic/blob/master/arctic/store/version_store.py#L386 should return the date range iirc.

pxlogpx commented 3 years ago

> https://github.com/man-group/arctic/blob/master/arctic/store/version_store.py#L386 should return the date range iirc.

It doesn't, see below:

[Screenshot: Capture]

jasonlocal commented 2 years ago

@pxlogpx I don't think VersionStore stores the min/max date of a dataframe out of the box, but one solution could be to update the code so that it stores the min/max date in the metadata whenever VersionStore.write() is called. I've tweaked the code to support this, and the following session shows the result. Please let me know if this solution resolves your issue.

In [8]: df
Out[8]:
            values
date
2022-01-30       1
2022-01-31       2

In [9]: lib.write('test_min_max_mata', df)
Out[9]: VersionedItem(symbol=test_min_max_mata,library=arctic.test_lib,data=<class 'NoneType'>,version=2,metadata={'min_date': Timestamp('2022-01-30 00:00:00'), 'max_date': Timestamp('2022-01-31 00:00:00')},host=localhost)

In [10]: lib.read('test_min_max_mata').data
Out[10]:
            values
date
2022-01-30       1
2022-01-31       2

In [11]: lib.read_metadata('test_min_max_mata').metadata['min_date']
Out[11]: datetime.datetime(2022, 1, 30, 0, 0)

In [12]: lib.read_metadata('test_min_max_mata').metadata['max_date']
Out[12]: datetime.datetime(2022, 1, 31, 0, 0)
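
For anyone who would rather not patch the library, a similar result should be reachable from the caller's side, since VersionStore.write() accepts a metadata argument. The sketch below is not the tweak shown above: `date_range_metadata` and `write_with_date_range` are hypothetical helper names, and `lib` is assumed to be an existing VersionStore library.

```python
import pandas as pd


def date_range_metadata(df, extra=None):
    """Build a metadata dict holding the first and last index timestamps.

    Hypothetical helper, not part of arctic.
    """
    if not isinstance(df.index, pd.DatetimeIndex):
        raise TypeError("expected a DataFrame with a DatetimeIndex")
    metadata = dict(extra or {})
    if len(df.index):
        # Store plain datetime objects rather than pandas Timestamps.
        metadata["min_date"] = df.index.min().to_pydatetime()
        metadata["max_date"] = df.index.max().to_pydatetime()
    return metadata


def write_with_date_range(lib, symbol, df, **kwargs):
    """Hypothetical wrapper: write df and record its date range as metadata.

    `lib` is assumed to be a VersionStore library obtained elsewhere.
    """
    return lib.write(symbol, df, metadata=date_range_metadata(df), **kwargs)


# Same two-row frame as in the session above.
df = pd.DataFrame(
    {"values": [1, 2]},
    index=pd.DatetimeIndex(["2022-01-30", "2022-01-31"], name="date"),
)
meta = date_range_metadata(df)
```

After `write_with_date_range(lib, 'some_symbol', df)`, the dates can be read back with `lib.read_metadata('some_symbol').metadata['min_date']` without loading the dataframe itself. One caveat under this approach: the metadata only reflects the frame passed at write time, so any later append would need to pass refreshed metadata as well.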