man-group / arctic

High performance datastore for time series and tick data
https://arctic.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

Reading from TickStore with date_range returns incomplete data in a very specific case #819

Open scoriiu opened 4 years ago

scoriiu commented 4 years ago

Arctic Version

1.79.2

Arctic Store

TickStore

Platform and version

MacOS 10.15

Description of problem and/or code sample that reproduces the issue

Reading from TickStore with a date_range whose start equals its end returns incomplete data when the chunks are aligned as shown below. Consider the following chunk structure in the TickStore database:

replocal:PRIMARY> db.ticks.find({}, {"s": 1, "e": 1})
{ "_id" : ObjectId("5da041d8a43eb81023711fe5"), "e" : ISODate("2019-10-11T08:51:32.552Z"), "s" : ISODate("2019-10-11T08:48:11.160Z") }
{ "_id" : ObjectId("5da04295a43eb810237120bf"), "e" : ISODate("2019-10-11T08:55:14.898Z"), "s" : ISODate("2019-10-11T08:51:33.225Z") }
{ "_id" : ObjectId("5da04373121e05ca64e80f03"), "e" : ISODate("2019-10-11T08:57:38.476Z"), "s" : ISODate("2019-10-11T08:55:14.898Z") }
{ "_id" : ObjectId("5da0440c100c8aea571b9d66"), "e" : ISODate("2019-10-11T08:58:59.512Z"), "s" : ISODate("2019-10-11T08:57:38.518Z") }
{ "_id" : ObjectId("5da04453100c8aea571b9ddb"), "e" : ISODate("2019-10-11T08:58:59.512Z"), "s" : ISODate("2019-10-11T08:58:59.512Z") }

Now consider the function call: self.lib_ticks.read(self.args.symbol, date_range=DateRange('2019-10-11 08:58:59.512000+00:00', '2019-10-11 08:58:59.512000+00:00'))

This would result in the following query being executed on the collection:

query:<class 'dict'>: {'sy': 'xbtusd', 's': {
    '$gte': datetime.datetime(2019, 10, 11, 8, 58, 59, 512000, tzinfo=tzfile('/usr/share/zoneinfo/UTC')), 
    '$lte': datetime.datetime(2019, 10, 11, 8, 58, 59, 512000, tzinfo=tzutc())
    }}

This returns the data from the last chunk, but not from the previous one, whose 'e' also matches the date_range.
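To make the mismatch concrete, here is a minimal sketch in plain Python that models the five chunks above as dicts and compares two selection rules. It does not use arctic or MongoDB. The helper names (`ts`, `chunks_by_start_only`, `chunks_by_overlap`) are hypothetical, and the overlap rule is only an assumption about what a complete read would need, not arctic's actual implementation.

```python
from datetime import datetime, timezone


def ts(text):
    """Parse an ISO-like timestamp with milliseconds as a UTC datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=timezone.utc)


# Chunk boundaries copied from the db.ticks.find() output above.
CHUNKS = [
    {"s": ts("2019-10-11T08:48:11.160"), "e": ts("2019-10-11T08:51:32.552")},
    {"s": ts("2019-10-11T08:51:33.225"), "e": ts("2019-10-11T08:55:14.898")},
    {"s": ts("2019-10-11T08:55:14.898"), "e": ts("2019-10-11T08:57:38.476")},
    {"s": ts("2019-10-11T08:57:38.518"), "e": ts("2019-10-11T08:58:59.512")},
    {"s": ts("2019-10-11T08:58:59.512"), "e": ts("2019-10-11T08:58:59.512")},
]


def chunks_by_start_only(chunks, start, end):
    """Mimic the observed query: keep chunks whose 's' lies in [start, end]."""
    return [c for c in chunks if start <= c["s"] <= end]


def chunks_by_overlap(chunks, start, end):
    """Hypothetical alternative: keep chunks whose [s, e] interval overlaps [start, end]."""
    return [c for c in chunks if c["s"] <= end and c["e"] >= start]


point = ts("2019-10-11T08:58:59.512")
start_only = chunks_by_start_only(CHUNKS, point, point)
overlap = chunks_by_overlap(CHUNKS, point, point)

print("start-only filter:", len(start_only), "chunk(s)")
print("overlap filter:   ", len(overlap), "chunk(s)")
```

With these boundaries the start-only filter selects just the last chunk, while the overlap filter also selects the chunk starting at 08:57:38.518, whose 'e' equals the requested timestamp.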

DylanModesitt commented 4 years ago

I often run into this issue with the arctic VersionStore as well: when a multi-index on the frame (repeated dates) causes a date range to span multiple chunks, reads become incomplete.