man-group / arctic

High performance datastore for time series and tick data
https://arctic.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

Speedup tickstore read date index conversion #1021

Closed. BaiBaiHi closed this 8 months ago

BaiBaiHi commented 8 months ago

Speeds up the datetime index conversion in TickStore reads by ~40x.

Overall read runtime improves by around 3-4x on average, though this varies with MongoDB latency.

New implementation:
%timeit pd.DatetimeIndex(np.concatenate(rtn[INDEX]).astype('datetime64[ms]'), tz='UTC')
1.76 s ± 59.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Current implementation:
%timeit pd.to_datetime(np.concatenate(rtn[INDEX]), utc=True, unit='ms')
1min 16s ± 1.74 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

In the current implementation, the datetime conversion takes ~60% of the total runtime:

image

The new implementation brings that down to ~5%:

image

Output of both implementations is identical (tested on 22.5 million values).

image
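
For anyone who wants to reproduce the equivalence check without a MongoDB instance, here is a minimal sketch on synthetic data. It assumes the index chunks are int64 arrays of epoch milliseconds; the function names, the `epoch_ms` helper, and the sample values are made up for illustration and are not part of arctic. The comparison goes through epoch milliseconds because, depending on the pandas version, the two calls may return indexes with different internal resolutions.

```python
import numpy as np
import pandas as pd


def index_via_numpy_cast(chunks):
    """Cast epoch-millisecond integers to datetime64[ms] in NumPy, then wrap as UTC."""
    ms = np.concatenate(chunks)
    return pd.DatetimeIndex(ms.astype("datetime64[ms]"), tz="UTC")


def index_via_to_datetime(chunks):
    """Let pandas parse the epoch-millisecond integers directly."""
    ms = np.concatenate(chunks)
    return pd.to_datetime(ms, utc=True, unit="ms")


def epoch_ms(index):
    """Hypothetical helper: reduce a tz-aware DatetimeIndex to int64 epoch milliseconds."""
    # .values yields the underlying UTC datetime64 array regardless of resolution.
    return index.values.astype("datetime64[ms]").astype("int64")


# Synthetic stand-in for rtn[INDEX]: two chunks of millisecond timestamps.
start_ms = 1_600_000_000_000
chunks = [
    start_ms + np.arange(0, 5_000, dtype="int64"),
    start_ms + np.arange(5_000, 10_000, dtype="int64"),
]

fast = index_via_numpy_cast(chunks)
slow = index_via_to_datetime(chunks)

# Both paths should describe exactly the same instants.
same = np.array_equal(epoch_ms(fast), epoch_ms(slow))
print("identical:", same)
```

Wrapping each function call in `%timeit` with a larger synthetic array gives a rough comparison of the two paths; the actual ratio will depend on the pandas and NumPy versions in use.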

jamesmunro commented 8 months ago

Thanks!