man-group / arctic

High performance datastore for time series and tick data
https://arctic.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

TickStore float32 (f4) support #693

Open fl4p-old opened 5 years ago

fl4p-old commented 5 years ago

Casting to float32 (f4) before LZ4 compression reduces storage size by 25 to 40% and halves the memory used by reads. I tested with prices and various other time series for which 64-bit precision is not necessary.
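A minimal sketch of how such a comparison could be reproduced. It is not the measurement behind the numbers above: the price series is synthetic, and `zlib` from the standard library stands in for LZ4 so that the snippet has no extra dependency, which means the ratio it reports will differ from the LZ4 figures.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)

# Synthetic price series (an assumption for illustration): a random walk
# around 100, rounded to cents, stored as float64 by default.
prices64 = np.round(100 + np.cumsum(rng.normal(0, 0.01, 100_000)), 2)
prices32 = prices64.astype(np.float32)

# In-memory footprint: float32 uses 4 bytes per value instead of 8.
raw64, raw32 = prices64.nbytes, prices32.nbytes

# Compressed footprint. zlib is a stand-in here, not what TickStore uses.
packed64 = len(zlib.compress(prices64.tobytes()))
packed32 = len(zlib.compress(prices32.tobytes()))

# Largest relative error introduced by the cast to float32.
max_rel_err = float(
    np.max(np.abs(prices32.astype(np.float64) - prices64) / np.abs(prices64))
)

print(f"raw bytes:        {raw64} -> {raw32}")
print(f"compressed bytes: {packed64} -> {packed32}")
print(f"max relative error after cast: {max_rel_err:.2e}")
```

The relative error is the thing to check before adopting the cast: float32 carries roughly 7 significant decimal digits, which is enough for prices like these but not for values such as nanosecond timestamps or large cumulative volumes.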

I read somewhere in the code that TickStore supports only a limited set of types, for compatibility with a Java version. Is that still relevant?

Here is the drop-in code for writing and reading float32:

```
def arctic_float32_extension():
    import numpy as np
    import arctic.tickstore.tickstore
    from arctic.exceptions import UnhandledDtypeException
    from pandas._libs.lib import infer_dtype

    def _ensure_supported_dtypes(array):
        # We only support these types for now, as we need to read them in Java
        if array.dtype.kind == 'i':
            array = array.astype('
```

shashank88 commented 5 years ago

Good point, let me check if @jamesblackburn might know about this.

yschimke commented 5 years ago

Java compatibility is still required.

This shouldn't stop incremental improvements, but they should be done in a forwards and backwards compatible way, e.g. explicit feature flags and/or library metadata.
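One way the feature-flag idea could look, as a sketch only. The metadata key `float32_enabled` and both helper functions are hypothetical names made up for illustration; none of them are existing arctic options or APIs. The point is that libraries without the flag keep writing float64, so existing readers (including Java ones) see no change unless a library explicitly opts in.

```python
import numpy as np


def choose_float_dtype(library_metadata):
    """Pick the on-disk dtype for float columns.

    `float32_enabled` is a hypothetical metadata key, not an arctic setting.
    A missing key means the legacy behaviour: little-endian float64.
    """
    if library_metadata.get("float32_enabled", False):
        return np.dtype("<f4")
    return np.dtype("<f8")


def prepare_column(values, library_metadata):
    """Cast float columns to the dtype the library has opted into."""
    array = np.asarray(values)
    if array.dtype.kind == "f":
        return array.astype(choose_float_dtype(library_metadata))
    # Non-float columns are passed through untouched in this sketch.
    return array


# A library with no flag keeps the current float64 behaviour.
legacy = prepare_column([1.5, 2.25], {})

# A library that opted in gets float32.
opted_in = prepare_column([1.5, 2.25], {"float32_enabled": True})

print(legacy.dtype, opted_in.dtype)
```

Storing the flag with the library rather than passing it per call would also let a reader that does not understand float32 detect the flag and refuse to read, instead of misinterpreting the bytes.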