Dataset and modelling infrastructure for "event streams": sequences of continuous-time, multivariate events with complex internal dependencies.
I got a big performance boost by using pyarrow: write_parquet calls on my dataset ran 85% faster when running the build_dataset script. It may have to do with all the strings in my dataset.
df.write_parquet(fp, use_pyarrow=True)
It would be helpful to have configuration options for setting the parameters of the various polars functions.
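To make the request concrete, here is a rough sketch of what such an option could look like. The names ParquetWriteConfig, extra_kwargs, and write_df are hypothetical and not part of this repository or of polars; the only thing taken from polars is that DataFrame.write_parquet accepts a use_pyarrow keyword. The idea is just to collect keyword arguments in one config object and forward them to the call.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class ParquetWriteConfig:
    """Hypothetical config object holding keyword arguments for write_parquet."""

    # Mirrors the use_pyarrow keyword of polars' DataFrame.write_parquet.
    use_pyarrow: bool = False
    # Any further keyword arguments the user wants forwarded unchanged.
    extra_kwargs: dict[str, Any] = field(default_factory=dict)

    def to_kwargs(self) -> dict[str, Any]:
        """Merge the explicit fields and the extra keyword arguments."""
        return {"use_pyarrow": self.use_pyarrow, **self.extra_kwargs}


def write_df(df: Any, fp: Path | str, config: ParquetWriteConfig | None = None) -> None:
    """Write df to fp, forwarding the configured keyword arguments.

    df is assumed to expose a write_parquet(fp, **kwargs) method,
    as a polars DataFrame does.
    """
    config = config or ParquetWriteConfig()
    df.write_parquet(fp, **config.to_kwargs())
```

With something like this, the build_dataset script could call write_df(df, fp, ParquetWriteConfig(use_pyarrow=True)) and the choice of writer would live in the config rather than in the code.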