Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in two lines of code for 100x faster random access, vector indexing, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
Users can set the number of cached items, but that's hard to tune without knowing a lot of internal details about how large each metadata entry is. It would be easier to instead set the cache size in bytes.

We already require `impl DeepSizeOf`, so it should be very little effort to make eviction size-based.

We should have a deprecation cycle for the old item-based setting. It can assume a fixed entry size (2MB?) and use that to derive a value for the max bytes.
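A minimal sketch of what this could look like, using only the standard library. The `DeepSizeOf` trait here is a hand-rolled stand-in for the real trait from the `deepsize` crate, and `SizeBasedCache`, `Metadata`, and the 2 MiB `ASSUMED_ENTRY_SIZE` constant are all hypothetical names for illustration; eviction is plain FIFO rather than whatever policy the real cache uses:

```rust
use std::collections::VecDeque;

// Stand-in for the real `DeepSizeOf` trait (from the `deepsize` crate):
// entries report their total footprint in bytes, including heap allocations.
trait DeepSizeOf {
    fn deep_size_of(&self) -> usize;
}

// Hypothetical cached metadata entry.
struct Metadata {
    name: String,
    row_offsets: Vec<u64>,
}

impl DeepSizeOf for Metadata {
    fn deep_size_of(&self) -> usize {
        std::mem::size_of::<Self>()
            + self.name.capacity()
            + self.row_offsets.capacity() * std::mem::size_of::<u64>()
    }
}

/// Size-based cache: evicts oldest entries once the summed
/// `deep_size_of` of all entries exceeds `max_bytes`.
struct SizeBasedCache<T: DeepSizeOf> {
    max_bytes: usize,
    used_bytes: usize,
    entries: VecDeque<T>,
}

impl<T: DeepSizeOf> SizeBasedCache<T> {
    fn with_max_bytes(max_bytes: usize) -> Self {
        Self { max_bytes, used_bytes: 0, entries: VecDeque::new() }
    }

    /// Deprecation shim: translate the old item-count setting into a
    /// byte budget by assuming a fixed ~2 MiB per entry.
    fn with_max_items(max_items: usize) -> Self {
        const ASSUMED_ENTRY_SIZE: usize = 2 * 1024 * 1024;
        Self::with_max_bytes(max_items * ASSUMED_ENTRY_SIZE)
    }

    fn insert(&mut self, value: T) {
        self.used_bytes += value.deep_size_of();
        self.entries.push_back(value);
        // Evict from the front (FIFO here; the real cache could be LRU)
        // until we are back under the byte budget.
        while self.used_bytes > self.max_bytes {
            match self.entries.pop_front() {
                Some(evicted) => self.used_bytes -= evicted.deep_size_of(),
                None => break,
            }
        }
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

fn main() {
    let mut cache = SizeBasedCache::with_max_bytes(4096);
    for i in 0..10 {
        cache.insert(Metadata {
            name: format!("fragment-{i}"),
            row_offsets: vec![0; 100], // 800 bytes of offsets per entry
        });
    }
    // Older entries were evicted to stay under the 4 KiB budget.
    println!("entries kept: {}, bytes used: {}", cache.len(), cache.used_bytes);
    assert!(cache.used_bytes <= 4096);
}
```

The key point is that the eviction check compares accumulated bytes rather than a count, so entries of very different sizes are charged fairly, while `with_max_items` keeps old configurations working through a deprecation cycle.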