rom1504 / embedding-reader

Efficiently read embedding in streaming from any filesystem
MIT License
92 stars 19 forks

faster numpy parquet #25

Closed: rom1504 closed this 2 years ago

rom1504 commented 2 years ago

This is faster because it reuses the same pyarrow table instead of rereading all the parquet files many times (parquet cannot be sliced on disk).