Building on #3
In the SLIM model: https://dl.acm.org/doi/abs/10.1145/3539618.3591977
There's a stage where the model loads all of the vectors into memory for manipulation. The vectors are stored in NumPy arrays, so loading all of them at once uses a lot of memory.
What if we replaced them with something like Arrow, read through DuckDB? Would we then get memory mapping (mmap) for free?