Closed severo closed 3 years ago
Since https://github.com/huggingface/datasets-preview-backend/releases/tag/0.8.6, the cache is persisted (in `/home/hf/.cache/datasets_preview_backend` by default). `make warm` warms the cache, but when I launched it on the server, I lost access to the server (see https://betteruptime.com/team/14149/incidents/166660371). We can check the current state of the warming process by querying:
Done
Warm the cache at application startup. We want:
to avoid blocking the application, so: run asynchronously, and without hammering the server
to have a warm cache as fast as possible (persist the previous cache, then refresh it at startup? Related: #35)
[x] create a function to list all the datasets and fill the cache for all the possible requests for each of them. It might be `make benchmark`, or a specific function -> `make warm`
[x] persist the cache? or start with an empty cache when the application is restarted? -> yes, persisted
[x] launch it at application startup -> it's done at startup, see INSTALL.md.
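
The two goals above (run asynchronously, without hammering the server) can be illustrated with a minimal sketch. This is not the project's actual implementation: `warm_cache`, `fill_cache`, and `max_concurrency` are hypothetical names chosen for illustration, and the sketch assumes the per-dataset cache-filling step is available as a coroutine. A semaphore caps how many datasets are processed at once, and failures are recorded per dataset instead of aborting the whole run.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def warm_cache(
    dataset_names: Iterable[str],
    fill_cache: Callable[[str], Awaitable[None]],  # hypothetical per-dataset filler
    max_concurrency: int = 2,  # assumed limit; tune to what the server tolerates
) -> Dict[str, str]:
    """Fill the cache for every dataset, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    status: Dict[str, str] = {}

    async def warm_one(name: str) -> None:
        # The semaphore bounds the number of datasets being warmed concurrently.
        async with semaphore:
            try:
                await fill_cache(name)
                status[name] = "ok"
            except Exception as err:
                # One failing dataset must not stop the rest of the warm-up.
                status[name] = f"error: {err}"

    await asyncio.gather(*(warm_one(name) for name in dataset_names))
    return status


async def _demo() -> None:
    async def fake_fill(name: str) -> None:
        # Stand-in for the real cache-filling work.
        await asyncio.sleep(0)

    # Scheduling the warm-up as a task lets the caller keep doing other work.
    task = asyncio.create_task(warm_cache(["dataset_a", "dataset_b"], fake_fill))
    print(await task)


if __name__ == "__main__":
    asyncio.run(_demo())
```

In an ASGI application, the `asyncio.create_task(...)` call would typically live in a startup hook, so that the server starts answering requests while the cache is still being warmed.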