Currently, the parquet cache is populated only when a parquet file is persisted. After a server restart, recent parquet files are therefore not in the cache, and only files written after the restart get cached.
There should be a way to pre-populate the cache on server start, when loading the recent snapshot files.
There should be some configurable limits on this, e.g.:
- a limit on the number of files to request from the object store for caching
- a limit on how far back in time to look for files to cache
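
To make the two limits concrete, here is a rough sketch of how the files to pre-populate might be chosen at startup. It is in Python purely for illustration and is not the server's actual code: `ParquetFileMeta`, `select_files_to_prewarm`, `max_files`, and `lookback` are all hypothetical names, and it assumes each snapshot entry exposes the newest timestamp its file covers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ParquetFileMeta:
    """Hypothetical stand-in for a parquet file entry read from a snapshot."""
    path: str
    max_time: datetime  # newest timestamp covered by the file (assumed field)


def select_files_to_prewarm(files, now, max_files, lookback):
    """Pick the files to fetch into the cache on startup.

    Applies both limits: drop files older than the look-back window,
    then keep at most `max_files` of the remainder, newest first.
    """
    cutoff = now - lookback
    recent = [f for f in files if f.max_time >= cutoff]
    recent.sort(key=lambda f: f.max_time, reverse=True)
    return recent[:max_files]


# Example: four files, a 24-hour look-back, and a cap of two files.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
files = [
    ParquetFileMeta("a.parquet", now - timedelta(hours=1)),
    ParquetFileMeta("b.parquet", now - timedelta(hours=30)),  # outside window
    ParquetFileMeta("c.parquet", now - timedelta(minutes=5)),
    ParquetFileMeta("d.parquet", now - timedelta(hours=3)),
]
selected = select_files_to_prewarm(
    files, now, max_files=2, lookback=timedelta(hours=24)
)
```

Sorting newest first before applying the cap means that when the file limit is hit, the most recent data (the data most likely to be queried) is what ends up cached.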
If pre-populating would be too expensive in terms of object store requests, then we could consider modifying the cache to only cache files as needed, i.e., when they are requested.
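
The on-demand alternative is essentially a read-through cache. A minimal sketch, again in Python for illustration only: `ReadThroughCache` and its `fetch` callback are hypothetical, the callback stands in for an object store GET, and eviction is deliberately left out.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Caches a file the first time it is requested (no eviction in this sketch)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # hypothetical stand-in for an object store GET
        self._entries: Dict[str, bytes] = {}
        self.fetch_count = 0  # number of requests that reached the object store

    def get(self, path: str) -> bytes:
        if path not in self._entries:
            # Cache miss: go to the object store once and remember the result.
            self.fetch_count += 1
            self._entries[path] = self._fetch(path)
        return self._entries[path]


# Example with an in-memory dict playing the role of the object store.
fake_store = {"a.parquet": b"bytes-of-a"}
cache = ReadThroughCache(fake_store.__getitem__)
first = cache.get("a.parquet")   # miss: fetched from the store
second = cache.get("a.parquet")  # hit: served from the cache
```

With this approach a restart costs nothing up front, at the price of the first query against each file paying the object store latency.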