TileDB-Inc / TileDB

The Universal Storage Engine
https://tiledb.com
MIT License

cache for cloud-backend #1366

Open mikejiang opened 5 years ago

mikejiang commented 5 years ago

When a read query causes a tile to be read from disk, TileDB places the uncompressed tile in an in-memory LRU cache associated with the query’s context. When subsequent read queries in the same context request that tile, it may be copied directly from the cache instead of re-fetched from disk.

I assume this in-memory cache also applies automatically to cloud backends? Is there any additional cache mechanism that stores tiles on local disk when they exceed available memory, so that already visited tiles are not fetched over the network again?
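
For readers following the thread, the in-memory behaviour quoted above can be sketched roughly as follows. This is only an illustration of a byte-budgeted LRU cache, not TileDB's implementation: `TileLRUCache`, `read_tile`, and the `fetch` callback are hypothetical names, and `fetch` stands in for whatever loads a tile from local disk or an object store.

```python
from collections import OrderedDict


class TileLRUCache:
    """Illustrative byte-budgeted LRU cache; not TileDB's actual implementation."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._tiles = OrderedDict()  # key -> uncompressed tile bytes, oldest first

    def get(self, key):
        tile = self._tiles.get(key)
        if tile is not None:
            self._tiles.move_to_end(key)  # mark as most recently used
        return tile

    def put(self, key, tile):
        if len(tile) > self.max_bytes:
            return  # a tile larger than the whole budget is never cached
        old = self._tiles.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._tiles[key] = tile
        self.used_bytes += len(tile)
        while self.used_bytes > self.max_bytes:
            _, evicted = self._tiles.popitem(last=False)  # drop least recently used
            self.used_bytes -= len(evicted)


def read_tile(cache, key, fetch):
    """Return the tile from the cache, or call fetch(key) and cache the result."""
    tile = cache.get(key)
    if tile is None:
        tile = fetch(key)  # hypothetical loader: local disk or a cloud object store
        cache.put(key, tile)
    return tile
```

With a cloud backend, the only part that changes in this sketch is what `fetch` does; the cache itself does not care where the bytes came from, which is why the in-memory cache would carry over while anything larger than memory still goes back to the network.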

stavrospapadopoulos commented 5 years ago

Not at the moment, but that would be a great feature (we had raised that internally a while ago). We'll keep the issue open and fit it in our roadmap. Thanks for pointing it out.

mikejiang commented 5 years ago

Just to clarify: the in-memory cache is already available for cloud backends, but a disk cache is not, right?

jakebolewski commented 5 years ago

Yes, that is correct. What would really be needed is a persistent LRU cache (shared across all TileDB processes and threads on a node), which is achievable with an embedded KV database or something similar.
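
A minimal sketch of what such a persistent LRU cache might look like, using SQLite from the Python standard library as a stand-in for the embedded database. Everything here is an assumption for illustration: the `DiskTileCache` class, the `tiles` table schema, and the logical-clock eviction order are hypothetical and not part of TileDB.

```python
import sqlite3


class DiskTileCache:
    """Illustrative persistent LRU cache backed by SQLite; schema is an assumption."""

    def __init__(self, path, max_bytes):
        self.max_bytes = max_bytes
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            "key TEXT PRIMARY KEY, data BLOB NOT NULL, "
            "size INTEGER NOT NULL, last_used INTEGER NOT NULL)"
        )
        self.db.commit()

    def _tick(self):
        # Logical clock: one more than the newest access recorded in the file.
        row = self.db.execute("SELECT COALESCE(MAX(last_used), 0) + 1 FROM tiles")
        return row.fetchone()[0]

    def get(self, key):
        with self.db:  # commits on success, rolls back on error
            row = self.db.execute(
                "SELECT data FROM tiles WHERE key = ?", (key,)
            ).fetchone()
            if row is None:
                return None
            self.db.execute(
                "UPDATE tiles SET last_used = ? WHERE key = ?", (self._tick(), key)
            )
            return bytes(row[0])

    def put(self, key, tile):
        if len(tile) > self.max_bytes:
            return  # never cache a tile larger than the whole budget
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
                (key, tile, len(tile), self._tick()),
            )
            total = self.db.execute(
                "SELECT COALESCE(SUM(size), 0) FROM tiles"
            ).fetchone()[0]
            while total > self.max_bytes:
                # Evict the least recently used row until the budget is met.
                victim, size = self.db.execute(
                    "SELECT key, size FROM tiles ORDER BY last_used ASC LIMIT 1"
                ).fetchone()
                self.db.execute("DELETE FROM tiles WHERE key = ?", (victim,))
                total -= size
```

Because the state lives in a file, a second process opening the same path sees tiles cached by the first. The sketch leaves out what a real shared cache would need, such as handling lock contention between writers and invalidating tiles when the underlying array changes.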

mikejiang commented 5 years ago

That will be fantastic!

mstump commented 4 years ago

I'd like to request that the cache be implemented with a plugin-type architecture (a function pointer or an array of function pointers). That way we could use disk, memcache, or a series of tiered caches.
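
One way to picture this request, as a rough sketch rather than a proposed API: each backend is described by a pair of callables (the Python analogue of a table of function pointers), and a tiered cache consults them fastest-first. `CacheBackend`, `dict_backend`, and `TieredCache` are hypothetical names; the dict-backed tiers below only stand in for real memory, disk, or memcache backends.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class CacheBackend:
    """A cache plugin described by two callables, like a table of function pointers."""
    name: str
    get: Callable[[str], Optional[bytes]]
    put: Callable[[str, bytes], None]


def dict_backend(name: str) -> CacheBackend:
    """Toy backend over a dict; a disk or memcache backend would supply its own get/put."""
    store: Dict[str, bytes] = {}
    return CacheBackend(name, store.get, store.__setitem__)


class TieredCache:
    """Checks tiers fastest-first and promotes hits into the faster tiers."""

    def __init__(self, tiers: List[CacheBackend]):
        self.tiers = tiers

    def get(self, key: str) -> Optional[bytes]:
        for i, tier in enumerate(self.tiers):
            tile = tier.get(key)
            if tile is not None:
                for faster in self.tiers[:i]:
                    faster.put(key, tile)  # copy the hit into every faster tier
                return tile
        return None

    def put(self, key: str, tile: bytes) -> None:
        for tier in self.tiers:
            tier.put(key, tile)  # write-through to all tiers
```

For example, `TieredCache([dict_backend("memory"), dict_backend("disk")])` would serve a tile found only in the second tier and copy it into the first on the way out. Write-through on `put` is just one possible policy; a real design would also need per-tier eviction and size limits.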