Open soxofaan opened 2 years ago
Some pointers related to the question of whether ZooKeeper is the right choice for state/caching (aka I want to close some browser tabs):
This is also related to #2 (under which I'm currently working on a caching layer in ZooKeeper)
It looks to me like ZooKeeper is mostly used for configuration/coordination, with slow writes but fast reads. I believe we should shift our focus to Redis or Elasticsearch for the caching layer. I mainly include Elasticsearch because it is already well established at VITO.
However, the main use case of Elasticsearch is search: monitoring (logs), analytics, content search. As a cache it is just 'ok', and it actually requires some extra work. So I'm convinced that Redis is likely the best option: it's designed for exactly our use case and doesn't require that much maintenance.
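For illustration, a minimal cache-aside sketch of what a Redis-backed lookup could look like. This is not existing project code: `get_or_compute`, the key, and the TTL are hypothetical names, and `client` is only assumed to behave like redis-py's `redis.Redis` (`get(name)` returning `None` on a miss, `set(name, value, ex=seconds)` storing with an expiry).

```python
import json
from typing import Any, Callable


def get_or_compute(client: Any, key: str, ttl_seconds: int,
                   compute: Callable[[], Any]) -> Any:
    """Return the cached value for `key`, computing and storing it on a miss.

    Assumption: `client` follows the redis-py `redis.Redis` interface,
    e.g. `client = redis.Redis(host=..., port=...)`.
    """
    raw = client.get(key)
    if raw is not None:
        # Cache hit: values are stored as JSON (redis-py returns bytes).
        return json.loads(raw)
    # Cache miss: compute, then store with an expiry so stale entries
    # disappear on their own instead of needing explicit invalidation.
    value = compute()
    client.set(key, json.dumps(value), ex=ttl_seconds)
    return value
```

Because every process talks to the same server, there is a single copy of each entry, and expiry is handled by the store itself.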
Caching is for now implemented in ZooKeeper. For job state, we're looking at Elasticsearch.
So I'm not sure if there's anything left to do here?
Caching is indeed implemented in ZooKeeper (because that was already available), but practical use shows that this is not a long-term solution. At the moment the problem is mitigated a bit by a second-level cache (per-process, in-memory dicts), which helps but also has disadvantages: we're back at the original problem of #2, where entries are hard to invalidate and the cache is again duplicated in isolated silos.
So the need for a better caching system, e.g. with Redis, is still on the table, but the problem is less urgent.
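To make the invalidation problem concrete, here is a rough sketch of a two-level cache: a per-process dict in front of a shared store. It is not the actual implementation; `TwoLevelCache`, `shared_get`, `shared_set`, and `local_ttl` are hypothetical names, and the shared store is stood in for by two plain callables.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TwoLevelCache:
    """Per-process in-memory dict (level 1) in front of a shared store (level 2)."""

    def __init__(self,
                 shared_get: Callable[[str], Optional[Any]],
                 shared_set: Callable[[str, Any], None],
                 local_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._shared_get = shared_get
        self._shared_set = shared_set
        self._local_ttl = local_ttl
        self._clock = clock
        # key -> (expiry time, value); private to this process.
        self._local: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        now = self._clock()
        entry = self._local.get(key)
        if entry is not None and entry[0] > now:
            # Served locally: may be stale if another process wrote meanwhile.
            return entry[1]
        value = self._shared_get(key)
        if value is not None:
            self._local[key] = (now + self._local_ttl, value)
        return value

    def set(self, key: str, value: Any) -> None:
        # Only this process's local copy is refreshed; other processes keep
        # their old copy until their local TTL runs out.
        self._shared_set(key, value)
        self._local[key] = (self._clock() + self._local_ttl, value)
```

With two instances sharing one store (simulating two processes), a `set` through one instance is not visible through the other until the other's local entry expires. That is the "isolated silos" drawback: a short local TTL bounds the staleness but does not remove it.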
from #52