Open sergiimk opened 5 years ago
Agreed. In our internal build we're wrapping the storage in an autoRefreshCache that returns service and operation names from cache and periodically refreshes them from real storage in the background. It's been working pretty well. I can commit that code to GitHub if someone is willing to work on adding a CLI flag for it (the refresh interval) and wiring the cache into the query-service main().
Badger storage loads the GetOperations and GetServices data only at startup and from then on keeps it updated in memory (with TTL purging happening during reads).
A generic caching approach would probably return stale replies in this case and perform a bit worse. I would assume something like Kafka / Badger could use subscription-based caching (DB writes automatically refresh the cached data), but that's obviously not possible for all backend types. So different caching mechanisms probably each have their place.
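To make the write-driven idea concrete, here is a minimal sketch in Go. It is not the Badger implementation: the Span type, WriteThroughIndex, and OnWrite are hypothetical names invented for illustration, and TTL purging is left out. It only shows that when the writer and the reader share a process, every write can update the service/operation index, so reads never see stale data.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Span is a hypothetical minimal span with only the fields the index needs.
type Span struct {
	Service   string
	Operation string
}

// WriteThroughIndex (hypothetical) records service and operation names as
// spans are written, so reads are answered from memory.
type WriteThroughIndex struct {
	mu  sync.RWMutex
	ops map[string]map[string]struct{} // service -> set of operation names
}

func NewWriteThroughIndex() *WriteThroughIndex {
	return &WriteThroughIndex{ops: make(map[string]map[string]struct{})}
}

// OnWrite would be called from the span write path.
func (w *WriteThroughIndex) OnWrite(s Span) {
	w.mu.Lock()
	defer w.mu.Unlock()
	set, ok := w.ops[s.Service]
	if !ok {
		set = make(map[string]struct{})
		w.ops[s.Service] = set
	}
	set[s.Operation] = struct{}{}
}

// Services returns all known service names in sorted order.
func (w *WriteThroughIndex) Services() []string {
	w.mu.RLock()
	defer w.mu.RUnlock()
	out := make([]string, 0, len(w.ops))
	for name := range w.ops {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

// Operations returns the sorted operation names seen for one service.
func (w *WriteThroughIndex) Operations(service string) []string {
	w.mu.RLock()
	defer w.mu.RUnlock()
	out := make([]string, 0, len(w.ops[service]))
	for name := range w.ops[service] {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	idx := NewWriteThroughIndex()
	idx.OnWrite(Span{Service: "frontend", Operation: "GET /"})
	idx.OnWrite(Span{Service: "frontend", Operation: "GET /dispatch"})
	idx.OnWrite(Span{Service: "driver", Operation: "FindNearest"})
	idx.OnWrite(Span{Service: "frontend", Operation: "GET /"}) // duplicate, no effect

	fmt.Println(idx.Services())             // [driver frontend]
	fmt.Println(idx.Operations("frontend")) // [GET / GET /dispatch]
}
```

This only works when the process answering queries also sees the writes, which is why it cannot serve as the generic mechanism for every backend.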
The caching would be optional, and it only makes sense in installations that have a lot of services and operations. In our case we have thousands of services and tens of thousands of operations. However, that data is pretty static.
An up-to-date remote cache shared between collectors and query services feels like overkill.
@yurishkuro I know this response is about a year+ late, but I'd be happy to take a stab at getting this cache pushed over the finish line. Would you mind sharing what you have so far?
At Uber there was a decorator for storage.Reader that cached the services/operations responses and returned the cached versions to the UI, while a timer loop in the background refreshed the cache every minute or so.
Requirement - what kind of business use case are you trying to solve?
We are implementing a custom gRPC-based storage plugin as per this doc.
Problem - what in Jaeger blocks you from solving the requirement?
The gRPC storage plugin is currently called upon every single UI interaction. For example, refreshing the main page will call GetServices and GetOperations. In the majority of cases these operations will involve costly external calls, and performing them for every user of Jaeger UI will quickly become a massive bottleneck. This means that implementing a usable plugin currently requires adding a lot of complex caching logic directly into the plugin.
Proposal - what do you suggest to solve the problem or improve the existing situation?
Jaeger already includes several implementations of caching, but they are specific to different storage backends. It would be great if generic caching logic existed in between Jaeger and a storage plugin, so that when implementing a plugin you didn't have to worry about caching the results and could focus on data access.
Any open questions to address