spring-projects / spring-boot

Spring Boot
https://spring.io/projects/spring-boot
Apache License 2.0
74.21k stars 40.47k forks source link

Automatically create metrics for caches that are created on-the-fly #14576

Open joshiste opened 5 years ago

joshiste commented 5 years ago

Currently the CacheMetricsRegistrarConfiguration binds the meters for the cache related metrics in a @PostContstruct for all known cache names.

Looking at the spring-boot-sample-cache project and the various caches implementations, most of them don't provide the cache names at this moment and therefore no metrics are provided for these.

snicoll commented 5 years ago

The description of this issue is what is advertized in the documentation.

most of them don't provide the cache names at this moment

The cache name is a mandatory argument for a cache so I am not sure I understand what you are reporting.

joshiste commented 5 years ago

Ok missed that in the docs. But I wonder if things can be improved?

And also spring-boot-sample-cache should make use of the CacheMetricsRegistrar option (which it currently doesn't). I'd volunteer for a PR on that.

snicoll commented 5 years ago

I don't see how things can be improved at the moment. This was documented after investigating how we could make it better. If you find a way, I am more than happy to review a PR.

As for the cache sample, I still don't understand what you mean, sorry. Not all caches are supported by micrometer so that might be the reason why you don't see metrics. If you have more questions, please ask on StackOverflow or come chat with us on Gitter.

tkvangorder commented 5 years ago

@snicoll I ran into this same issue and had some ideas on how to handle it. I am willing to put a PR forward, but I am looking for guidance on which approach:

The current auto-configuration for cache metrics will only bind caches that are known at start-up. This means that in the case where developers are using @Cacheable, any cache that is created "on-the-fly" will not have its metrics bound to the MeterRegistry.

This is surprising behavior and it seems reasonable to expect all caches, dynamic or otherwise, to have their metrics exposed through micrometer.

This is actually a tricky problem because each CacheManager implementation must be meter-aware. Specifically, a cache should be bound to the registry when CacheManager.getCache(String name) is called. The best place to do this would be in the AbstractCacheManager (in the core Spring Framework), however, the CacheMetricsRegistrar (used to bind a cache to the MeterRegistry) is in the Spring Boot Actuator project.

I can think of two ways to make this work correctly, such that when the actuator library is added to the classpath all caches have their metrics bound to the MeterRegistry :

philwebb commented 5 years ago

I personally prefer to avoid aspects when at all possible, but the event idea sounds quite nice to me.

snicoll commented 5 years ago

I like the idea of the event. Please, create an issue in the framework issue tracker and I'll give that a try.

tkvangorder commented 5 years ago

@snicoll I have created that issue here: https://jira.spring.io/browse/SPR-17350

Thanks for looking into it and please feel free to ping me with any questions.

wilkinsona commented 5 years ago

I've re-opened this so it doesn't fall through the cracks. It's blocked until the Framework issue reaches a conclusion.

snicoll commented 5 years ago

We need a bit more time to investigate what contract is reasonable at that level. I've commented on the framework issue.

tkvangorder commented 5 years ago

After reading the feedback from @snicoll, there is not a concrete mechanism to enforce that each cache management implementation emits an event when a cache is created at startup or dynamically. After thinking about this some more, maybe we can steal a play from spring cloud sleuth: What if the actuator library creates a BeanPostProcessor that wraps the CacheManager in a MetricsAwareCacheManager? This manager can keep track of which names have/have not been bound to the meter registry.

It would be similar to this:

https://github.com/spring-cloud/spring-cloud-sleuth/blob/master/spring-cloud-sleuth-core/src/main/java/org/springframework/cloud/sleuth/instrument/async/AsyncCustomAutoConfiguration.java

tkvangorder commented 5 years ago

I prototyped the use of a BeanPostProcessor to create a wrapper around the CacheManager and while this works, it is quite messy. The proxy has to keep track of which names have/have not been bound to the meter registry and must check that list each time CacheManager.getCache() is called. Additionally, the use of a wrapper will preclude the ability to Autowire the CacheManager by type which is not ideal. It still looks like the best option is to generate an event from each CacheManager implementation.

karesti commented 5 years ago

Hi, sorry if I'm missing something here. I've been playing with the spring boot cache sample. https://github.com/spring-projects/spring-boot/tree/master/spring-boot-samples/spring-boot-sample-cache

I understand the issue of the caches created on the fly. When I run the examples, if I don't add the spring.cache.cache-names=countries on the file, I don't get any caffeine stats on the countries cache. But when I use the hazelcast profile, I would expect to have these stats because the hazelcast.xml file is provided. Infinispan does not work, but that's understandable since it's not supported.

snicoll commented 5 years ago

@karesti please ask questions on StackOverflow or join us on Gitter.

jorgheymans commented 4 years ago

Coming here from #19412 , it's worth mentioning that caches created through @CacheConfig are also considered 'on the fly'.

Even though the documentation says that the metric registry only binds caches that are available at startup. Intuitively, i would have thought that @CacheConfig is also considered as being available at startup.

Only caches that are available on startup are bound to the registry. For caches created on-the-fly or programmatically after the startup phase, an explicit registration is required. A CacheMetricsRegistrar bean is made available to make that process easier.

snicoll commented 4 years ago

@jorgheymans, @CacheConfigis irrelevant here. It's just that you're using a cache provider that is configured to create a cache on the fly if it does not exist. Referencing the name of the cache in your code via @CacheConfig is in no way a registration.

wilkinsona commented 4 years ago

I think it's worth noting this in the existing section in the documentation. I'll re-open #19412 for that purpose.

DanielYWoo commented 4 years ago

Option 1: to send event, or extend from a common base class means all the existing cache manager needs to be changed. Option 2: AOP can avoid changes to the existing classes but it's implicit and error prone. It's a balance. Let's first choose which way we head for.

If we choose option 1, I would prefer changing all the cache managers to extend a new base class than sending event.

We can create a new base class in the actuator project, then the dependency to micrometer is not a problem anymore. We can implement all subclasses for different cache providers in actuator. To use it, we need to check the property "management.metrics.enable", if it's set to all or cache, create the metrics aware cache manager, otherwise use the default cache manager.

To implement this, it's better to extend from the AbstractTransactionAwareCacheManager for several reasons: 1) we can simply override getMissingCache() to register the cache created on the fly. very simple. 2) we don't need to write any code to maintain the "cacheMap" related code. (see AbstractCacheManager) 3) we don't need to write any code to decorate the transaction aware wrapper if user needs it to be tx aware.

However, spring boot is very flexible, users may create their own cache configuration, it's possible to ignore our logic here and create their own cache managers. So what we do here is to provide the metrics aware cache managers and have a reasonable default initialization configuration, but we cannot promise or constrain how users use it.

snicoll commented 4 years ago

Thanks for the suggestion but wrapping the CacheManager is not something I am keen to do as it would change the bean type that gets registered and is going to affect users that expect the cache library type they're using. I've just declined https://github.com/spring-projects/spring-framework/issues/21884 so the event is not an option at this stage either.

I'd like to explore some options using Micrometer itself. I've pinged the team to see if we could do something in that direction.

bert82503 commented 1 week ago

I use spring-boot-starter-parent:2.7.18.

  1. enable cache @EnableCaching

    @Slf4j
    @EnableCaching
    @Configuration(proxyBeanMethods = false)
    @AutoConfigureBefore({RedisAutoConfiguration.class, RedisReactiveAutoConfiguration.class})
    public class RedisConfiguration {
    public RedisConfiguration() {
        log.info("create RedisConfiguration");
    }
    
    // ...
    }
  2. add cache annotation on service method @Cacheable(cacheNames = "user.info", key = "#userId")

    @Slf4j
    @Service
    public class UserServiceImpl implements UserService {
    public UserServiceImpl() {
        log.info("create UserServiceImpl");
    }
    
    @Cacheable(cacheNames = "user.info", key = "#userId")
    @Override
    public UserDto getById(long userId) {
        log.info("getById, userId={}", userId);
        return new UserDto()
                .setId(userId)
                .setPhone("***")
                .setNickName("张三");
    }
    }
  3. add config cacheNames to stats spring.cache.cache-names=user.info

the cacheNames is the same as the value of the @Cacheable(cacheNames = "user.info")

spring:
  # CacheProperties
  cache:
    type: redis
    cache-names: "user.info"
    redis:
      time-to-live: 1d
      key-prefix: "user:"
      use-key-prefix: true
      enable-statistics: true
  1. enable prometheus endpoint /actuator/prometheus and add app tags management.metrics.tags.application="${spring.application.name}"

    management:
    # ManagementServerProperties
    server:
    port: 38081
    base-path: ""
    endpoints:
    # WebEndpointProperties
    web:
      exposure:
        include: "*"
    #        exclude: "heapdump,threaddump,metrics"
    #      base-path: "/actuator"
      # Customizing the Management Endpoint Paths
    #      path-mapping:
    #        "prometheus": "metrics"
    # MetricsProperties
    metrics:
    use-global-registry: true
    #    enable:
    #      heapdump: false
    #      threaddump: false
    tags:
      application: "${spring.application.name}"
    export:
      # JmxProperties
      jmx:
        enabled: true
        domain: "metrics"
        step: 1m
      # PrometheusProperties
      prometheus:
        enabled: true
    #        descriptions: true
        step: 1m
  2. See the final effect

http://localhost:38081/actuator/metrics

/actuator/metrics

{"names":["application.ready.time","application.started.time","cache.gets","cache.lock.duration","cache.puts","cache.removals","disk.free","disk.total","executor.active","executor.completed","executor.pool.core","executor.pool.max","executor.pool.size","executor.queue.remaining","executor.queued","http.server.requests","jvm.buffer.count","jvm.buffer.memory.used","jvm.buffer.total.capacity","jvm.classes.loaded","jvm.classes.unloaded","jvm.gc.live.data.size","jvm.gc.max.data.size","jvm.gc.memory.allocated","jvm.gc.memory.promoted","jvm.gc.overhead","jvm.gc.pause","jvm.memory.committed","jvm.memory.max","jvm.memory.usage.after.gc","jvm.memory.used","jvm.threads.daemon","jvm.threads.live","jvm.threads.peak","jvm.threads.states","lettuce.command.completion","lettuce.command.firstresponse","logback.events","process.cpu.usage","process.files.max","process.files.open","process.start.time","process.uptime","system.cpu.count","system.cpu.usage","system.load.average.1m","tomcat.sessions.active.current","tomcat.sessions.active.max","tomcat.sessions.alive.max","tomcat.sessions.created","tomcat.sessions.expired","tomcat.sessions.rejected"]}

http://localhost:38081/actuator/prometheus

/actuator/prometheus cache_gets_total

/actuator/prometheus cache_puts_total

# HELP cache_lock_duration_seconds The time the cache has spent waiting on a lock
# TYPE cache_lock_duration_seconds gauge
cache_lock_duration_seconds{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",} 0.0

# HELP cache_puts_total The number of entries added to the cache
# TYPE cache_puts_total counter
cache_puts_total{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",} 4.0

# HELP cache_removals_total Cache removals
# TYPE cache_removals_total counter
cache_removals_total{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",} 0.0

# HELP cache_gets_total The number of times cache lookup methods have returned a cached (hit) or uncached (newly loaded or null) value (miss).
# TYPE cache_gets_total counter
cache_gets_total{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",result="pending",} 0.0
cache_gets_total{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",result="miss",} 4.0
cache_gets_total{application="redis-spring-boot-starter-example",cache="user.info",cache_manager="cacheManager",name="user.info",result="hit",} 11.0
bert82503 commented 1 week ago

Good lucky!

Monitoring spring boot cache metrics with Prometheus - stack overflow