Coreoz / Plume

The lightweight modular Java project environment
Apache License 2.0

Provide more health metrics #33

Open amanteaux opened 6 months ago

amanteaux commented 6 months ago

We should add metrics on:

These metrics must be easy to retrieve:

Work should restart from the more-metrics branch: https://github.com/Coreoz/Plume/tree/more-metrics

Wail-LT commented 6 months ago

Hello @amanteaux,

We could add Micrometer to the Plume stack. That would let us send these metrics to any kind of external monitoring system (Datadog, Prometheus, OTL...).

What do you think?

Wail-LT commented 1 month ago

Hello @amanteaux,

I found the time to make progress on this. The changes were made on the more-metrics branch, see commit

To validate the changes, I tested the library on the checkout application ("la caisse"). Here is the new log, containing all the metrics exposed by the library:

NEW App health
 HikariPool-1.pool.ActiveConnections=0
 HikariPool-1.pool.IdleConnections=1
 HikariPool-1.pool.MaxConnections=4
 HikariPool-1.pool.MinConnections=1
 HikariPool-1.pool.PendingConnections=0
 HikariPool-1.pool.TotalConnections=1
 http-pool.current-size=0
 http-pool.max-size=50000
 http-pool.usage=0.0
 http-pool.usage-size=0
 http-pool.waiting-size=0
 memory-usage.heap.committed=318767104
 memory-usage.heap.init=536870912
 memory-usage.heap.max=8589934592
 memory-usage.heap.usage=0.018537893891334534
 memory-usage.heap.used=159239296
 memory-usage.non-heap.committed=75431936
 memory-usage.non-heap.init=7667712
 memory-usage.non-heap.max=-1
 memory-usage.non-heap.usage=-7.3629896E7
 memory-usage.non-heap.used=73630728
 memory-usage.pools.CodeHeap-'non-nmethods'.committed=2555904
 memory-usage.pools.CodeHeap-'non-nmethods'.init=2555904
 memory-usage.pools.CodeHeap-'non-nmethods'.max=5840896
 memory-usage.pools.CodeHeap-'non-nmethods'.usage=0.24406118513323982
 memory-usage.pools.CodeHeap-'non-nmethods'.used=1425536
 memory-usage.pools.CodeHeap-'non-profiled-nmethods'.committed=3735552
 memory-usage.pools.CodeHeap-'non-profiled-nmethods'.init=2555904
 memory-usage.pools.CodeHeap-'non-profiled-nmethods'.max=122908672
 memory-usage.pools.CodeHeap-'non-profiled-nmethods'.usage=0.030097143999733397
 memory-usage.pools.CodeHeap-'non-profiled-nmethods'.used=3699200
 memory-usage.pools.CodeHeap-'profiled-nmethods'.committed=13434880
 memory-usage.pools.CodeHeap-'profiled-nmethods'.init=2555904
 memory-usage.pools.CodeHeap-'profiled-nmethods'.max=122908672
 memory-usage.pools.CodeHeap-'profiled-nmethods'.usage=0.10883502182824008
 memory-usage.pools.CodeHeap-'profiled-nmethods'.used=13376768
 memory-usage.pools.Compressed-Class-Space.committed=5963776
 memory-usage.pools.Compressed-Class-Space.init=0
 memory-usage.pools.Compressed-Class-Space.max=1073741824
 memory-usage.pools.Compressed-Class-Space.usage=0.005373515188694
 memory-usage.pools.Compressed-Class-Space.used=5769768
 memory-usage.pools.G1-Eden-Space.committed=192937984
 memory-usage.pools.G1-Eden-Space.init=29360128
 memory-usage.pools.G1-Eden-Space.max=-1
 memory-usage.pools.G1-Eden-Space.usage=0.6304347826086957
 memory-usage.pools.G1-Eden-Space.used=121634816
 memory-usage.pools.G1-Eden-Space.used-after-gc=0
 memory-usage.pools.G1-Old-Gen.committed=117440512
 memory-usage.pools.G1-Old-Gen.init=507510784
 memory-usage.pools.G1-Old-Gen.max=8589934592
 memory-usage.pools.G1-Old-Gen.usage=0.003732919692993164
 memory-usage.pools.G1-Old-Gen.used=32065536
 memory-usage.pools.G1-Old-Gen.used-after-gc=20754432
 memory-usage.pools.G1-Survivor-Space.committed=8388608
 memory-usage.pools.G1-Survivor-Space.init=0
 memory-usage.pools.G1-Survivor-Space.max=-1
 memory-usage.pools.G1-Survivor-Space.usage=0.6602935791015625
 memory-usage.pools.G1-Survivor-Space.used=5538944
 memory-usage.pools.G1-Survivor-Space.used-after-gc=5538944
 memory-usage.pools.Metaspace.committed=49741824
 memory-usage.pools.Metaspace.init=0
 memory-usage.pools.Metaspace.max=-1
 memory-usage.pools.Metaspace.usage=0.9923206676136364
 memory-usage.pools.Metaspace.used=49359840
 memory-usage.total.committed=394199040
 memory-usage.total.init=544538624
 memory-usage.total.max=8589934591
 memory-usage.total.used=232871560
 thread-states.blocked.count=0
 thread-states.count=56
 thread-states.daemon.count=25
 thread-states.deadlock.count=0
 thread-states.deadlocks=[]
 thread-states.new.count=0
 thread-states.runnable.count=19
 thread-states.terminated.count=0
 thread-states.timed_waiting.count=11
 thread-states.waiting.count=27
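
One value in the log above deserves a second look: memory-usage.non-heap.usage is a large negative number while memory-usage.non-heap.max is -1. The JDK reports -1 when a maximum is undefined, so this looks consistent with a ratio computed as used / max (the small difference with non-heap.used would then come from the two gauges being sampled at slightly different instants). The sketch below only illustrates that arithmetic with the JDK's MemoryUsage class. The names UsageRatio, naiveUsage and safeUsage are hypothetical, they are not part of Plume or of the metrics library, and falling back to the committed size is just one possible convention.

```java
import java.lang.management.MemoryUsage;

public class UsageRatio {
    /** Ratio computed blindly as used / max: negative when max is -1 (undefined). */
    public static double naiveUsage(MemoryUsage usage) {
        return (double) usage.getUsed() / usage.getMax();
    }

    /** Same ratio, but uses the committed size as denominator when max is undefined. */
    public static double safeUsage(MemoryUsage usage) {
        long denominator = usage.getMax() == -1 ? usage.getCommitted() : usage.getMax();
        // Guard against a zero denominator (for example a pool with nothing committed yet)
        return denominator <= 0 ? 0.0 : (double) usage.getUsed() / denominator;
    }

    public static void main(String[] args) {
        // Non-heap figures copied from the log: init, used, committed, max
        MemoryUsage nonHeap = new MemoryUsage(7667712L, 73630728L, 75431936L, -1L);
        System.out.println("naive = " + naiveUsage(nonHeap));
        System.out.println("safe  = " + safeUsage(nonHeap));
    }
}
```

If the negative value turns out to be a problem for dashboards or alerts, a guard of this kind could be applied wherever the ratio is consumed.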

Implementation:

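// Note: Lombok's default logger field is "log"; using "logger" below assumes
// lombok.config sets lombok.log.fieldName = logger
@Slf4j
@Singleton
public class ApplicationHealthService {
    // Registry holding every gauge registered in the constructor
    private final MetricRegistry metricRegistry;

    @Inject
    public ApplicationHealthService(
        GrizzlyThreadPoolProbe grizzlyThreadPoolProbe,
        DataSource dataSource
    ) {
        // The cast assumes the injected DataSource is backed by HikariCP
        this.metricRegistry = new MetricsCheckBuilder()
            .registerJvmMetrics()
            .registerHikariMetrics((HikariDataSource) dataSource)
            .registerGrizzlyMetrics(grizzlyThreadPoolProbe)
            .build();
    }

    /** Logs the current value of every registered gauge as key=value pairs. */
    public void logApplicationHealth() {
        // getGauges() returns the gauges sorted by metric name
        SortedMap<String, Gauge> gaugesMetrics = metricRegistry.getGauges();

        logger.info(
            "NEW App health {}",
            gaugesMetrics
                .entrySet()
                .stream()
                // Read each gauge at log time and wrap it as a structured key=value argument
                .map(entry -> StructuredArguments.keyValue(entry.getKey(), entry.getValue().getValue()))
                .toList()
        );
    }
}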
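
For readers who want to try the idea without the branch, here is a minimal JDK-only sketch of the same pattern: gauges stored by name in a sorted map, read lazily, and rendered as key=value lines. It does not use MetricRegistry or MetricsCheckBuilder. The class HealthSnapshot and its methods are hypothetical, and the gauge names merely mirror a few entries of the log for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class HealthSnapshot {
    /** Builds a few JVM gauges; each value is only read when the supplier is called. */
    public static SortedMap<String, Supplier<Object>> jvmGauges() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        SortedMap<String, Supplier<Object>> gauges = new TreeMap<>();
        gauges.put("memory-usage.heap.used", () -> memory.getHeapMemoryUsage().getUsed());
        gauges.put("memory-usage.heap.max", () -> memory.getHeapMemoryUsage().getMax());
        gauges.put("thread-states.count", () -> threads.getThreadCount());
        gauges.put("thread-states.daemon.count", () -> threads.getDaemonThreadCount());
        return gauges;
    }

    /** Renders a title followed by one " key=value" line per gauge, in key order. */
    public static String format(String title, SortedMap<String, Supplier<Object>> gauges) {
        return gauges.entrySet().stream()
            .map(entry -> " " + entry.getKey() + "=" + entry.getValue().get())
            .collect(Collectors.joining("\n", title + "\n", ""));
    }

    public static void main(String[] args) {
        System.out.println(format("App health", jvmGauges()));
    }
}
```

Keeping the values behind suppliers means every call to format reads fresh numbers, which is the same reason the implementation above reads each gauge at log time.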

What do you think? If you want to discuss it, feel free to ping me.