Closed: shiyanhui closed this issue 2 years ago
This sort of behavior is typically due to 'tag cardinality explosion'. See #3038
Can you verify that you don't have metrics with too many tags?
Yeah, we do have metrics with multiple tags. It seems we likely found the root cause. We will fix it and see whether things go back to normal. Thanks!
Another question: it seems this problem only appears when we use Spring WebFlux, not when we use Spring MVC. So is there any way to make Micrometer not use reactor or reactor-netty?
As @checketts mentioned above, this is probably caused by using high cardinality tags; the jmap output seems to agree with this. See the Tag and Meter.Id counts: you have almost 6 million tags (Tag is part of the Id). Using multiple tags is not an issue; using a tag that has high cardinality is.
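For context, here is a minimal sketch of the anti-pattern being described (the meter name and the request.id tag key are made-up examples, not something from this issue): every distinct tag value produces a new Meter.Id, so an unbounded tag value grows the registry without bound.

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CardinalityExample {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Anti-pattern: tagging with an unbounded value (here a fake per-request id).
        // Each distinct tag value creates a new Meter.Id, so the registry keeps growing.
        for (int i = 0; i < 100_000; i++) {
            Counter.builder("http.client.requests")
                   .tag("request.id", String.valueOf(i)) // unbounded cardinality
                   .register(registry)
                   .increment();
        }

        // Roughly 100,000 meters end up registered, each holding its own Id and Tags.
        System.out.println("Meters registered: " + registry.getMeters().size());
    }
}
```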
It seems we likely found the root cause.
Can I ask what it was?
it seems this problem only appears when we use Spring WebFlux, not when we use Spring MVC.
This is pretty weird; are you sure you don't attach high cardinality tags in one case but do in the other?
So is there any way to make Micrometer not use reactor or reactor-netty?
I'm not sure I understand how this would help. Micrometer only uses them for its StatsD registry and they are shaded, so theoretically you should not notice them. Can you try using Datadog through the DatadogMeterRegistry? That one does not use reactor or reactor-netty.
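For reference, a minimal sketch of what that switch could look like, assuming the micrometer-registry-datadog dependency is on the classpath and the API key is supplied via an environment variable (both assumptions, not something stated in this issue):

```java
import io.micrometer.core.instrument.Clock;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.datadog.DatadogConfig;
import io.micrometer.datadog.DatadogMeterRegistry;

public class DatadogRegistryExample {
    public static void main(String[] args) {
        DatadogConfig config = new DatadogConfig() {
            @Override
            public String apiKey() {
                return System.getenv("DD_API_KEY"); // assumption: key provided via env var
            }

            @Override
            public String get(String key) {
                return null; // fall back to the defaults for every other property
            }
        };

        // Ships metrics to the Datadog HTTP API directly; no StatsD, reactor, or reactor-netty involved.
        MeterRegistry registry = new DatadogMeterRegistry(config, Clock.SYSTEM);
        registry.counter("example.counter").increment();
    }
}
```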
We are using an old version of Micrometer (1.5.14), so is it a known issue that has been fixed in later versions?
I can't think of a specific issue, but there have been various changes in the statsd module since that version, so trying with the latest version available to see if the issue remains is a good troubleshooting step.
Problem solved, just wanted to sync the result here. It was caused by high cardinality tags in reactor-netty. So this issue can be closed now, thanks!
@shiyanhui Were those tags attached by you or by reactor/reactor-netty/netty?
@jonatan-ivanov Apparently only reactor-netty adds them, but I don't know how he solved it
Me neither, but I would report that issue to reactor-netty. Also, you can always remove, modify, or add tags using a MeterFilter/ObservationFilter.
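As a rough sketch of that suggestion (the tag keys below are placeholders; substitute whichever tag is actually exploding in your registry):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class TagFilterExample {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Drop the offending tag entirely ("remote.address" is a placeholder key).
        registry.config().meterFilter(MeterFilter.ignoreTags("remote.address"));

        // Or collapse its values into a bounded set instead of dropping it.
        registry.config().meterFilter(
                MeterFilter.replaceTagValues("uri",
                        value -> value.startsWith("/users/") ? "/users/{id}" : value));
    }
}
```

In a Spring Boot app, exposing such a MeterFilter as a bean should apply it to the auto-configured registry instead of calling registry.config() yourself.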
@jonatan-ivanov Is it Netty that generates the high cardinality? Would removing the library from the pom be enough?
@jonatan-ivanov I removed these tags, but the memory consumption is still very high.
I think Netty is not instrumented (so it can't produce any high cardinality data), and you also said it's reactor-netty that does. :) Did you check it, or did you just read this comment? If it produces data, it means it is used, so if you remove it from your dependencies your app might break, but I don't know how reactor-netty is used in your app.
I'm not sure what you expect from the config changes you made above. They are not tags and have nothing to do with Netty or reactor-netty. I would report the issue to reactor-netty and in the meantime use a MeterFilter/ObservationFilter as I suggested above.
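If removing the tags outright is not desirable, another option is to cap how many distinct values a tag may take. This is a sketch under the assumption that the exploding meters are the reactor.netty.* ones and that remote.address is the offending tag key; adjust both to what you actually see in your registry.

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CardinalityCapExample {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Accept at most 100 distinct values for the tag; once the limit is reached,
        // meters carrying new values are denied instead of accumulating forever.
        registry.config().meterFilter(
                MeterFilter.maximumAllowableTags("reactor.netty", "remote.address", 100, MeterFilter.deny()));
    }
}
```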
Describe the bug
Hi, recently we observed that our service performs major GC collections frequently after running for days, and we found that it may be related to a memory leak in Micrometer.
Here are three screenshots: the first is the Datadog JVM heap runtime metrics, the second is the Datadog thrown-exceptions profiling, and the third is the heap histogram taken with jmap -histo:live. All the thrown exceptions shown in the second screenshot point to ResourceLeakDetector. We traced the code and suspect that some of the buffer objects requested here in NettyOutbound.java may not be released.
We are using an old version of Micrometer (1.5.14), so is it a known issue that has been fixed in later versions? If not, could you please give some advice? Thanks.
Environment
To Reproduce
The Old Gen size grows slowly, so it takes days to reproduce this on our k8s cluster. We haven't built a minimal app to reproduce it locally.
Expected behavior
Make sure that there is no memory leak in Micrometer.