quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0

OOM in tests since Quarkus 3.13.0 (Part 2) #42355

Closed mschorsch closed 1 month ago

mschorsch commented 1 month ago

Describe the bug

As described in https://github.com/quarkusio/quarkus/issues/42303 we get an OOM in our tests.

Because the fix for https://github.com/quarkusio/quarkus/issues/42303 doesn't solve the problem completely, I've opened a new issue with a new reproducer.

Compared to the reproducer in https://github.com/quarkusio/quarkus/issues/42303, I have added more dependencies.

Expected behavior

No response

Actual behavior

No response

How to Reproduce?

oom-reproducer2.zip

./gradlew test --console=plain

⚠️ The reproducer uses 999-SNAPSHOT instead of 3.13.1 because 3.13.1 is not available in Maven Central yet.

Output of uname -a or ver

No response

Output of java -version

Java 21

Quarkus version or git rev

Quarkus 3.13.1

Build tool (ie. output of mvnw --version or gradlew --version)

Gradle 8.8

Additional information

No response

geoand commented 1 month ago

I'll take a look

geoand commented 1 month ago

Although I have yet to determine the actual cause, I have been able to narrow the problem down to the quarkus-micrometer extension.

geoand commented 1 month ago

I think I know what the problem is, but it will take a while before I am able to test my theory.

geoand commented 1 month ago

It seems that there is more than one issue... I'll likely continue tomorrow.

geoand commented 1 month ago

Would you be able to test #42369?

For whatever reason I could not get Gradle to use my changes...

mschorsch commented 1 month ago

I've tested https://github.com/quarkusio/quarkus/pull/42369 against our real application and still get an OOM.

Tomorrow I can try the reproducer.

geoand commented 1 month ago

Thanks for checking.

Looking forward to hearing if the OOM still happens with the reproducer

geoand commented 1 month ago

I was able to reproduce the problem in the reproducer even with my fix - so obviously there is even more to it.

geoand commented 1 month ago

I closed https://github.com/quarkusio/quarkus/pull/42369 because it does not fix the issue.

I was, however, able to pinpoint the problem to the JVM- and System-related binders. Those use JMX, which does not play nicely at all with multiple classloaders. One option here would be to disable these by default and have users explicitly opt in to them. @ebullient WDYT?
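To illustrate why JMX and multiple classloaders interact badly, here is a minimal sketch of one plausible mechanism. It is not the code from the Quarkus or Micrometer sources, and the class name GcListenerSketch is hypothetical. The platform MXBeans are JVM-wide singletons, so a listener added from code loaded by a per-test classloader stays strongly reachable (together with that classloader) until it is explicitly removed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.ListenerNotFoundException;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical illustration only; not taken from Quarkus or Micrometer.
final class GcListenerSketch implements AutoCloseable {

    // Emitters this instance has attached its listener to.
    private final List<NotificationEmitter> emitters = new ArrayList<>();

    // The listener's class is defined by whichever classloader loaded this code.
    // While a JVM-wide MXBean references it, that classloader cannot be collected.
    private final NotificationListener listener = (notification, handback) -> { };

    /** Attaches the listener to every GC MXBean that can emit notifications. */
    int bind() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                NotificationEmitter emitter = (NotificationEmitter) gc;
                emitter.addNotificationListener(listener, null, null);
                emitters.add(emitter);
            }
        }
        return emitters.size();
    }

    /** Number of emitters that currently hold a reference to the listener. */
    int boundCount() {
        return emitters.size();
    }

    /** Detaches the listener again; skipping this step is what leaks. */
    @Override
    public void close() {
        for (NotificationEmitter emitter : emitters) {
            try {
                emitter.removeNotificationListener(listener);
            } catch (ListenerNotFoundException ignored) {
                // Already removed; nothing left to clean up.
            }
        }
        emitters.clear();
    }

    public static void main(String[] args) {
        try (GcListenerSketch sketch = new GcListenerSketch()) {
            System.out.println("listeners attached: " + sketch.bind());
        }
    }
}
```

If each test run creates a fresh classloader and attaches listeners like this without ever calling close(), every run leaves one more classloader pinned in memory, which is consistent with an OOM that only appears after many tests.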

cc @gsmet who was also interested in the outcome of this.

geoand commented 1 month ago

@mschorsch my comment above means that you can get the reproducer working by setting:

quarkus.micrometer.binder.jvm=false
quarkus.micrometer.binder.system=false

gsmet commented 1 month ago

Ah, that's funny, I actually identified them as problematic here: https://github.com/quarkusio/quarkus/issues/41233

Is it what you're seeing?

I really think we should try to tackle this issue and make this an error so that we can catch further issues.

geoand commented 1 month ago

Is it what you're seeing?

All JMX-related Micrometer binders are causing the same issue (see https://github.com/quarkusio/quarkus/pull/42388).

I really think we should try to tackle this issue

What do you mean exactly? I don't follow.

I really think we should try to tackle this issue and make this an error so that we can catch further issues.

Even if that is the case, it won't be done by me as I'm going to be off for a couple weeks :). Furthermore, in this specific case, I am pretty sure there is nothing we can do (with one caveat that I need to look into).

geoand commented 1 month ago

Turns out that this can be fixed and rather easily...

https://github.com/quarkusio/quarkus/pull/42388 fixes the reproducer.

mschorsch commented 1 month ago

@mschorsch my comment above means that you can get the reproducer working by setting:

quarkus.micrometer.binder.jvm=false
quarkus.micrometer.binder.system=false

I can confirm that this does indeed prevent the OOM in the reproducer 👍 .

In our real application, however, we still get an OOM even though I have deactivated the micrometer extension 😬 .

quarkus:
  micrometer:
    enabled: false
    binder:
      jvm: false
      system: false

Even the complete removal of io.quarkus:quarkus-micrometer-registry-prometheus did not help.

I will test https://github.com/quarkusio/quarkus/pull/42388, but I suspect that there are other issues that are not covered by the reproducer.

geoand commented 1 month ago

I will test https://github.com/quarkusio/quarkus/pull/42388, but I suspect that there are other issues that are not covered by the reproducer.

If you do find more OOMs, please open new issues with the relevant reproducers. Thanks!

mschorsch commented 1 month ago

I can confirm that https://github.com/quarkusio/quarkus/pull/42388 fixes the issue in the reproducer 👍 .

In our real application we still get an OOM, so it seems a part 3 is needed...

Thanks for your work @geoand @gsmet

geoand commented 1 month ago

🙏🏼

ebullient commented 1 month ago

I closed #42369 because it does not fix the issue.

I was, however, able to pinpoint the problem to the JVM- and System-related binders. Those use JMX, which does not play nicely at all with multiple classloaders. One option here would be to disable these by default and have users explicitly opt in to them. @ebullient WDYT?

cc @gsmet who was also interested in the outcome of this.

I know you're chasing other things, but I think not having this by default would be a surprise.

geoand commented 1 month ago

In any case, that proposal is no longer necessary :)