/cc @brunobat (micrometer), @ebullient (micrometer), @geoand (kubernetes), @iocanel (kubernetes), @radcortez (config)
https://github.com/quarkusio/quarkus/issues/44220 was closed as a duplicate, but our application cannot be built into a native executable at all, while what this issue says is: The purpose of this issue is to bring awareness of this increase and keep track of it.
Are you sure this is an acceptable solution? If I can no longer build the application even on my very powerful workstation, is closing it as a duplicate the appropriate action?
From what I can see you have an Xmx build limit:
- 4.44GB of memory (14.3% of 31.10GB system memory, set via '-Xmx5g')
Try to raise that and see how it goes. Not saying it's not an issue that needs fixing or at least diagnosing.
@zakkak do you have any idea how we could understand what's going on?
From what I can see you have an Xmx build limit:
- 4.44GB of memory (14.3% of 31.10GB system memory, set via '-Xmx5g')
Try to raise that and see how it goes.
Setting it to 7g did the trick. I'll experiment with it to see if I can set it even lower. Thank you.
Setting quarkus.native.native-image-xmx to 6g fails with OOM.
@zakkak do you have any idea how we could understand what's going on?
One would need to attach a Java profiler to native-image
(it's a Java application) and see what's causing the additional allocations. Note that this might not be trivial though, as the increased number of allocations could be the result of some new call-paths being explored by the analysis.
I wonder if we were close to the limit, and the change tipped it over. Mappings do add an additional class per root and group.
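For anyone not familiar with the terminology, here is a purely illustrative sketch (hypothetical prefix and names) of a mapping root with one nested group; with @ConfigMapping, the root interface and each group interface get a generated implementation class, which is the "additional class per root and group" mentioned above:

import io.smallrye.config.ConfigMapping;

// Illustrative only (hypothetical names): one mapping root containing one nested group.
@ConfigMapping(prefix = "quarkus.example")   // root: one generated implementation class
public interface ExampleConfig {

    Container container();                   // group: another generated implementation class

    interface Container {
        int port();
    }
}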
I may have misunderstood this, but in the case of the QE TS many-extensions module, it used to work with 5g and now 6g is not enough, so "close" is a fuzzy term; you may still be right. BTW we had several other modules affected, I just had to raise the limit from 4g to 5g for a quite simple Funqy serverless test in native mode.
Are all the modules using the kubernetes extension?
Yes, all of them use the kubernetes or openshift extensions.
An experiment on my MBP M3 with xmx set to 8g: Quarkus 3.16, 3.15, and 3.14 can handle the "many extensions" module from https://github.com/quarkus-qe/beefy-scenarios with a peak RSS of ~5GB. Quarkus main (https://github.com/quarkusio/quarkus/commit/7a6b0eea9b8) uses all available memory and spends a massive amount of time on GC activity.
sdk use java 23.1.5.r21-mandrel
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=999-SNAPSHOT -Dquarkus-plugin.version=999-SNAPSHOT | grep -e "Peak RSS" -e "Finished generating"
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=3.16.2 -Dquarkus-plugin.version=3.16.2 | grep -e "Peak RSS" -e "Finished generating"
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=3.15.1 -Dquarkus-plugin.version=3.15.1 | grep -e "Peak RSS" -e "Finished generating"
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=3.14.4 -Dquarkus-plugin.version=3.14.4 | grep -e "Peak RSS" -e "Finished generating"
Using java version 23.1.5.r21-mandrel in this shell.
48.2s (45.7% of total time) in 279 GCs | Peak RSS: 7.47GB | CPU load: 8.25
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 1m 45s.
5.2s (8.9% of total time) in 499 GCs | Peak RSS: 5.30GB | CPU load: 7.84
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 58.7s.
6.1s (10.1% of total time) in 936 GCs | Peak RSS: 4.93GB | CPU load: 7.81
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 1m 0s.
5.7s (9.8% of total time) in 742 GCs | Peak RSS: 4.88GB | CPU load: 7.89
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 58.2s.
@maxandersen @cescoffier, could you please assign somebody to investigate in more detail what causes such memory demand when building a native image?
Set of defined extensions: https://github.com/quarkus-qe/beefy-scenarios/blob/main/003-quarkus-many-extensions/pom.xml
Side note: I also experimented with different versions of Mandrel (23.1.4.r21-mandrel, 23.1.5.r21-mandrel, 24.0.2.r22-mandrel, 24.1.1.r23-mandrel). I noticed a difference between 23.1.4.r21-mandrel and 23.1.5.r21-mandrel, where the .4 version needed more memory and more GC cycles.
The thing to investigate more deeply is the kubernetes extension, as hinted in the comments above.
Peak RSS drops back to the ~5 GB level with 999-SNAPSHOT when the kubernetes extension is removed from 003-quarkus-many-extensions/pom.xml
sdk use java 23.1.5.r21-mandrel
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=999-SNAPSHOT -Dquarkus-plugin.version=999-SNAPSHOT | grep -e "Peak RSS" -e "Finished generating"
mvn clean package -DskipTests -Dnative -Dquarkus.native.native-image-xmx=8g -f '003-quarkus-many-extensions' -Dquarkus.platform.version=3.16.2 -Dquarkus-plugin.version=3.16.2 | grep -e "Peak RSS" -e "Finished generating"
Using java version 23.1.5.r21-mandrel in this shell.
7.8s (10.9% of total time) in 1372 GCs | Peak RSS: 4.95GB | CPU load: 7.88
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 1m 11s.
7.0s (10.3% of total time) in 1264 GCs | Peak RSS: 4.48GB | CPU load: 7.80
Finished generating '003-quarkus-many-extensions-1.0.0-SNAPSHOT-runner' in 1m 7s.
I think we will need some help from the Mandrel team here @zakkak @galderz @Karm, as something in the new code seems to trigger unwanted behavior in Mandrel/GraalVM, and some Mandrel/GraalVM knowledge will be needed to figure out what's going on.
Something to consider is that we never noticed such behavior with other extensions using mappings, and we have already used them for some time. So, it may be something only related to the Kubernetes configuration.
I'm going to try to investigate, but I agree with @gsmet that it would be helpful if someone from Mandrel also helped.
@radcortez I am happy to work with you on this. I have it on my TODO list anyway, but need to find some free time for it. Let's sync on Zulip. Note: anyone else willing to jump in is very much welcome on my side :)
Sounds good :)
@rsvoboda what is https://github.com/quarkus-qe/beefy-scenarios and what is it used for? I wasn't aware of it.
https://github.com/quarkus-qe/beefy-scenarios is test coverage that we built during the RHBQ 1.x times (together with https://github.com/quarkus-qe/quarkus-openshift-test-suite). With RHBQ 2+ we moved to https://github.com/quarkus-qe/quarkus-test-suite (with https://github.com/quarkus-qe/quarkus-test-framework/), which helped us simplify test development for both bare-metal and OpenShift target environments.
https://github.com/quarkus-qe/beefy-scenarios is kept around for continuous community testing with Quarkus main; it's also part of the ecosystem CI / https://status.quarkus.io.
Ok, I think I've got something:
First, one thing to consider is that changing the config classes is not an issue directly. In fact, in this particular case, the config classes are available only at build time and are not even available during the native image build. So, what is the issue?
In our generated startup Config, we match the configuration to determine if it is build time or runtime, to warn about possible issues. As you know, build-time configuration may be present in application.properties, but we have to ignore it. If we don't have the config classes because they are in the deployment module, we need to do it another way.
The old config implementation would generate a method similar to this, which either ignores keys or populates the config object:
private static void siParseKey_quarkus_test_container/* $FF was: siParseKey:quarkus:test:container*/(SmallRyeConfig var0, NameIterator var1) {
    if (!var1.hasNext()) {
        String var2 = var1.getName();
        ((HashSet)unknown).add(var2);
    } else if (!var1.nextSegmentEquals("additional-exposed-ports")) {
        if (!var1.nextSegmentEquals("labels")) {
            if (!var1.nextSegmentEquals("network")) {
                if (!var1.nextSegmentEquals("volume-mounts")) {
                    String var3 = var1.getName();
                    ((HashSet)unknown).add(var3);
                } else {
                    var1.next();
                    siParseKey:quarkus:test:container:volume-mounts(var0, var1);
                }
            } else {
                var1.next();
                emptyParseKey(var0, var1);
            }
        } else {
            var1.next();
            siParseKey:quarkus:test:container:labels(var0, var1);
        }
    } else {
        var1.next();
        siParseKey:quarkus:test:container:additional-exposed-ports(var0, var1);
    }
}
The new implementation instead adds the names to a KeyMap to do the matching, because it is SmallRyeConfig that populates the object; we only need to know whether a build-time property is matched or not, and ignore it:
KeyMap var0 = new KeyMap();
var0.findOrAdd("quarkus.knative.rbac.roles.*.policy-rules.*.resource-names").putRootValue(true);
var0.findOrAdd("quarkus.knative.init-containers.*.env-vars.*.value").putRootValue(true);
var0.findOrAdd("quarkus.rest-client.proxy-address").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.init-containers.*.mounts.*.name").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.app-secret").putRootValue(true);
var0.findOrAdd("quarkus.kubernetes.prometheus.port").putRootValue(true);
...
So, when config is moved from the old implementation to the new one, the population method is deleted, and new entries are added to the KeyMap.
I was already aware that KeyMap had some runtime performance issues. Still, those were not very noticeable because the number of entries was low. With the continuous move to the new implementation, the entries started to grow. Apparently, this also greatly impacts the memory of the image build (which I was not aware of), and due to the large number of Kubernetes config entries it became very noticeable.
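For reference, the lookup against that KeyMap is a segment-by-segment, wildcard-aware match. Here is a minimal sketch of how it is presumably consulted (assuming SmallRye Config's KeyMap findOrAdd/findRootValue API; this is not the actual generated code):

import io.smallrye.config.KeyMap;

// Sketch only (assumed KeyMap API): build-time names are registered with '*'
// wildcards and a concrete key is matched segment by segment against the tree.
class KeyMapLookupSketch {
    public static void main(String[] args) {
        KeyMap<Boolean> buildTimeNames = new KeyMap<>();
        buildTimeNames.findOrAdd("quarkus.kubernetes.app-secret").putRootValue(true);
        buildTimeNames.findOrAdd("quarkus.kubernetes.init-containers.*.mounts.*.name").putRootValue(true);

        // A concrete key resolves through the wildcard entries and can then be ignored.
        Boolean known = buildTimeNames.findRootValue("quarkus.kubernetes.init-containers.sidecar.mounts.data.name");
        System.out.println(known); // expected: true
    }
}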
I already had a more lightweight replacement implemented in https://github.com/smallrye/smallrye-config/blob/3edebd4b38f85aaaf3c1872845a2add89b6374ad/implementation/src/main/java/io/smallrye/config/PropertyName.java, and replaced the KeyMap match in Quarkus with https://github.com/quarkusio/quarkus/pull/44448.
I was able to observe the memory increase by building the native image from #44216, and I do see the memory decrease with #44448:
Current:
17.1s (13.0% of total time) in 178 GCs | Peak RSS: 8.05GB | CPU load: 10.12
With #44448:
9.3s (8.6% of total time) in 156 GCs | Peak RSS: 6.62GB | CPU load: 10.20
@zakkak @michalvavrik, can you try that and see if you notice any improvement? Thanks!
Very nice @radcortez!
Great work @radcortez!
I confirm that the memory usage has decreased significantly
Excellent.
I was now trying to compare with before (https://github.com/quarkusio/quarkus/issues/44216) to see if we are back at the same level or if there is still some degradation. I do get some minor differences, but maybe that is only noise.
BTW, for additional reference, the implementation of https://github.com/quarkusio/quarkus/pull/44448 is something like:
private static Set generateMappedProperties() {
    HashSet var0 = new HashSet();
    PropertyName var1 = new PropertyName("quarkus.config-tracking.file");
    var0.add(var1);
    PropertyName var2 = new PropertyName("quarkus.tls.*.trust-store.pem.certs");
    var0.add(var2);
    PropertyName var3 = new PropertyName("quarkus.log.socket.filter");
    var0.add(var3);
    PropertyName var4 = new PropertyName("quarkus.vertx.resolver.rotate-servers");
    var0.add(var4);
    PropertyName var5 = new PropertyName("quarkus.vertx.cluster.public-port");
    var0.add(var5);
    PropertyName var6 = new PropertyName("quarkus.log.file.async");
    var0.add(var6);
    PropertyName var7 = new PropertyName("quarkus.log.handler.syslog.*.syslog-type");
    ...
The check just calls a contains on the Set to match the name.
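To illustrate, here is a small sketch under the assumption (consistent with the description above) that PropertyName's equals/hashCode let a '*' segment match any concrete segment, so a plain Set.contains finds the wildcard entry:

import io.smallrye.config.PropertyName;
import java.util.HashSet;
import java.util.Set;

// Sketch only: the generated Set is queried with a single contains call;
// wildcard entries are assumed to match concrete segments via PropertyName equality.
class PropertyNameLookupSketch {
    public static void main(String[] args) {
        Set<PropertyName> mappedNames = new HashSet<>();
        mappedNames.add(new PropertyName("quarkus.tls.*.trust-store.pem.certs"));
        mappedNames.add(new PropertyName("quarkus.log.file.async"));

        // Expected to print true: the concrete key matches the wildcard entry.
        System.out.println(mappedNames.contains(new PropertyName("quarkus.tls.my-client.trust-store.pem.certs")));
    }
}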
And is the Set stored in the image heap for the native image, or do you generate it even for the native image?
It is stored, because we need it for both phases.
I'm still unsure if this is the best approach (I'm still playing with it), because after we start and validate we don't need it anymore.
Just for the record, Peak RSS is at the same level as 3.16.2 with the https://github.com/quarkusio/quarkus/pull/44448 PR for the case mentioned in https://github.com/quarkusio/quarkus/issues/44216#issuecomment-2469879390. @michalvavrik no need to do additional verification.
Description
Since https://github.com/quarkusio/quarkus/pull/43862 the build-time RSS usage of native-image builds has increased in the micrometer-prometheus integration test (note that I haven't observed similar changes in other tests). The change didn't have any negative effect on the native executable size or the reachable types/methods; on the contrary, it decreased the total image size by ~400KB.
The purpose of this issue is to bring awareness of this increase and keep track of it.
Note from @gsmet: here is another report, posted by @mocenas, that comes with a reproducer: https://github.com/quarkusio/quarkus/issues/44436
Compiling a Quarkus app into a native executable using a Docker image requires a lot more memory since Quarkus started moving to @ConfigMapping. We have an example where our testing app used to work with xmx=5g for native compilation a month ago, but since https://github.com/quarkusio/quarkus/pull/43862 and https://github.com/quarkusio/quarkus/pull/44140 it requires at least 9g to compile successfully (otherwise compilation fails with OutOfMemory).
Nearly doubling the memory requirement just by moving to @ConfigMapping is alarming. At this moment it is somewhat manageable, but with more migrations planned in https://github.com/quarkusio/quarkus/pull/44079 it is hard to say how much memory it will require in the end.