eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Xtune:virtualized and forceAOT configuration discussion #10960

Open dsouzai opened 4 years ago

dsouzai commented 4 years ago

I'm opening an issue to discuss in more detail some ideas briefly talked about offline.

-Xtune:virtualized

  1. Increase default SCC Size: -Xtune:virtualized is a mode in which the JIT generates a lot of AOT code in order to provide very fast rampup (minimizing the time between application startup and steady state) and low CPU utilization. To get this benefit, the user would very likely have to increase the size of the shared classes cache (SCC), depending on the application. To improve usability, it may be better to increase the SCC size by default when -Xtune:virtualized is specified.

  2. Enable SVM during startup: The Symbol Validation Manager (SVM) is currently disabled during startup because of potential startup regressions compared to running without it. However, recent investigation shows that this is mainly because the bodies generated with SVM enabled are larger (there is more inlining and more optimized code), so fewer bodies fit in the SCC and fewer methods are AOT loaded during startup. Increasing the SCC size removes this regression. So, if the SCC size is to be increased under -Xtune:virtualized, we might as well enable SVM during startup too.

forceAOT

  1. Increase SCC Size: If it is desired to run forceAOT by default, then perhaps the default SCC size should be increased. Depending on the application being run, the default SCC can become full during startup even without forceAOT, which would make enabling forceAOT moot.

Tagging @mpirvu @vijaysun-omr for their thoughts.

pshipton commented 4 years ago

The default shared cache is already 64MB; you think it needs to be even larger? The actual default cache is 300MB, but the soft limit is 64MB.

@hangshao0 fyi

dsouzai commented 4 years ago

When I run DT7 Liberty startup (which specifies an 80M cache) it gets full; ~30M AOT code, the rest being classes+data.

The actual default cache is 300MB, but the soft limit is 64MB.

Does that mean that once the 64MB limit is hit, we'll keep increasing? Or does the user have to rerun the program with a larger cache size?

mpirvu commented 4 years ago

Or does the user have to rerun the program with a larger cache size?

The user can increase the soft limit up to the hard limit without destroying the existing SCC, like in the past. However, this resizing needs manual intervention as far as I know.

dsouzai commented 4 years ago

I see, then I guess the discussion should involve changing the default initial soft limit in these modes.

hangshao0 commented 4 years ago

The SCC soft maximum (softmx) limit can be changed with -Xshareclasses:adjustsoftmx=<size> (https://www.eclipse.org/openj9/docs/xshareclasses/#adjustsoftmx-cache-utility)
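For illustration, here is a minimal sketch of what raising the soft limit on an existing cache might look like. The cache name and the new size are hypothetical, and the sketch assumes adjustsoftmx accepts the same size suffixes as other OpenJ9 size options. The script only prints the commands, since they are meaningful only on an OpenJ9 JVM.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own cache name and size.
CACHE_NAME="demo_cache"
NEW_SOFTMX="128m"

# Cache utility: adjusts the soft maximum of the named cache without destroying it.
ADJUST_OPTS="-Xshareclasses:name=${CACHE_NAME},adjustsoftmx=${NEW_SOFTMX}"

# Cache utility: prints statistics for the named cache, useful for checking how full it is.
STATS_OPTS="-Xshareclasses:name=${CACHE_NAME},printStats"

# Print the commands instead of running them; they require an OpenJ9 JVM.
echo "java ${ADJUST_OPTS}"
echo "java ${STATS_OPTS}"
```

The new soft maximum cannot exceed the hard limit of the cache, so this only helps when the cache was created with headroom above its current soft limit.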

mpirvu commented 4 years ago

To add some context on why we may want to enable forceAOT: the main idea behind enabling -Xaot:forceAOT by default is that this option is unsupported (like any -Xjit option). When we want forceAOT behavior, we typically recommend -Xtune:virtualized to customers, which enables forceAOT under the covers and is supported. The downside of -Xtune:virtualized is subdued recompilation (in order to save CPU). If the application has many cores to work with and can absorb the compilation costs, forceAOT may be a better solution than -Xtune:virtualized.
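As a rough sketch, the two configurations being compared might be launched as follows. The cache name, the cache size, and app.jar are hypothetical placeholders. -Xscmx is used here on the assumption that it sets the cache size for a new cache (or the soft maximum when -XX:SharedCacheHardLimit is also given). The script prints the commands rather than running them, because the options are OpenJ9-specific.

```shell
#!/bin/sh
# Hypothetical values for illustration only.
CACHE_NAME="demo_cache"
CACHE_SIZE="128m"
APP_JAR="app.jar"

# Shared options: a named shared classes cache, larger than the 64MB default soft limit.
SCC_OPTS="-Xshareclasses:name=${CACHE_NAME} -Xscmx${CACHE_SIZE}"

# Supported mode: enables forceAOT under the covers, with subdued recompilation.
VIRTUALIZED_CMD="java ${SCC_OPTS} -Xtune:virtualized -jar ${APP_JAR}"

# Unsupported option: forceAOT alone, leaving recompilation behavior unchanged.
FORCEAOT_CMD="java ${SCC_OPTS} -Xaot:forceAOT -jar ${APP_JAR}"

# Print the commands instead of running them; they require an OpenJ9 JVM.
echo "$VIRTUALIZED_CMD"
echo "$FORCEAOT_CMD"
```

Running both variants against the same workload, each with a fresh cache, would be one way to compare start-up, rampup, footprint, and throughput between the two modes.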

vijaysun-omr commented 4 years ago

@dsouzai to be completely clear, your point 2) to enable SVM during startup is only being proposed when -Xtune:virtualized is specified, right?

@mpirvu, while I support the proposed increase for the SCC size used with forceAOT for reasons mentioned by Irwin in his third point, I feel the pros and cons of enabling forceAOT by default have to be clearly stated. I believe there could be some impact on steady state throughput that would need to be verified, especially now that we have a "default SCC" (outside containers) and an "embedded SCC" (inside containers), both of which may affect applications running "out of the box". Plus, the steady state throughput impact would probably need to be measured not just with the current default size for the SCC but also with the proposed increased size (which could, in theory, change the throughput impact). Finally, there could be some footprint impact as well. It may be that we still lean towards accepting forceAOT by default once we have all this data, but my main point is that we need to get a fresh set of data to move forward.

dsouzai commented 4 years ago

@dsouzai to be completely clear, your point 2) to enable SVM during startup is only being proposed when -Xtune:virtualized is specified, right?

Yeah that's right.

hangshao0 commented 4 years ago

If we decide to increase the default (softmax) size of SCC, it is worth also doing some measurements on how much more AOT code (relative to other data in the SCC) is generated with forceAOT/-Xtune:virtualized.

mpirvu commented 4 years ago

If we decide to increase the default (softmax) size of SCC, it is worth also doing some measurements on how much more AOT code (relative to other data in the SCC) is generated with forceAOT/-Xtune:virtualized.

Absolutely! Nothing will be delivered without relevant performance measurements (start-up/rampup/footprint/throughput).