eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Shared cache hints for GC heap size #3743

Closed pshipton closed 5 years ago

pshipton commented 5 years ago

Remembering the previous heap size settings (-Xmn, -Xmo) after startup can provide a significant startup benefit in subsequent runs, and can improve footprint as well. The GC data can be stored in the shared cache as a hint. See https://unbscholar.lib.unb.ca/islandora/object/unbscholar%3A8100/datastream/PDF/view and https://ieeexplore.ieee.org/document/8121911

Since a shared cache can be used to run more than one application, which may have different heap requirements, the hint should be associated with an application, or at least a main class.

Because the GC is, by design, initialized before the shared cache, new GC APIs will be needed to adjust the GC heap parameters after initialization. Since the heap parameters can be adjusted before any objects are created, the adjustment should come at very low cost.

Doc issue https://github.com/eclipse/openj9-docs/issues/324

pshipton commented 5 years ago

@vijaysun-omr @mpirvu @amicic @hangshao0 fyi

hangshao0 commented 5 years ago

I guess we are doing this only on gencon ?

vijaysun-omr commented 5 years ago

Maybe balanced GC too? We haven't done any experiments with balanced GC to say if/how much it helps, but I'd like to think that we need to stop treating balanced as a second-class citizen if it's not a huge amount of extra work.

hangshao0 commented 5 years ago

Do we have -Xmn, -Xmo on balanced GC ? @amicic https://www.ibm.com/support/knowledgecenter/SSYKE2_8.0.0/openj9/xmn/index.html https://www.ibm.com/support/knowledgecenter/SSYKE2_8.0.0/openj9/xmo/index.html

pshipton commented 5 years ago

bin/java -Xgcpolicy:balanced -verbose:sizes

  -Xmns8M         initial new space size
  -Xmnx128M       maximum new space size

  -Xmos8M         initial old space size
  -Xmox512M       maximum old space size

amicic commented 5 years ago

-Xmn is as relevant in Balanced as in Gencon, although the meaning differs slightly: in Gencon it is the total Nursery (both Allocate and Survivor), whereas in Balanced it is only Allocate (actually called Eden). Bottom line: we should treat them the same and set -Xmns based on the recommendation stored in the SC.

I thought that -Xmo had no semantic meaning in Balanced, but we do seem to obey the option and act on it. I just did a quick test, and as far as I can tell it effectively ends up affecting the total heap sizing (pretty much acting like -Xmx/-Xms). See for example:

./java -verbose:gc,sizes -Xmx64M -Xmos32M -Xgcpolicy:gencon

  -Xmns10880K     initial new space size
  -Xmnx16M        maximum new space size
  -Xms43648K      initial memory size
  -Xmos32M        initial old space size
  -Xmox54656K     maximum old space size
  -Xmx64M         memory maximum

vs

./java -verbose:gc,sizes -Xmx64M -Xmos32M -Xgcpolicy:balanced

  -Xmns16M        initial new space size
  -Xmnx16M        maximum new space size
  -Xms32M         initial memory size
  -Xmos32M        initial old space size
  -Xmox64M        maximum old space size
  -Xmx64M         memory maximum

Although it has a different meaning for Gencon and Balanced (old space vs. all space), I think it would still be OK to use it.

In short, -Xmns and -Xmos could be used for both GC policies to uniquely determine the initial total heap and the division between Nursery/Eden and the rest of the heap.
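
The arithmetic behind the two -verbose:sizes outputs in this comment can be checked with a short sketch. The parse_size helper is hypothetical (it is not part of OpenJ9) and only understands the K/M/G suffixes seen in the output. For gencon the reported -Xms equals -Xmns plus -Xmos; for balanced the reported -Xms equals -Xmos, which is consistent with -Xmos acting on the total heap there.

```python
# Hypothetical helper: convert a size string such as "10880K" or "32M" to bytes.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(text):
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)  # plain byte count, no suffix

# Values copied from the two -verbose:sizes outputs in this comment.
gencon = {"Xmns": "10880K", "Xmos": "32M", "Xms": "43648K"}
balanced = {"Xmns": "16M", "Xmos": "32M", "Xms": "32M"}

# Gencon: initial heap = initial new space + initial old space.
gencon_sum = parse_size(gencon["Xmns"]) + parse_size(gencon["Xmos"])
gencon_matches = gencon_sum == parse_size(gencon["Xms"])

# Balanced: the initial heap equals -Xmos, i.e. -Xmos behaves like -Xms here.
balanced_matches = parse_size(balanced["Xmos"]) == parse_size(balanced["Xms"])
```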

dmitripivkine commented 5 years ago

However, the stored values should be applied with care, because they might depend on the GC policy under which they were recorded. Should the stored values be accompanied by the GC policy, or can we make them policy-invariant?

pshipton commented 5 years ago

Yes, we'll have to work out all the relevant environment options that need to match. Or we could store the entire command line.

DanHeidinga commented 5 years ago

We should design the data record to be extensible in the future. I think there are other values that would be interesting to store from run to run.

mpirvu commented 5 years ago

Another point to consider: most of the information stored in the SCC is read-only, one notable exception being the JIT hints. For maximum flexibility it would be best to allow RW access to these GC hints.

hangshao0 commented 5 years ago

We can store the main class name and GC policy. If we want to store Xmx, we might want to store Xms as well. There could be so many combinations of Xmx and Xms.

hangshao0 commented 5 years ago

We should probably turn this feature off if the user specified any of -Xmns/-Xmnx/-Xmn/-Xmos/-Xmox/-Xmo/-Xms/-Xmx. I guess -Xmoi, -Xmine and -Xmaxe also matter here?

pshipton commented 5 years ago

If we want to store Xmx, we might want to store Xms as well. There could be so many combinations of Xmx and Xms.

The suggestion to store -Xmx was for the purpose of validating the hint. i.e. a different -Xmx invalidates the hint. However, I'm leaning towards storing the entire command line. As long as the command line remains the same, the gc sizing hints for that command line remain valid. There can be different hints stored for different command lines. I think storing the entire command line should be the first approach. Afterwards we could consider filtering out specific command line options as not being relevant to the GC hints, but not sure this is a necessary feature. During production I expect the command lines don't change.

Probably we should turn this feature off if user specified any of -Xmns/-Xmnx/-Xmn/-Xmos/-Xmox/-Xmo

Agreed. If the values are set explicitly then they shouldn't be overridden.
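
A minimal sketch of that check, assuming the VM arguments are available as a list of strings. The function name hints_allowed is hypothetical, and requiring a digit after the option name is my own assumption, made so that unrelated options sharing a prefix (for example -Xmso) are not caught. -Xmoi, -Xmine and -Xmaxe are left out because the thread leaves them as an open question.

```python
import re

# Options named in the thread; each must be followed by a size such as 64M.
EXPLICIT_SIZING = re.compile(r"^-X(mns|mnx|mn|mos|mox|mo|ms|mx)\d")

def hints_allowed(vm_args):
    """Return False when the user already fixed any heap size explicitly."""
    return not any(EXPLICIT_SIZING.match(arg) for arg in vm_args)
```

For example, hints_allowed(["-Xgcpolicy:gencon"]) is true, while adding -Xmx64M or -Xmn1G to the list turns the hints off.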

hangshao0 commented 5 years ago

Here is an example of the entire command line options if I run a simple app:

java -verbose:init --module-path /bluebird/builds/bld_403141/jvmtest/test/SE90/functional/cmdLineTests/utils/utils.jar -m utils/org.openj9.test.ivj.Hanoi 2

Option 0 optionString="-Xoptionsfile=/team/hangshao/JVM29/jvmxa6490/lib/options.default" extraInfo=(nil) from environment variable ="N/A"
Option 1 optionString="-Xlockword:mode=default,noLockword=java/lang/String,noLockword=java/util/MapEntry,noLockword=java/util/HashMap$Entry,noLockword=org/apache/harmony/luni/util/ModifiedMap$Entry,noLockword=java/util/Hashtable$Entry,noLockword=java/lang/invoke/MethodType,noLockword=java/lang/invoke/MethodHandle,noLockword=java/lang/invoke/CollectHandle,noLockword=java/lang/invoke/ConstructorHandle,noLockword=java/lang/invoke/ConvertHandle,noLockword=java/lang/invoke/ArgumentConversionHandle,noLockword=java/lang/invoke/AsTypeHandle,noLockword=java/lang/invoke/ExplicitCastHandle,noLockword=java/lang/invoke/FilterReturnHandle,noLockword=java/lang/invoke/DirectHandle,noLockword=java/lang/invoke/ReceiverBoundHandle,noLockword=java/lang/invoke/DynamicInvokerHandle,noLockword=java/lang/invoke/FieldHandle,noLockword=java/lang/invoke/FieldGetterHandle,noLockword=java/lang/invoke/FieldSetterHandle,noLockword=java/lang/invoke/StaticFieldGetterHandle,noLockword=java/lang/invoke/StaticFieldSetterHandle,noLockword=java/lang/invoke/IndirectHandle,noLockword=java/lang/invoke/InterfaceHandle,noLockword=java/lang/invoke/VirtualHandle,noLockword=java/lang/invoke/PrimitiveHandle,noLockword=java/lang/invoke/InvokeExactHandle,noLockword=java/lang/invoke/InvokeGenericHandle,noLockword=java/lang/invoke/VarargsCollectorHandle,noLockword=java/lang/invoke/ThunkTuple" extraInfo=(nil) from environment variable ="N/A"
Option 2 optionString="-Xjcl:jclse9_29" extraInfo=(nil) from environment variable ="N/A"
Option 3 optionString="-Dcom.ibm.oti.vm.bootstrap.library.path=/team/hangshao/JVM29/jvmxa6490/lib/amd64/compressedrefs:/team/hangshao/JVM29/jvmxa6490/lib/amd64" extraInfo=(nil) from environment variable ="N/A"
Option 4 optionString="-Dsun.boot.library.path=/team/hangshao/JVM29/jvmxa6490/lib/amd64/compressedrefs:/team/hangshao/JVM29/jvmxa6490/lib/amd64" extraInfo=(nil) from environment variable ="N/A"
Option 5 optionString="-Djava.library.path=/team/hangshao/JVM29/jvmxa6490/lib/amd64/compressedrefs:/team/hangshao/JVM29/jvmxa6490/lib/amd64:/usr/local/cuda-5.5/lib64:.:/usr/lib64:/usr/lib" extraInfo=(nil) from environment variable ="N/A"
Option 6 optionString="-Djava.home=/team/hangshao/JVM29/jvmxa6490" extraInfo=(nil) from environment variable ="N/A"
Option 7 optionString="-Duser.dir=/team/hangshao/JVM29/jvmxa6490/bin" extraInfo=(nil) from environment variable ="N/A"
Option 8 optionString="-Djava.runtime.version=pxa6490ea-20170614_01" extraInfo=(nil) from environment variable ="N/A"
Option 9 optionString="-verbose:init" extraInfo=(nil) from environment variable ="N/A"
Option 10 optionString="--module-path=/bluebird/builds/bld_403141/jvmtest/test/SE90/functional/cmdLineTests/utils/utils.jar" extraInfo=(nil) from environment variable ="N/A"
Option 11 optionString="-Djdk.module.main=utils" extraInfo=(nil) from environment variable ="N/A"
Option 12 optionString="-Dsun.java.command=utils/org.openj9.test.ivj.Hanoi 2" extraInfo=(nil) from environment variable ="N/A"
Option 13 optionString="-Dsun.java.launcher=SUN_STANDARD" extraInfo=(nil) from environment variable ="N/A"
Option 14 optionString="-Dsun.java.launcher.pid=21548" extraInfo=(nil) from environment variable ="N/A"
Option 15 optionString="_org.apache.harmony.vmi.portlib" extraInfo=0x7f278000ca20 from environment variable ="N/A"

There are so many default options prepended/appended that are not related to this feature at all. The old space and new space sizes are only two numbers, but the entire command line is such a long string. I guess it is not worth storing the whole command line. Also, there is an option, -Dsun.java.launcher.pid, which will be different from run to run.

If GC is going to check for the presence of -Xmns/-Xmnx/-Xmn/-Xmos/-Xmox/-Xmo/-Xmx/-Xms/... and gc policy to decide whether to turn off this feature, storing the main class probably should be sufficient.

pshipton commented 5 years ago

Just the main class isn't sufficient, other parameters such as -Xmx, -Xms and gcpolicy need to be stored as well. If they change then the hint is invalidated. We can try to find a balance between parameters that matter and parameters that don't. We can filter all the default options out of the parameter list and maybe some other specific parameters as well, but in general I think if options are modified then the behavior of the app can change and invalidate the hint.
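
One way to picture the filtering idea is the sketch below. The names VOLATILE_PREFIXES and hint_key are hypothetical, and the filter list is only an assumption: -Dsun.java.launcher.pid is the one option the thread names as changing from run to run, and the second and third pid values are invented for illustration.

```python
# Hypothetical filter list; only the launcher pid is named in the thread as
# changing from run to run. A real list would have to be agreed on separately.
VOLATILE_PREFIXES = ("-Dsun.java.launcher.pid=",)

def hint_key(options):
    """Build a comparison key from the option strings that should matter."""
    kept = [opt for opt in options if not opt.startswith(VOLATILE_PREFIXES)]
    return tuple(kept)

command = "-Dsun.java.command=utils/org.openj9.test.ivj.Hanoi 2"
run1 = ["-Xgcpolicy:gencon", command, "-Dsun.java.launcher.pid=21548"]
run2 = ["-Xgcpolicy:gencon", command, "-Dsun.java.launcher.pid=30001"]
run3 = ["-Xgcpolicy:balanced", command, "-Dsun.java.launcher.pid=30002"]
```

Here run1 and run2 produce the same key, so a hint stored by one would be valid for the other, while run3 changes the GC policy and therefore gets a different key.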

hangshao0 commented 5 years ago

I will save the new/old space sizes as well as the following info: main module/main class (sun.java.command); -Xgc and -Xgcpolicy; -Xmx; -Xms; -Xsoftmx; -Xmoi; java.class.path; jdk.module.path

Do you see any GC options that are missing here ? @amicic @dmitripivkine

amicic commented 5 years ago

As @pshipton suggested, I would not even try to recognize various -Xm? options (or any other option). If anything in the options changed, the hints would be invalidated. It would be, in general, complicated to try to interpret -Xm? options to validate that they effectively mean the same initial/total heap sizing, even though the options themselves are different.

For example, these two are effectively the same:
1) -Xmx4G -Xms4G
2) -Xmx4G -Xms4G -Xmn1G
but I would still invalidate the hints.

The hints for initial heap sizing would come from internal API and have no relationship with the options. We are yet to agree what API is to be used, but effectively we need two:

dmitripivkine commented 5 years ago

I believe we need to keep the GC policy as well

DanHeidinga commented 5 years ago

If anything in options changed, the hints would be invalidated.

+1 to this approach. For the initial implementation, we should just validate that the command line is the same.

hangshao0 commented 5 years ago

I believe we need to keep the GC policy as well

Yes, I should say -Xgc and -Xgcpolicy

hangshao0 commented 5 years ago

OK. Then I am going to save the entire command line.

DanHeidinga commented 5 years ago

@hangshao0 Any update on this? We're getting close to the reality check date for the 0.12.0 milestone

hangshao0 commented 5 years ago

Any update on this?

Still writing the code. I guess it might take me 1-2 more days for the first set of VM code changes. Once it is reviewed and merged, GC needs to change their code. Then the VM can start storing the GC hints and enable this feature.

hangshao0 commented 5 years ago

I think we need some new tests as well.

pshipton commented 5 years ago

Note the shared cache part of this, #3908, is merged

DanHeidinga commented 5 years ago

GC changes are in #4168

pshipton commented 5 years ago

The feature isn't enabled by default, so I think we should keep this issue open. However, the work is complete for the 0.12 release, particularly since the shared cache is no longer enabled by default.

vijaysun-omr commented 5 years ago

What is the option to enable this feature with a v0.12 OpenJ9 build ? We would like to try it on some tests to see the behaviour.

pshipton commented 5 years ago

@vijaysun-omr see #4168 for details:
  -XXgc:heapSizeStatupHintWeightNewValue=
  -XXgc:heapSizeStatupHintConservativeFactor=
You also need to enable shared classes, either normally or via -Xshareclasses:bootClassesOnly

amicic commented 5 years ago

@vijaysun-omr In short, just add -XXgc:heapSizeStatupHintWeightNewValue=80.

Longer story:
1) the option is to be renamed soon (https://github.com/eclipse/openj9/pull/4240)
2) we'll likely introduce another option, which will be documented, just to enable/disable the feature
3) Balanced GC is not covered yet
4) we need to check/test whether there is a race at startup between expand-due-to-hints and the first GC, and potentially take some remedies
5) we continuously update the hints on every restart. Hint values typically continue growing, and it may take a few updates to converge to a stable value (this is what we need feedback for)
6) updates acquire a global SC lock, and there might be contention when a high number of VMs are started at about the same time. If there are real-life problems we may limit the number of updates or remove them altogether (just create the hint on first start and never update).
7) in a generational configuration, Tenure and Nursery hints are maintained independently. There is something of an anomaly in that the Nursery hint grows aggressively (on restarts), while the Tenure hint grows on the first restart but may continue to decline on subsequent ones. This is because with a large Nursery all early-created objects end up staying there for a long time, making the first few Scavenges expensive and wanting to expand even more. On the other side, it takes longer before we start Tenuring, and there is less need to expand Tenure. Perhaps this could be compensated for by decreasing the Tenure age threshold with a larger Nursery (the initial threshold is always 10, no matter what the heap size is), but that is a fairly independent issue to investigate.
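
To illustrate point 5, here is a sketch of why a hint can need a few restarts to settle. It ASSUMES the stored hint is blended with the newly observed size as a plain weighted average, with the weight given as a percentage; the option name suggests something like this, but the actual formula is not spelled out in this thread. The function name update_hint and all sizes are illustrative, and the observed size is held constant for simplicity even though in practice it depends on the hint that was read.

```python
def update_hint(old_hint, observed, weight_new_percent=80):
    """Blend the stored hint with the size observed in the latest run.

    ASSUMPTION: a simple weighted average; the thread does not spell out the
    actual formula behind -XXgc:heapSizeStatupHintWeightNewValue.
    """
    w = weight_new_percent / 100.0
    return w * observed + (1.0 - w) * old_hint

hint = 8.0  # MB, illustrative starting value
history = []
for observed in [64.0, 64.0, 64.0, 64.0]:  # illustrative size seen per restart
    hint = update_hint(hint, observed)
    history.append(hint)
```

Under these assumptions the hint climbs toward the observed size over several restarts rather than reaching it in one step.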

amicic commented 5 years ago

Note also that the command line options must fully match for hints to be taken into account. This also includes implicit options induced by the JCL. It's been observed that the order of these options may change (probably due to some startup race), in which case the existing hints are silently ignored and then overwritten with completely fresh values. It will then again take a few more restarts for them to converge (assuming the order of options stays the same).
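
The effect of a reordered option list on an exact match can be seen in a tiny sketch. The option strings are made up for illustration. Comparing sorted lists is shown only as a contrast, not as something proposed in this thread; option order can matter to the JVM, so it would not necessarily be a safe fix.

```python
stored = ("-Xshareclasses", "-Dfoo=1", "-Dbar=2")      # illustrative options
reordered = ("-Xshareclasses", "-Dbar=2", "-Dfoo=1")   # same set, new order

exact_match = stored == reordered                   # order-sensitive comparison
same_options = sorted(stored) == sorted(reordered)  # order-insensitive comparison
```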

With multiple VMs starting at the same time, values will not converge. For convergence to make progress, a VM run must read the values (and expand on them) after the previous run has updated them. For simultaneous VM starts, all of them read the same value, and all of them will update it with their new value, but effectively only the slowest/last one to update will 'win'.
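
A toy model of that last-writer-wins behaviour, under an invented growth rule (each run expands 16 units past the hint it read, capped at 64). The function names and numbers are illustrative only and do not reflect the real GC heuristics.

```python
def next_size(hint):
    # Illustrative growth model: each run expands a little past the hint it
    # read, up to a stable size of 64 (arbitrary units).
    return min(64, hint + 16)

def sequential(starts, hint=8):
    for _ in range(starts):
        hint = next_size(hint)      # each run reads the previous run's update
    return hint

def simultaneous(starts, hint=8):
    read = [hint] * starts          # every VM reads the same stored value
    written = [next_size(v) for v in read]
    return written[-1]              # the last writer overwrites the others
```

With four starts, the sequential case reaches the stable size, while the simultaneous case ends up no further along than a single start would.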

SueChaplain commented 5 years ago

If we are having -XX options, presumably this requires documentation? If so, please add the doc:externals label.

amicic commented 5 years ago

[EDIT: non-issue] Another issue to think about is how to deal with a heterogeneous multi-JVM environment. Take a simple scenario of a master and a slave JVM.

They will certainly have different command line options, and even though there might be a single anonymous shared cache, the hints will not be used when starting the slave (after starting the master).

If both master and slave are restarted at some point later (and brought up in the same order, master first and slave second), again, no hints will be used at any of the restarts.

If only one of them is restarted (more likely the slave), hints will be used, but this is not a very realistic scenario.

Simple solutions are:

pshipton commented 5 years ago

I don't understand the following statement. Why aren't hints saved and used on restart?

If both master and slave are restarted at some point later (and brought up in the same order, master first and slave second), again, no hints will be used at any of the restarts.

amicic commented 5 years ago

Hints are always saved, but not used because of the command line mismatch. When master and slave are started in this order, and the slave is the last one that saved the hint, and then both are restarted in the same order, the master will try to read the hints that the slave saved and ignore them. Shortly after, when the slave is also brought up, it may not be able to read them if the master has just updated them.

pshipton commented 5 years ago

This isn't (or shouldn't be) the way it works. Every different command line has its own hints, which can be used independently, i.e. the master will use the hints for the master command line, and similarly for the slave.
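
As a rough illustration of hints kept per command line, here is a toy stand-in for the shared cache. The class HintStore, its methods, the main class names and the sizes are all hypothetical; this is not the shared cache API, just a dictionary keyed by the full command line.

```python
class HintStore:
    """Toy stand-in for the shared cache: one hint record per command line."""

    def __init__(self):
        self._hints = {}

    def store(self, command_line, new_space, old_space):
        self._hints[tuple(command_line)] = (new_space, old_space)

    def find(self, command_line):
        # Returns None when no hint was stored for exactly this command line.
        return self._hints.get(tuple(command_line))

cache = HintStore()
master_cmd = ["java", "-Xshareclasses", "MasterMain"]  # illustrative names
slave_cmd = ["java", "-Xshareclasses", "SlaveMain"]
cache.store(master_cmd, 64, 256)
cache.store(slave_cmd, 16, 128)
```

Each JVM then looks up only its own record: cache.find(master_cmd) and cache.find(slave_cmd) return different hints, and neither update overwrites the other.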

amicic commented 5 years ago

Agreed, it would be very nice to have that capability, but I don't think that's in place, is it?

pshipton commented 5 years ago

It should already work that way. Unless you have seen different behavior?

amicic commented 5 years ago

On closer inspection it does seem to work. I apologize for the confusion.

vijaysun-omr commented 5 years ago

Is there some limit on the number of different command lines that we can store GC hints for in a given SCC ? fyi @mpirvu because we were discussing this today.

pshipton commented 5 years ago

@vijaysun-omr the only limit is the size of the cache. We wondered if we needed to add a limit, but not sure if it is necessary. Likely there will be something that changes the command line all the time, but otherwise there probably aren't that many variations.

pshipton commented 5 years ago

It's been observed that order of these options may change (probably due to some startup race)

@amicic what options changed? Something to do with the VM that we can fix, or something to do with the application?

vijaysun-omr commented 5 years ago

@pshipton we (Marius and I) were also wondering about the same thing, i.e. potential issues with storing as many distinct paths as can be fit. One case that we have seen change the path (directory where java is installed) continuously is Hadoop when the master installs workers on several machines repeatedly and uses slightly different directory names every time.

pshipton commented 5 years ago

In a case like that there is no benefit to storing the hints. It would be better to either disable them, or only store a subset of the command line which doesn't change.

Do you have an example Hadoop command line?

mpirvu commented 5 years ago

I didn't keep the Hadoop command line, but the pattern is that the master sends the jar files through the network to the slaves, and the slaves create a temporary directory and copy the jar files there (example: /data6_p2/bi18/mapred/local/taskTracker/biadmin18/jobcache/job_201403251225_0001/jars). Hadoop was the reason we added the -Xshareclasses:restrictClassPaths option.

pshipton commented 5 years ago

For this specific case, we could store a subset of the command line when -Xshareclasses:restrictClassPaths is specified, and ignore the -cp option.

SueChaplain commented 5 years ago

Please open an issue at the docs repo when you have agreed on the externals for this item: https://github.com/eclipse/openj9-docs/issues/new?template=new-documentation-change.md

pshipton commented 5 years ago

@amicic we'd like to ensure this gets delivered for the 0.15 milestone. Do you have an outlook for completion?

DanHeidinga commented 5 years ago

@amicic gentle ping. Can you update this with the outlook?

amicic commented 5 years ago

With some luck there is not much work left:

Balanced GC will not be covered in the first release.