eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 720 forks source link

Liberty Daytrader7 startup performance gap between java8 and java11 #11015

Open harryyu1994 opened 4 years ago

harryyu1994 commented 4 years ago
Results for JDK=/home/harryayu2/sdks/java11_new jvmOpts=-Xmx512m -Xshareclasses:name=liberty -Xscmx500m -Xscmaxaot200m
StartupTime avg=3946    min=3644    max=4141    stdDev=103.6    maxVar=13.6%    confInt=0.70%   samples= 40
Footprint   avg=257620  min=228896  max=281172  stdDev=16879.8  maxVar=22.8%    confInt=1.74%   samples= 40
CThreadTime avg= 926    min= 753    max=1212    stdDev=109.0    maxVar=61.0%    confInt=3.13%   samples= 40
ProcessTime avg=6915    min=6150    max=8020    stdDev=488.0    maxVar=30.4%    confInt=1.88%   samples= 40
Stats for cold run:
StartupTime avg=10132   min=10029   max=10210   stdDev=76.4 maxVar=1.8% confInt=0.72%   samples=  5
Footprint   avg=277718  min=271576  max=281340  stdDev=3781.2   maxVar=3.6% confInt=1.30%   samples=  5
CThreadTime avg=8176    min=8135    max=8199    stdDev=28.2 maxVar=0.8% confInt=0.41%   samples=  4
    Outlier values:  8303
ProcessTime avg=21174   min=21020   max=21390   stdDev=149.1    maxVar=1.8% confInt=0.67%   samples=  5
mpirvu commented 3 years ago

Most modules show increased number of ticks with Java11: VM dll +500 (+150 from interpreter) JIT code: +220 GC: +200 (possibly more objects being allocated ?) Kernel: +220 Interestingly, this time, the JIT dll has less ticks in Java11.

To me this looks like larger pathlength overall.

mpirvu commented 3 years ago

One other thing we can try is to enable SVM during start-up. This is currently disabled because it creates bigger AOT bodies so it needs a larger SCC. In these experiments we are using a large SCC, so maybe it will provide some advantage.

harryyu1994 commented 3 years ago

Java8

Results for JDK=/home/harryayu2/sdks/java8 jvmOpts=-Xmx512m -Xshareclasses:name=liberty -Xscmx200m -Xscmaxaot80m pid=0
StartupTime avg=3597    min=3403    max=3820    stdDev=102.1    maxVar=12.2%    confInt=0.76%   samples= 40
Footprint   avg=290212  min=270768  max=307344  stdDev=7042.0   maxVar=13.5%    confInt=0.69%   samples= 35
    Outlier values:  225696 239956 233140 225588 240432
CThreadTime avg=1683    min=1318    max=2532    stdDev=262.0    maxVar=92.1%    confInt=4.37%   samples= 36
    Outlier values:  3889 3297 2663 3428
ProcessTime avg=7164    min=6440    max=7680    stdDev=309.5    maxVar=19.3%    confInt=1.18%   samples= 38
    Outlier values:  10020 8570
Stats for cold run:
StartupTime avg=17073   min=16685   max=17342   stdDev=256.7    maxVar=3.9% confInt=1.43%   samples=  5
Footprint   avg=351838  min=335312  max=395768  stdDev=26002.7  maxVar=18.0%    confInt=7.05%   samples=  5
CThreadTime avg=60718   min=58886   max=62083   stdDev=1270.0   maxVar=5.4% confInt=1.99%   samples=  5
ProcessTime avg=75816   min=74490   max=76530   stdDev=824.8    maxVar=2.7% confInt=1.04%   samples=  5

base address                         = 0x00007F0E17859000
end address                          = 0x00007F0E24000000
allocation pointer                   = 0x00007F0E1A2467D8

cache layer                          = 0
cache size                           = 209714592
softmx bytes                         = 209714592
free bytes                           = 97384192
Reserved space for AOT bytes         = -1
Maximum space for AOT bytes          = 83886080
Reserved space for JIT data bytes    = -1
Maximum space for JIT data bytes     = -1
Metadata bytes                       = 2440196
Metadata % used                      = 2%
Class debug area size                = 16744448
Class debug area used bytes          = 7205370
Class debug area % used              = 43%

ROMClass bytes                       = 43964376
AOT bytes                            = 48346200
mpirvu commented 3 years ago

Is it better with SVM during start-up? The gap used to be 15% and now I see it more like 10%. Probably the correct comparison is Java11 with and without SVM.

harryyu1994 commented 3 years ago

Java8 might have increased with SVM. Measuring Java11 with/without SVM right now.

harryyu1994 commented 3 years ago
harryyu1994 commented 3 years ago

Warm runs

Java8

<Loaded com/ibm/ws/kernel/instrument/EnvCheck>
<  Class size 1843; ROM size 1320; debug size 0>
<  Read time 0 usec; Load time 60 usec; Translate time 203 usec>
<Loaded com/ibm/ws/kernel/boot/cmdline/EnvCheck>
<  Class size 2626; ROM size 1656; debug size 0>
<  Read time 1 usec; Load time 33 usec; Translate time 61 usec>
Starting server dt7.
Server dt7 started with process ID 1593149.

Java11

JImage interface is using jimage library
<Loaded jdk/internal/module/SystemModuleFinders$SystemModuleReader$$Lambda$1/0x0000000000000000>
<  Class size 858; ROM size 1048; debug size 0>
<  Read time 0 usec; Load time 30 usec; Translate time 208 usec>
<Loaded jdk/internal/loader/BuiltinClassLoader$$Lambda$2/0x0000000000000000>
<  Class size 820; ROM size 1008; debug size 0>
<  Read time 0 usec; Load time 8 usec; Translate time 5419 usec>
<Loaded com/ibm/ws/kernel/instrument/EnvCheck>
<  Class size 1843; ROM size 1328; debug size 0>
<  Read time 0 usec; Load time 34 usec; Translate time 62 usec>
<Loaded com/ibm/ws/kernel/boot/cmdline/EnvCheck>
<  Class size 2626; ROM size 1640; debug size 0>
<  Read time 0 usec; Load time 32 usec; Translate time 82 usec>
<Loaded java/util/regex/Pattern$BitClass$$Lambda$3/0x0000000000000000>
<  Class size 554; ROM size 744; debug size 0>
<  Read time 0 usec; Load time 7 usec; Translate time 48 usec>
<Loaded java/util/regex/Pattern$$Lambda$4/0x0000000000000000>
<  Class size 666; ROM size 856; debug size 0>
<  Read time 0 usec; Load time 8 usec; Translate time 26 usec>
<Loaded java/util/regex/Pattern$CharPredicate$$Lambda$5/0x0000000000000000>
<  Class size 620; ROM size 816; debug size 0>
<  Read time 0 usec; Load time 7 usec; Translate time 81 usec>
<Loaded java/util/regex/Pattern$$Lambda$6/0x0000000000000000>
<  Class size 535; ROM size 728; debug size 0>
<  Read time 0 usec; Load time 7 usec; Translate time 30 usec>
<Loaded java/util/regex/Pattern$$Lambda$7/0x0000000000000000>
<  Class size 369; ROM size 536; debug size 0>
<  Read time 0 usec; Load time 5 usec; Translate time 26 usec>
<Loaded java/util/regex/CharPredicates$$Lambda$8/0x0000000000000000>
<  Class size 546; ROM size 736; debug size 0>
<  Read time 0 usec; Load time 20 usec; Translate time 29 usec>
<Loaded java/util/regex/Pattern$CharPredicate$$Lambda$9/0x0000000000000000>
<  Class size 779; ROM size 984; debug size 0>
<  Read time 0 usec; Load time 8 usec; Translate time 27 usec>
<Loaded java/util/ResourceBundle$ResourceBundleControlProviderHolder$$Lambda$10/0x0000000000000000>
<  Class size 477; ROM size 648; debug size 0>
<  Read time 0 usec; Load time 7 usec; Translate time 239 usec>
<Loaded java/lang/ModuleLayer$$Lambda$11/0x0000000000000000>
<  Class size 683; ROM size 872; debug size 0>
<  Read time 0 usec; Load time 11 usec; Translate time 175 usec>
<Loaded java/util/ResourceBundle$ResourceBundleControlProviderHolder$$Lambda$12/0x0000000000000000>
<  Class size 462; ROM size 632; debug size 0>
<  Read time 0 usec; Load time 6 usec; Translate time 26 usec>
<Loaded java/util/stream/Collectors$$Lambda$13/0x0000000000000000>
<  Class size 360; ROM size 528; debug size 0>
<  Read time 0 usec; Load time 6 usec; Translate time 301 usec>
<Loaded java/util/stream/Collectors$$Lambda$14/0x0000000000000000>
<  Class size 417; ROM size 592; debug size 0>
<  Read time 0 usec; Load time 6 usec; Translate time 32 usec>
<Loaded java/util/stream/Collectors$$Lambda$15/0x0000000000000000>
<  Class size 512; ROM size 688; debug size 0>
<  Read time 0 usec; Load time 6 usec; Translate time 28 usec>
<Loaded java/util/stream/Collectors$$Lambda$16/0x0000000000000000>
<  Class size 417; ROM size 584; debug size 0>
<  Read time 0 usec; Load time 6 usec; Translate time 26 usec>
<Loaded java/util/ResourceBundle$$Lambda$17/0x0000000000000000>
<  Class size 596; ROM size 784; debug size 0>
<  Read time 0 usec; Load time 8 usec; Translate time 29 usec>
<Loaded java/util/ResourceBundle$Control$$Lambda$18/0x0000000000000000>
<  Class size 687; ROM size 880; debug size 0>
<  Read time 0 usec; Load time 9 usec; Translate time 28 usec>
Starting server dt7.
<Loaded jdk/net/LinuxSocketOptions$$Lambda$62/0x0000000000000000>
<  Class size 409; ROM size 576; debug size 0>
<  Read time 0 usec; Load time 12 usec; Translate time 43 usec>
Server dt7 started with process ID 1597079.
harryyu1994 commented 3 years ago
hangshao0 commented 3 years ago
this one takes up majority of the time

<Loaded jdk/internal/loader/BuiltinClassLoader$$Lambda$2/0x0000000000000000>

For anon classes, we store them in the SCC hoping that we could get the AOT benefit from it. We won't get the class loading benefit if they are in the SCC. In fact class Translate will be slower because of additional comparisons. If the benefit of AOT is not big enough to offset the additional Translate time, we should just disable storing anon classes in the SCC .

I see it is all anon classes listed for java 11 in https://github.com/eclipse/openj9/issues/11015#issuecomment-729253847. So it is worth trying -XX:-ShareAnonymousClasses (not sure if it has been tried or not). Probably also worth trying -XX:-ShareUnsafeClasses.