eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Java Launcher is loaded at 0x80000000 on zLinux #7115

Closed dmitripivkine closed 4 years ago

dmitripivkine commented 5 years ago

I noticed that Java Launcher is loaded at address 0x80000000 on zLinux:

00010000-03210000 rw-p 00000000 00:00 0
03210000-0c810000 ---p 00000000 00:00 0
80000000-80001000 r-xp 00000000 5e:01 1057257                            /tmp/jdk-11.0.4+11/bin/java
80002000-80003000 r--p 00001000 5e:01 1057257                            /tmp/jdk-11.0.4+11/bin/java
80003000-80004000 rw-p 00002000 5e:01 1057257                            /tmp/jdk-11.0.4+11/bin/java
bca91000-bcab2000 rw-p 00000000 00:00 0                                  [heap]
bcab2000-bcac0000 ---p 00000000 00:00 0                                  [heap]
bcac0000-bd0c0000 rw-p 00000000 00:00 0
bd0c0000-156b30000 ---p 00000000 00:00 0
156b30000-156d30000 rw-p 00000000 00:00 0
156d30000-156d31000 ---p 00000000 00:00 0

This splits the virtual memory below the 4GB bar into two parts of ~1.8GB and ~1GB. The lack of contiguous memory prevents running a Compressed References JVM with a heap larger than 1.8GB using the most performant 0-bit shift.
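The two figures can be checked directly from the map excerpt above. The following is a minimal sketch, not part of OpenJ9 or the launcher; the helper names `parse_ranges` and `free_gaps_below` are hypothetical, and the excerpt is assumed to list every mapping below 4GB.

```python
import re

FOUR_GIB = 1 << 32

# Address ranges and permissions copied from the Java 11 map excerpt above
# (the offset, device, inode and path columns are dropped for brevity).
MAPS = """\
00010000-03210000 rw-p
03210000-0c810000 ---p
80000000-80001000 r-xp
80002000-80003000 r--p
80003000-80004000 rw-p
bca91000-bcab2000 rw-p
bcab2000-bcac0000 ---p
bcac0000-bd0c0000 rw-p
bd0c0000-156b30000 ---p
"""


def parse_ranges(maps_text):
    """Return sorted (start, end) pairs parsed from maps-style lines."""
    ranges = []
    for line in maps_text.splitlines():
        match = re.match(r"([0-9a-f]+)-([0-9a-f]+)", line.strip())
        if match:
            ranges.append((int(match.group(1), 16), int(match.group(2), 16)))
    return sorted(ranges)


def free_gaps_below(ranges, limit=FOUR_GIB):
    """Return (start, size) for every unmapped gap below `limit`."""
    gaps = []
    cursor = 0
    for start, end in ranges:
        if start >= limit:
            break
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, end)
    if cursor < limit:
        gaps.append((cursor, limit - cursor))
    return gaps


if __name__ == "__main__":
    gaps = sorted(free_gaps_below(parse_ranges(MAPS)), key=lambda g: g[1], reverse=True)
    for start, size in gaps[:2]:
        print(f"gap at {start:#x}: {size / (1 << 30):.2f} GiB")
```

The two largest gaps sit on either side of the launcher at 0x80000000, which is why no single contiguous region below 4GB is larger than the first one.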

IBM Java 8, for instance, does not have this problem; its Java Launcher is loaded at 0x00060000:

00060000-00061000 r-xp 00000000 5e:01 922839                             /tmp/jre/bin/java
00061000-00062000 r--p 00000000 5e:01 922839                             /tmp/jre/bin/java
00062000-00063000 rw-p 00001000 5e:01 922839                             /tmp/jre/bin/java
00063000-03263000 rw-p 00000000 00:00 0 
03263000-0c870000 ---p 00000000 00:00 0 
0c870000-0ce70000 rw-p 00000000 00:00 0 
0ce70000-2c670000 ---p 00000000 00:00 0 
2c670000-2c870000 rw-p 00000000 00:00 0 
2c870000-2c872000 ---p 00000000 00:00 0 
3a7e8000-3a809000 rw-p 00000000 00:00 0                                  [heap]
3ff48000000-3ff48021000 rw-p 00000000 00:00 0 

and can run with a ~3GB heap below the 4GB bar.

I believe the Java 11 build process is missing something that assigns a lower load address to the Java Launcher.

smlambert commented 5 years ago

@pshipton - is there someone to look at this, would it be someone from CL team?

pshipton commented 5 years ago

@andrew-m-leonard can someone take a look please. You might contact the IBM JCL team for how this problem was resolved.

M-Davies commented 5 years ago

@pshipton Looking into this

M-Davies commented 5 years ago

@pshipton I've been digging through the old mercurial commits on rt-patch and found one that sets up the memory space available to AIX systems when building the launcher. This code should be in https://github.com/ibmruntimes/openj9-openjdk-jdk11/blob/cc4272e86c12e635710cca2a4c5833c37e398c7b/src/java.base/unix/native/libjli/java_md_solinux.c#L1 but isn't and I'm wondering if this is the cause.

It uses LDR_CNTRL (https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/j9_configure_aix_ldr_cntrl.html) to decide whether to calculate automatically how much memory to allocate or to use a manually specified environment variable value as the memory size, MAXDATA. I can zip up the code snippet in a file and include it here if that makes it easier to understand; however, it does not come with a license.

M-Davies commented 5 years ago

I also found another potential cause at https://github.com/AdoptOpenJDK/openjdk-build/blob/0a03a3ecec069613e80150f75aedd43d2869d668/build-farm/platform-specific-configurations/aix.sh#L34 which sets MAXDATA to 0x80000000.

pshipton commented 5 years ago

The doc says that LDR_CNTRL is specific to 32-bit, so I don't think that's it.

M-Davies commented 5 years ago

There is a 64-bit defined section within this file. This uses ulimit and a rlimit64 struct to organise the memory (https://www.gnu.org/software/libc/manual/html_node/Limits-on-Resources.html). Will start picking this apart to make sure I haven't missed anything there

M-Davies commented 5 years ago

The 64-bit section appears to be normal too. The program gets the current and maximum data limits (using RLIMIT_DATA), checks whether the limit is infinite and, if not, attempts to set it to infinite. If that fails, it warns that the hard ulimit has not been set to infinite and that out-of-memory errors may occur. Will keep digging.
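The flow described above can be sketched as follows. This is an illustration in Python using the standard `resource` module, not the launcher's C code; the function name `raise_data_limit` and the warning text are hypothetical.

```python
import resource


def raise_data_limit():
    """Try to make RLIMIT_DATA unlimited; return True on success, False otherwise."""
    soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
    if soft == resource.RLIM_INFINITY:
        return True  # already unlimited, nothing to do
    try:
        # Raising the hard limit needs privileges, so this may fail for a normal user.
        resource.setrlimit(
            resource.RLIMIT_DATA,
            (resource.RLIM_INFINITY, resource.RLIM_INFINITY),
        )
        return True
    except (ValueError, OSError):
        print("warning: data ulimit is not unlimited; out-of-memory errors may occur")
        return False


if __name__ == "__main__":
    print("data limit unlimited:", raise_data_limit())
```

Nothing in this flow chooses where the executable is mapped, which is consistent with the conclusion that the limit handling is not the cause of the 0x80000000 load address.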

pshipton commented 5 years ago

I expect there is a link option to set the preferred base address of the code. In #7458, Julian mentioned -T or -bpT.

pshipton commented 5 years ago

Maybe we could move the discussion about AIX to the AIX specific issue #7458

M-Davies commented 5 years ago

This issue is not unique to openj9. Hotspot also loads the java launcher at 0x80000000 on zlinux and at 0x00060000 on x86, even when the jdk is built on zlinux. My current hypothesis is that an ld command that would load the launcher in the right place is missing somewhere within the build scripts. I'm currently looking into where exactly this missing command should be (most likely a script that's called from make/launcher/LauncherCommon.gmk).

dmitripivkine commented 5 years ago

> This issue is not unique to openj9. Hotspot also loads the java launcher at 0x80000000 on zlinux and at 0x00060000 on x86, even when the jdk is built on zlinux. My current hypothesis is that an ld command that would load the launcher in the right place is missing somewhere within the build scripts. I'm currently looking into where exactly this missing command should be (most likely a script that's called from make/launcher/LauncherCommon.gmk).

I am not sure Hotspot cares about this. The Hotspot Compressed Refs implementation does not rely on special usage of virtual memory below the 4G bar.

M-Davies commented 5 years ago

RTC PR 100052 is the work item where this problem was fixed for IBM Java 8. CompileLaunchers.txt holds the patch details, and the patch is not present in OpenJDK's code. I'm currently trying to build and/or run the java launcher with this patch included.

M-Davies commented 5 years ago

The work items that will fix this bug are:

IBM JAVA 8: https://jazz103.hursley.ibm.com:9443/jazz/web/projects/JTC-JAT#action=com.ibm.team.workitem.viewWorkItem&id=100052

IBM JAVA 9: https://jazz103.hursley.ibm.com:9443/jazz/web/projects/JTC-JAT#action=com.ibm.team.workitem.viewWorkItem&id=105268

@pshipton Do you know what variable sets the load flags for the java launcher in JDK8/11?

pshipton commented 5 years ago

The javaw launcher (JAVAW_LDFLAGS) is only built for Windows.

For Java 8, I see $1_LDFLAGS used in CompileLaunchers.gmk. For Java 11+, I also see $1_LDFLAGS used in LauncherCommon.gmk.

I assume this means JAVA_LDFLAGS will affect the java launcher, but really the change should be generic and change all the launchers, not just java.

M-Davies commented 5 years ago

Hmmm. There are a lot of places within https://github.com/ibmruntimes/openj9-openjdk-jdk11/blob/e7da16be04c9cf4e6734e3621a5f40e34001de8a/make/launcher/Launcher-java.base.gmk#L1 where you could plug this in. I'll trial some locations within the makefile, starting with LDFLAGS.

M-Davies commented 4 years ago

One thing has puzzled me: when running java -verbose:gc -Xmx2040m looper to check compressedRefsShift, the value is 0x0, indicating that compressed refs is still able to run (even though the base memory address remains at 0x80000000).

EDIT: That was from an unmodified JDK

pshipton commented 4 years ago

0x80000000 is the base address of the java executable, which doesn't need much memory (I guess less than 8m, i.e. 2048m - 2040m). It seems 2040m still fits in the available space. Try running java -verbose:gc -Xmx2040m -Xdump:java:events=vmstop and look at the "Object memory" data in the "MEMINFO subcomponent dump routine" section to see the object heap addresses. Although, when I try it, I get a "compressedRefsShift" of 0x1.

M-Davies commented 4 years ago

zlinux output:

[Image: zlinuxMemory]

The heap size is identical to an x86_linux machine that loads the JDK at 0x40000000 (below):

[Image: x64linuxMemory]

dmitripivkine commented 4 years ago

So what is your concern? You requested -Xmx2040m (2139095040 bytes) and got the allocation [0x80030000, 0xff830000] (2139095040 bytes). It fits into one of the halves of memory below the 4G bar.
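The figures quoted here can be verified with a few lines of arithmetic. This is only a check of the numbers in this comment; the variable names are illustrative, and the launcher's end address 0x80004000 is taken from the map at the top of the issue.

```python
MIB = 1 << 20
FOUR_GIB = 1 << 32

requested = 2040 * MIB                         # -Xmx2040m
heap_base, heap_top = 0x80030000, 0xFF830000   # allocation reported in this comment
launcher_end = 0x80004000                      # end of the java launcher mapping

allocated = heap_top - heap_base

print("requested bytes:", requested)
print("allocated bytes:", allocated)
print("heap starts above the launcher:", heap_base >= launcher_end)
print("heap ends below the 4G bar:", heap_top < FOUR_GIB)
```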

M-Davies commented 4 years ago

My concern was that compressedRefsShift was not producing an accurate value. However, running java -verbose:gc -Xmx2040m -Xdump:java:events=vmstop shows that it is. The screenshots were to confirm the point.

M-Davies commented 4 years ago

@pshipton @dmitripivkine https://github.com/M-Davies/openj9-openjdk-jdk11/commit/27d5060571a8905d1267664f4ac62537b9debd9c

00060000-00062000 r-xp 00000000 fd:01 14295230                           /root/SharedDocker/openj9-openjdk-jdk11/build/linux-s390x-normal-server-release/images/jdk/bin/java
00062000-00063000 r--p 00001000 fd:01 14295230                           /root/SharedDocker/openj9-openjdk-jdk11/build/linux-s390x-normal-server-release/images/jdk/bin/java
00063000-00064000 rw-p 00002000 fd:01 14295230                           /root/SharedDocker/openj9-openjdk-jdk11/build/linux-s390x-normal-server-release/images/jdk/bin/java

Fix for JDK11, please see the test output above. I currently have a build running for JDK8 that should achieve the same result

M-Davies commented 4 years ago

JDK8 fix. https://github.com/M-Davies/openj9-openjdk-jdk8/commit/1fee500193b02437cfa5f35aa8bb3f84b421add4

00060000-00061000 r-xp 00000000 fd:01 15485194                           /root/SharedDocker/openj9-openjdk-jdk8/build/linux-s390x-normal-server-release/images/j2sdk-image/bin/java
00061000-00062000 r--p 00000000 fd:01 15485194                           /root/SharedDocker/openj9-openjdk-jdk8/build/linux-s390x-normal-server-release/images/j2sdk-image/bin/java
00062000-00063000 rw-p 00001000 fd:01 15485194                           /root/SharedDocker/openj9-openjdk-jdk8/build/linux-s390x-normal-server-release/images/j2sdk-image/bin/java

pshipton commented 4 years ago

I added a number of comments to the commit.

M-Davies commented 4 years ago

@pshipton Test successful: https://github.com/M-Davies/openj9-openjdk-jdk8/commit/14cfb367e5cfd3c0a2655cecdc3761e7ee999c22 If you're happy, I can put a PR in now.

M-Davies commented 4 years ago

@smlambert Same question as above ^^

pshipton commented 4 years ago

Please open the PR, we can continue the review there.

pshipton commented 4 years ago

Note that I added a number of new comments on that later commit, M-Davies/openj9-openjdk-jdk8@14cfb36.

dmitripivkine commented 4 years ago

> My concern was that compressedRefsShift was not producing an accurate value. However, running java -verbose:gc -Xmx2040m -Xdump:java:events=vmstop shows that it is. The screenshots were to confirm the point.

I see. This is wrong indeed. The Compressed Refs Shift is chosen based on the position of the most significant bit of the heap top address. However, I cannot reproduce this problem. The calculation of the Compressed Refs Shift is done in https://github.com/eclipse/omr/blob/e74d024550f3ae4472f23a984faa5c76c7e109b9/gc/base/Configuration.cpp#L268 and I don't understand how it could go wrong. I need a reproducible test case for this.

pshipton commented 4 years ago

@dmitripivkine why is a shift of zero wrong for a heap from 0x80030000 to 0xFF830000?

dmitripivkine commented 4 years ago

> @dmitripivkine why is a shift of zero wrong for a heap from 0x80030000 to 0xFF830000?

According to https://github.com/eclipse/openj9/issues/7115#issuecomment-553464888, the shift was set to 1, wasn't it?

pshipton commented 4 years ago

> According to #7115 (comment), the shift was set to 1, wasn't it?

Yes, but that is because the memory was allocated differently by the OS, from 0x82DC0000 to 0x1025C0000. I can duplicate it on my fyre machine, but it doesn't indicate a bug.
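Both observations are consistent with a shift chosen from the heap top address. The sketch below is a simplified model, assuming the shift is the smallest value for which the heap top still fits in 32 + shift address bits. It is not the OMR implementation in Configuration.cpp; the function name `min_compressed_refs_shift` and the `max_shift` bound of 4 are assumptions made for illustration.

```python
def min_compressed_refs_shift(heap_top, max_shift=4):
    """Smallest shift s such that heap_top fits in 32 + s address bits.

    Simplified model for illustration only; max_shift is an assumed bound.
    """
    for shift in range(max_shift + 1):
        if heap_top <= (1 << (32 + shift)):
            return shift
    raise ValueError("heap top is too high for compressed references in this model")


if __name__ == "__main__":
    # Heap tops reported in this thread for the same -Xmx2040m request.
    for heap_top in (0xFF830000, 0x1025C0000):
        print(f"heap top {heap_top:#x}: shift {min_compressed_refs_shift(heap_top)}")
```

Under this model, the heap ending at 0xFF830000 stays below the 4G bar and gets a shift of 0, while the heap ending at 0x1025C0000 crosses it and needs a shift of 1, even though both heaps are the same size.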

dmitripivkine commented 4 years ago

> > According to #7115 (comment), the shift was set to 1, wasn't it?
>
> Yes, but that is because the memory was allocated differently by the OS, from 0x82DC0000 to 0x1025C0000. I can duplicate it on my fyre machine, but it doesn't indicate a bug.

OK, thank you. I obviously misunderstood.

M-Davies commented 4 years ago

@pshipton can you look at the other two requests on JDK11 and 13 too please? :)

pshipton commented 4 years ago

@M-Davies yes, they are on the list. Please create a PR for https://github.com/ibmruntimes/openj9-openjdk-jdk as well.