eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Building OpenJ9 for e500v2 core equipped SoC #2585

Open lmajewski opened 6 years ago

lmajewski commented 6 years ago

Dear All,

I'm trying to build OpenJ9 on the PPC SoC equipped with e500v2 core. This core doesn't have the AltiVec IP block (Instead it uses the SPE extension for floating point calculation).

The problem seems to be with the OpenJ9 assumption that all supported cores support AltiVec instructions. One of the assembly tuned files: ./openj9/runtime/compiler/p/runtime/J9PPCCRC32.spp

This is the __crc32_vpmsum [1] optimized implementation of CRC32 calculation for 16B data blocks.

Is there any C implementation of this function available? Or maybe one for SPE assembler?

Please correct me if I'm wrong, but it seems to me that one would need to:

or

Personally, I would prefer the first option with C, but I'm not sure what would be the performance impact on OpenJ9.

Has anybody tried to run OpenJ9 on e500_v2?

Thanks in advance, Łukasz

fjeremic commented 6 years ago

@gita-omr @ymanton this seems like your area of expertise. Could you help answer OP's questions?

ymanton commented 6 years ago

The problem seems to be with the OpenJ9 assumption that all supported cores support AltiVec instructions.

We only use AltiVec if we detect the processor at runtime and know that it supports AltiVec. The same applies to VSX and various other hardware features. The __crc32_vpmsum routine for example will only be called if we detected that the processor is an IBM POWER8 or later, otherwise we will not use it.

We don't detect the e500 so we will assume we are running on a basic PPC chip that has no support for AltiVec, VSX, crypto instructions, transactional memory, etc. If those sorts of instructions get executed on your chip that's a bug in the JIT that can be fixed.

lmajewski commented 6 years ago

Does it mean that the OpenJ9 shall be compiled on very basic PPC ISA if no supported architecture is detected?

Why I do ask? The guess-platform.sh script checks the system on which we do run. On Linux it seems like the x86_64, ppc64 and ppc64le are supported. The ppc (32 bit as e500_v2) is not supported out of the box.

fjeremic commented 6 years ago

The guess-platform.sh script checks the system on which we do run.

This script just attempts to guess the platform you're compiling OpenJ9 on. The compiler options (gcc or xlC) used when compiling OpenJ9 will target the minimum supported architecture level. I'm not sure what that is on Power, but presumably it is a very old processor.

What @ymanton is talking about is what happens at runtime. At runtime OpenJ9 will detect what processor you are running under and the JIT compiler will generate calls to __crc32_vpmsum for example if we detected you are running on IBM POWER8 or later.
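To illustrate the runtime-selection idea described above, here is a minimal C sketch, not OpenJ9 code. `crc32_portable` is a plain bitwise CRC-32 that needs no vector instructions, and `select_crc32` is a hypothetical helper that falls back to it whenever the processor was not recognised. The real JIT makes this decision when it generates code rather than through a function pointer, so treat the names and structure here as assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-32 (reflected polynomial 0xEDB88320); uses no AltiVec/VSX. */
uint32_t crc32_portable(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            /* When the low bit is set, shift and fold in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

typedef uint32_t (*crc32_fn)(uint32_t crc, const unsigned char *buf, size_t len);

/*
 * Hypothetical selector: fast_path stands in for a vector routine such as
 * __crc32_vpmsum, or is NULL when no such routine is available.
 * cpu_has_vector_crc would come from a runtime processor check.
 */
crc32_fn select_crc32(int cpu_has_vector_crc, crc32_fn fast_path)
{
    return (cpu_has_vector_crc && fast_path != NULL) ? fast_path : crc32_portable;
}
```

On a core that is not detected (for example an e500), a caller would pass `0` for the feature flag and always end up in the portable routine.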

ymanton commented 6 years ago

As @fjeremic said, guess-platform.sh is checking at build-time, not run-time. Since we don't compile OpenJ9 in 32-bit environments there is currently no support for it in the code, but feel free to add it.

If you want to port OpenJ9 to the e500 then most of your work will be in making changes to the build system to work in a 32-bit ppc environment. Once you have a successful build you shouldn't have much trouble running OpenJ9 except for one issue related to using 64-bit instructions -- we assume that ldarx and stdcx are available, which is not true on 32-bit systems so that will need to be fixed.

If you have not already seen issue #2399 please take a look at it, it discusses problems that are very similar to yours.

lmajewski commented 6 years ago

I'm not sure what that is on Power, but presumably it is a very old processor.

No, it is not. This is quite a powerful embedded system: 2 cores, 1.5 GHz, 1 GiB RAM. It just doesn't support AltiVec and has SPE instead.

I will look into the pointed thread. Thanks for reply.

lmajewski commented 6 years ago

I've started the porting.

Why: OpenJ9 claims to be much faster than other JVMs. Goal: To have OpenJ9 build on PPC (e500_v2 core).

For the sake of simplicity I've decided to use the zero variant (to avoid AltiVec issues) and build it in a native environment.

I've followed: https://www.eclipse.org/openj9/oj9_build.html

Side question: Why gcc 4.8 is used (recommended) ? I'm using gcc 6.4.0

After having the source code (and all prerequisites) the configure passes:

./configure --with-freemarker-jar=/lib/freemarker.jar --with-jobs=2 --with-debug-level=fastdebug --without-freetype --without-x --without-cups --without-alsa --disable-headful --with-jvm-variants=zero

====================================================
A new configuration has been successfully created in
/root/openj9-openjdk-jdk8/build/linux-ppc-normal-zero-fastdebug
using configure arguments '--with-freemarker-jar=/lib/freemarker.jar --with-jobs=2 --with-debug-level=fastdebug --without-freetype --without-x --without-cups --without-alsa --disable-headful --with-jvm-variants=zero'.

Configuration summary:

Tools summary:

Build performance summary:

Then I've decided to build it with:

make CONF=linux-ppc-normal-zero-fastdebug LOG=trace JOBS=2 images

The build errors popped up in:

javac: file not found: /root/openj9-openjdk-jdk8/jdk/src/solaris/classes/sun/awt/org/xml/generator/WrapperGenerator.java [1]

This file has been appended to the end of: /root/openj9-openjdk-jdk8/build/linux-ppc-normal-zero-fastdebug/jdk/btclasses/_the.BUILD_TOOLS_batch as part of BUILD_TOOLS generation:

SetupJavaCompilation(BUILD_TOOLS)
[2] SETUP := GENERATE_OLDBYTECODE
[3] SRC := /root/openj9-openjdk-jdk8/jdk/make/src/classes /root/openj9-openjdk-jdk8/jdk/src/solaris/classes/sun/awt/X11/generator
[4] BIN := /root/openj9-openjdk-jdk8/build/linux-ppc-normal-zero-fastdebug/jdk/btclasses
Tools.gmk:38: Running shell command

When I replace /org/xml -> /X11 the file (WrapperGenerator.java) is present. Another strange thing - why is AWT built/needed at all? I've asked ./configure to build headless and without X.

Any idea why it is like that? Maybe some explanation, which could shed some light?

Regarding the debug infrastructure of OpenJ9 build:

Are there any other available?

fjeremic commented 6 years ago

Side question: Why gcc 4.8 is used (recommended) ?

There was work needed to get higher levels working. The JIT specifically made use of a slightly modified CRTP which works on gcc 4.8 but not on 5+ due to spec conformance. We should be able to build now with gcc 7.3 though and will be moving to that compiler level soon. See #1684.

ymanton commented 6 years ago

For the sake of simplicity I've decided to use the zero variant (to avoid AltiVec issues) and build it in a native environment.

I don't know how the zero parts of OpenJDK are built for OpenJ9, but OpenJ9 itself doesn't have a "zero" VM so unfortunately it will be the same as building a non-zero VM and various assembly files and the JIT will have to be built.

When I replace /org/xml -> /X11 the file (WrapperGenerator.java) is present. Another strange thing - why is AWT built/needed at all? I've asked ./configure to build headless and without X.

Any idea why it is like that? Maybe some explanation, which could shed some light?

I don't know if it is a bug in the OpenJDK build system or just the OpenJ9 parts, but the --without-x flag is not respected. I just install all the needed libs and headers and build with the default config. I don't even know why a Solaris Java class is being built on other platforms, but this might also be another bug in the build system.

lmajewski commented 6 years ago

--without-x flag is not respected

Ok, So this is a dead option.

I just install all the needed libs and headers and build with the default config

I assume that you use PPC64? Have you ever tried to cross compile the OpenJ9?

Is there any way to improve the debug output? I do have a hard time to find places where the files (like _the.BUILD_TOOLS_batch) are generated.

Also please correct me if I'm wrong, but it seems to me like the ./configure is already created in the repository (and downloaded). Maybe I do need to regenerate it?

ymanton commented 6 years ago

I assume that you use PPC64? Have you ever tried to cross compile the OpenJ9?

No, OpenJ9 only builds on ppc64le, not ppc64 or ppc (the IBM JDK builds on ppc64 in both 32- and 64-bit modes). I have not tried to cross-compile OpenJ9 myself, but I think we may support that for ARM targets, but I'm not sure.

Is there any way to improve the debug output? I do have a hard time to find places where the files (like _the.BUILD_TOOLS_batch) are generated.

Unfortunately not that I know of, OpenJ9 had to make changes to the OpenJDK build system in order to integrate, but some things are still less than perfect. The only thing I can suggest is that if you're building jdk8 that you set VERBOSE="" in your env for make, which should echo commands so you can better see what's being invoked.

Also please correct me if I'm wrong, but it seems to me like the ./configure is already created in the repository (and downloaded). Maybe I do need to regenerate it?

The version that's checked in should be in sync with configure.ac, but it doesn't hurt to regenerate it. The file you care about is actually common/autoconf/configure, the top-level just calls this one.

lmajewski commented 6 years ago

I have not tried to cross-compile OpenJ9 myself, but I think we may support that for ARM targets, but I'm not sure.

Do you have maybe the build system adjustments to cross-compile the OpenJ9 on ARM? I mean the arm is also not supported (at all), so I could reuse some of its code on ppc port.

ymanton commented 6 years ago

Unfortunately I don't, I haven't spent any time on ARM. @JamesKingdon might have some info on how to get OpenJ9 to cross compile and/or some patches for that on ARM.

lmajewski commented 6 years ago

If I may ask about OMR's tools - namely tracemerge, hookgen, etc.

What is their purpose? In my native build - for example the tracemerge is used during build: ./tracemerge -majorversion 5 -minorversion 1 -root .

Why do we need to merge trace information during build? Moreover, this means that it shall be cross-compiled on the HOST (x86_64| PPC64). Why OpenJ9 needs it?

I've also noticed the OMR_CROSS_CONFIG="yes", which gives tools the possibility to be cross compiled. This might be quite useful, as omr/tools/tracegen/makefile calls: include $(top_srcdir)/tools/toolconfigure.mk

However, it seems to be tuned to PPC64 (-m64).

DanHeidinga commented 6 years ago

OMR and OpenJ9 use a trace engine to record diagnostic info on how the code is executing into a circular buffer on the thread. The descriptions of these trace points need to be converted into binary forms and then merged into a single data file that can be used by the runtime. That's roughly tracemerge.
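As a rough illustration of the "circular buffer" part of that description, the C sketch below records trace point ids into a fixed-size ring that overwrites the oldest entry once it is full. The type and function names, the capacity, and the use of bare integer ids are all assumptions made for this example; they are not taken from the OMR trace engine.

```c
#include <stdint.h>

#define TRACE_CAPACITY 4u  /* hypothetical size, kept tiny for illustration */

typedef struct {
    uint32_t ids[TRACE_CAPACITY]; /* most recent trace point ids */
    uint32_t next;                /* total number of records ever written */
} trace_buffer;

/* Append one record, overwriting the oldest entry once the ring is full. */
static void trace_record(trace_buffer *tb, uint32_t id)
{
    tb->ids[tb->next % TRACE_CAPACITY] = id;
    tb->next++;
}

/* Return the oldest record still held; the caller must have written at least one. */
static uint32_t trace_oldest(const trace_buffer *tb)
{
    uint32_t start = (tb->next > TRACE_CAPACITY) ? tb->next - TRACE_CAPACITY : 0u;
    return tb->ids[start % TRACE_CAPACITY];
}
```

Because only small ids are stored at runtime, something has to map them back to readable descriptions afterwards, which is roughly the role the merged data file plays in the explanation above.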

hookgen is used to generate the appropriate macros for the low overhead pub/sub system used in OMR / OpenJ9 to communicate events across the system.

lmajewski commented 6 years ago

Ok, so those are components which will be used by the running JVM instance and hence shall be either cross-compiled or built natively.

DanHeidinga commented 6 years ago

They're only needed as part of the build and not at runtime.

lmajewski commented 6 years ago

I think that I've misunderstood you in some way.

Are they only used when the OpenJ9 is compiled (so they could be compiled as x86_64)? Or they need to be available on target (and cross compiled as PPC) ?

DanHeidinga commented 6 years ago

Sorry I wasn't clear. Most of the tools - like hookgen & tracemerge - are only used when OpenJ9 is compiled and can be compiled as x86_64.

There is one that depends on right architecture: constgen

If you support DDR (used for debugging jvm crashes), it will also need to run on the right architecture.

lmajewski commented 6 years ago

With current version of openJ9 build system (scripts) the successful configure gives following output:

Build performance tip: ccache gives a tremendous speedup for C++ recompilations. You have ccache installed, but it is a version prior to 3.1.4. Try upgrading.

The problem is that on my system:

/openj9-openjdk-jdk8# ccache -V
ccache version 3.2.5+dirty

Is there any workaround to fix this? Or the ./configure script logic is just wrong and the version is determined in a wrong way?

DanHeidinga commented 6 years ago

@dnakamura Any thoughts on the ccache question?

dnakamura commented 6 years ago

I believe the openjdk code assumes that the version < 3.1.4 if it fails to parse the version. It's been a while since I looked at the relevant code, but I think they fail to parse when they see anything other than digits or decimal points. Will look into it.

dnakamura commented 6 years ago

Ok no my bad. It will handle alphabetic characters in the version string. However to check the version number they are just matching against the regex 3.1.[456789] which means anything > 3.1.9 will fail.
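The effect of that pattern can be reproduced with a small C sketch using POSIX `regcomp`/`regexec`. It assumes the configure check behaves like an unanchored regex search over the version string; the function name `ccache_version_accepted` is made up for this example and the actual configure script may perform the match differently.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `version` matches the pattern quoted above, 0 otherwise
 * (also 0 if the pattern fails to compile). The match is unanchored,
 * which is an assumption about how the configure check behaves.
 */
int ccache_version_accepted(const char *version)
{
    regex_t re;
    int matched;

    if (regcomp(&re, "3.1.[456789]", REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    matched = (regexec(&re, version, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With this check `"3.1.4"` is accepted, while `"3.2.5+dirty"` and `"3.1.10"` are rejected even though they are newer, which matches the misleading "prior to 3.1.4" tip reported above.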

lmajewski commented 6 years ago

If I may ask again the question regarding the gcc 4.8 (which is recommended for this VM native build):

I've backported the gcc 4.8.2 to my setup. Unfortunately during the ./configure execution, it wants to check if gcc is working:

configure:22215: /usr/bin/powerpc-poky-linux-gnuspe-gcc -O2 -pipe -g -feliminate-unused-debug-types -Wno-error=deprecated-declarations -fno-lifetime-dse -fno-delete-null-pointer-checks -m32 -mcpu=8548 -mabi=spe -mspe -mfloat-gprs=double --sysroot=/ -Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed -fPIC conftest.c >&5
powerpc-poky-linux-gnuspe-gcc: error: unrecognized command line option '-fno-lifetime-dse'

The problem is that this particular optimization option is NOT supported in 4.8.[12345]. It first shows up on 4.9 -> e.g. https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/Optimize-Options.html

Why it is like that? Is the '-fno-lifetime-dse' only needed on PPC (as it is possible to compile J9 on x86_64).

From the other reply -> the problem with compiling proper code only shows up on gcc 5+, so I guess that 4.9.x can be used?

ymanton commented 6 years ago

Looks like that issue comes from OpenJDK code, not OpenJ9. If you look here

https://github.com/ibmruntimes/openj9-openjdk-jdk8/blob/2b004fdb6829f287eaa464a57a8680377886ca75/common/autoconf/toolchain.m4#L1425-L1440

you'll see that they're trying to disable that opt under GCC 6, so it should not be used when you build using GCC 4.8. Is your default host compiler GCC 6 or later? Perhaps the configure scripts are invoking that in some places instead of your powerpc-poky-linux-gnuspe-gcc cross compiler and getting confused. You can look in the various config.log files that are generated to see what's going on.

dnakamura commented 6 years ago

You should also note there is a runtime check you need to disable to work on 32 bit (see #2399). Note: in the issue they also discuss issues with 32 bit power missing certain instructions, however I don't think that's an issue for the e500 cores. However you may run into other issues where bits of our code assume we are running on a 64-bit chip.

shingarov commented 6 years ago

Do you have maybe the build system adjustments to cross-compile the OpenJ9 on ARM? I mean the arm is also not supported (at all), so I could reuse some of its code on ppc port.

I recently followed James' instructions and successfully cross-compiled from Ubuntu/AMD64 to the RPi and the resulting VM works fine. Caveat: you may want to read the recent conversation on Slack about back-contributing directly to the master repo, not via James' fork.

I am also actively trying to cross-compile to the e500. I am approaching it differently though, I am trying to start from (pieces of) the OMR testcompiler which kind of looks more within reach. What I understood however is that its build system is quite disconnected from the other two i.e. from both TR's and J9's. And I have a feeling that it's less actively being looked at, as while the other parts cross-compile just fine, I had to dance around things to get the tc/tril/etc to cross-compile to ARM. I'll keep you posted on the progress with tc/tril on e500.

lmajewski commented 6 years ago

Thanks Boris for your input.

I recently followed James' instructions and successfully cross-compiled from Ubuntu/AMD64 to the RPi and the resulting VM works fine.

I've looked on your Github repositories and I couldn't find the ARM port for J9. Would it be possible to upload it somewhere?

Slack about back-contributing directly to the master repo, not via James' fork.

Do you have any reference/logs to those conversations?

I had to dance around things to get the tc/tril/etc to cross-compile to ARM.

Could you share the steps (or repository), which were needed on ARM to get it working?

I'll keep you posted on the progress with tc/tril on e500.

Thanks.

lmajewski commented 6 years ago

I've moved a bit further with native compilation. The gcc 4.8.2 compiles the 'images/j2re-image/bin/java' binary. However, I do experience the "Illegal instruction" aborts. One was caused by 'lwsync' not being available on e500(_v2) ISA (https://www.nxp.com/docs/en/reference-manual/E500CORERM.pdf).

This issue has been fixed by replacing 'lwsync' calls with 'sync' - mostly in OMR code generator (e500 supports msync, which probably shall be used - this will be fixed when it all starts working).

Now, I do have problem with 'cmpl' as being "Illegal Instruction":

0x0fb9808c in loop () from /mnt/openj9-openjdk-jdk8/build/linux-ppc-normal-zero-release/images/j2re-image/lib/ppc/default/libj9vm29.so
(gdb) x/20i $pc-32
   0xfb9806c <J9CAS8Helper+4>: ori r8,r4,0
   0xfb98070 <loop>: lwarx r9,0,r12
   0xfb98074 <loop+4>: rotlwi r3,r9,0
   0xfb98078 <loop+8>: ori r4,r9,0
   0xfb9807c <loop+12>: ori r10,r8,0
   0xfb98080 <loop+16>: ori r11,r6,0
   0xfb98084 <loop+20>: rlwimi r10,r5,0,0,0
   0xfb98088 <loop+24>: rlwimi r11,r7,0,0,0
=> 0xfb9808c <loop+28>: cmpl cr0,1,r9,r10
   0xfb98090 <loop+32>: bne- 0xfb980a0
   0xfb98094 <loop+36>: stwcx. r11,0,r12
   0xfb98098 <loop+40>: bne+ 0xfb98070
   0xfb9809c <loop+44>: blr
   0xfb980a0 <fail>: stwcx. r9,0,r12
   0xfb980a4 <fail+4>: bne+ 0xfb98070

This is strange as 'cmpl' is supported in the e500 ISA. Some other threads point out that one should check the cache line size (32B for e500) - though it would be strange, as OpenJDK8 is working on this machine with the same setup (and its 'zero' variant is used for J9 compilation).

ymanton commented 6 years ago

That's the 64-bit version of cmpl you're crashing on. You can change it to cmpl cr0,0,r9,r10 for the 32-bit instruction to go along with your changes for ldarx and stdcx., however it's incorrect and will probably give you bad results and even more mysterious crashes. We really need to exchange a full 64-bit value atomically here.

https://github.com/eclipse/omr/pull/2930 and https://github.com/eclipse/openj9/pull/2764 are patches I started putting together last week to get ppc32 working, you can give them a try instead. You can export VMDEBUG="-DOMR_NO_64BIT_LCE" in your build env to build the VM with these changes.

wyatt8740 commented 6 years ago

@ymanton This is great; thanks for sharing it. By the way, what's the copyright check problem with #2764? (I'm not used to Jenkins.) Is it missing a copyright header in one of the files?

I'll try it on my PowerBook sometime later today, time allowing (assuming the G4's ISA supports everything that your system does).

shingarov commented 6 years ago

I've looked on your Github repositories and I couldn't find the ARM port for J9.

That's the "arm" branch in JamesKingdon's repo, and he also has nice instructions here

Do you have any reference/logs to those conversations?

It's in the #general channel on Aug 9th.

Could you share the steps (or repository), which were needed on ARM to get it working?

The JVM works as-is out of James' branch. What doesn't work are the simple tests in testcompiler and tril. I care about those because the goal of these exercises for me is a riscv port and the JVM is definitely the wrong level of complexity for approaching that. So I got them to work and am trying to make the change into a nice branch that can be pulled into master but after a day of cursing I am starting to think that maybe doing it in one go is too ambitious and maybe it warrants having a little conversation in today's community call.

ymanton commented 6 years ago

@wyatt8740 there are only patches to get rid of the 64-bit CAS in that tree unfortunately. You still need to make changes to the build files in OMR, OpenJ9, and OpenJDK to get a VM built, but since @lmajewski and/or @shingarov are making progress on that I decided to tackle other things and built/ran the IBM JDK to test them. I'll see if I can find the lwsyncs @lmajewski mentioned above as well when time permits.

lmajewski commented 6 years ago

@ymanton After applying your patches I do see following error:

unix/linux/ppc/32/cas8help.s: Assembler messages:
unix/linux/ppc/32/cas8help.s:74: Error: unrecognized opcode: `rldimi'
unix/linux/ppc/32/cas8help.s:75: Error: unrecognized opcode: `rldimi'
unix/linux/ppc/32/cas8help.s:77: Error: unrecognized opcode: `ldarx'
unix/linux/ppc/32/cas8help.s:78: Error: unrecognized opcode: `cmpld'
unix/linux/ppc/32/cas8help.s:80: Error: unrecognized opcode: `stdcx.'
unix/linux/ppc/32/cas8help.s:84: Error: unrecognized opcode: `srdi'

For example the 'rldimi' in Table 3-44: https://www.nxp.com/docs/en/reference-manual/E500CORERM.pdf Is marked as PowerPC AIM specific, not available on e500.

lmajewski commented 6 years ago

I've uploaded my branches for PPC32 e500 to github:

https://github.com/lmajewski/ppc32_j9_omr
https://github.com/lmajewski/ppc32_j9_openj9
https://github.com/lmajewski/ppc32_j9_openj9-openjdk-jdk8

There are lukma_* files to configure it and execute - those were the same as zero variant of OpenJDK8 (which seems to work on the platform).

For those who want to build it natively - there is a qemu-system-ppc port. One can use -M ppce500 or -M mpc8544.

However up to 2 cores are supported and max memory of 512 MiB (with more RAM and cores some strange errors emerge).

qemu-system-ppc -M ppce500 -m 512M -nographic -d guest_errors \
    ./arch/powerpc/boot/uImage \
    -drive file=core-image-qoriq-qoriq-20180626070914.rootfs.ext2,if=virtio \
    -append "root=/dev/vda rw rootwait rootfs=ext2"

It works with Linux 4.18 kernel - but IS extremely slow to compile.

ymanton commented 6 years ago

Thanks for testing. It makes sense that your assembler doesn't want to deal with unsupported instructions. I was on ppc64 so I didn't see these errors. I'll fix it shortly.

lmajewski commented 6 years ago

@ymanton As side question - is there any way to test only OMR (or other separate J9 component)?

I mean it is very time consuming to build it. The J9 makefile has make , but the smallest instance is "jvm".

shingarov commented 6 years ago

any way to test only OMR

That's what the discussion on yesterday's call was about. Basically, the set of native makefiles is screwed up, they are wrong, duplicated in several places, and some of them have been neglected for a while. In the call Mark made the point that they aren't worth fixing because they will be deprecated in favour of CMake which isn't quite there yet on any platform. So in the meantime I propose that we simply push temporary branches to exchange kludges to keep going -- I'll prepare and push mine when I come back from ESUG.

lmajewski commented 6 years ago

@shingarov Yes, I also think that the goal is to make the OMR (and the whole J9) working correctly first and only then cleanup things.

ymanton commented 6 years ago

@ymanton As side question - is there any way to test only OMR (or other separate J9 component)? I mean it is very time consuming to build it. The J9 makefile has make , but the smallest instance is "jvm".

If building from the top is too painful during development you can try this shortcut for rebuilding just the VM binaries:

VERSION_MAJOR=8 \
OPENJDK_VERSION_NUMBER_FOUR_POSITIONS=8.0.0.0 \
make -C build/<your-configured-build>/vm/

(You may need to specify some additional vars in your env, I haven't used this shortcut in a while and can't check ppc at the moment.)

It will only rebuild the VM components in that directory, but it will not compose the image. Once you're done making changes you can build one of the top level targets to get an image you can run. The makefile in the vm directory also has specific targets that you can build if you want more granularity, e.g. omr_ddrmacros omrsig j9omrport but there may be ordering dependencies with the other targets in that makefile so I've always just let it build all.

lmajewski commented 6 years ago

@ymanton I've poked a bit into e500 Reference manual: https://www.nxp.com/docs/en/reference-manual/E500CORERM.pdf

In the point "A.1.1.6 Compare and Swap" it is stated that for this core lwarx and stwcx. only work on word size data (32bits). To me It seems that "simple" replacement of instructions will not provide proper atomic operation of "compare and swap of a 64-bit value on a 32-bit system" J9CAS8Helper function:

uint64_t J9CAS8Helper(volatile uint64_t *addr, uint32_t compareLo, uint32_t compareHi, uint32_t swapLo, uint32_t swapHi);

IMHO we would need to use lwarx/stwcx. instructions to read and operate separately on compare{Hi|Lo} and swap{Hi|Lo}. After the successful CAS operation we would need to repeat it and compare the results (a bit different problem described in [1]).

Have I overlooked something? Or is there any other/better solution for this? (I'm wondering how IBM's original J9 implementation handled this for 32bit PPC :-) )

[1] - https://stackoverflow.com/questions/45054323/powerpc-e500-p1020-read-64bit-2x32bit-registers-in-atomic-way

ymanton commented 6 years ago

Yes you are correct, simply replacing ldarx/stdcx with lwarx/stwcx will not work. IBM's original implementation is the one you see currently, we really use J9CAS8Helper and ldarx/stdcx. :grin: The IBM JDK only supports 64-bit CPUs, even in 32-bit mode, since 64-bit instructions and registers can still be used in 32-bit mode.* The IBM JDK is mostly used on IBM POWER systems (we support POWER4 and later), but we've also used and tested it on chips like the PPC970 and the e5500, all of which are 64-bit capable. I don't know if we supported true 32-bit chips in the distant past, but you and @wyatt8740 are the first I've seen ask about it.

Anyway, the patches I pointed you to should solve the problem, they use lwarx/stwcx and a dedicated 32-bit lock word to synchronize on and will not call J9CAS8Helper. See here:

https://github.com/eclipse/omr/blob/e3972b55a2235a3e04b90083523e2242ff38e4aa/include_core/AtomicSupport.hpp#L414-L430
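The general shape of the lock-word approach described above can be sketched in C as follows. This is an assumption-laden illustration, not the code from the linked OMR patch: `cas8_lock_word` and `locked_cas8` are made-up names, the lock is a simple spin on GCC `__atomic` builtins for a 32-bit word, and the scheme is only correct if every access to such 64-bit values goes through the same lock.

```c
#include <stdint.h>

/* Hypothetical global 32-bit lock word: 0 = free, 1 = held. */
static volatile uint32_t cas8_lock_word = 0;

/*
 * Compare-and-swap of a 64-bit value guarded by a 32-bit lock word.
 * Returns the value observed at *addr; the swap happened if that
 * equals oldValue.
 */
static uint64_t locked_cas8(volatile uint64_t *addr, uint64_t oldValue, uint64_t newValue)
{
    uint64_t seen;

    /* Acquire: spin until the previous value of the lock word was 0. */
    while (__atomic_exchange_n(&cas8_lock_word, 1u, __ATOMIC_ACQUIRE) != 0u) {
        /* busy-wait */
    }

    seen = *addr;
    if (seen == oldValue) {
        *addr = newValue;
    }

    /* Release the lock so other threads can proceed. */
    __atomic_store_n(&cas8_lock_word, 0u, __ATOMIC_RELEASE);
    return seen;
}
```

The 64-bit load and store inside the critical section may be split into two 32-bit accesses by the compiler; that is acceptable here only because the lock word, which fits what lwarx/stwcx. can handle, serialises all participants.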

However you cannot even build OpenJ9 because your assembler will not tolerate unsupported instructions. Can you try building with -mppc64 here:

https://github.com/lmajewski/ppc32_j9_omr/blob/58a9411ebae7980c9d2cd4dbf2dcbd5e3707bda9/omrmakefiles/rules.linux.mk#L113

We will not execute 64-bit assembly routines in 32-bit mode except J9CAS8Helper (and with my patches even it will not be executed) so it should be OK to have them in your VM, we can work on excluding them from the build later. I started a build in QEMU using Debian 8 and it has finished building the VM parts so it seems to be accepted by my assembler, but like you said the parts of the build that invoke Java are incredibly slow so it has still been building for the last 12 hours. Unfortunately the #ifdefs in my original patch are broken and export VMDEBUG="-DOMR_NO_64BIT_LCE" will not work as expected in OMR. I'll fix that shortly, but can you simply replace OMR_NO_64BIT_LCE with 1 for now? You can do something like sed -i s/OMR_NO_64BIT_LCE/1/g ....

IMHO we would need to use lwarx/stwcx. instructions to read and operate separately on compare{Hi|Lo} and swap{Hi|Lo}. After the successful CAS operation we would need to repeat it and compare the results (a bit different problem described in [1]).

[1] - https://stackoverflow.com/questions/45054323/powerpc-e500-p1020-read-64bit-2x32bit-registers-in-atomic-way

Unfortunately I don't think a trick like the one for reading the time base will work. Reading the time base does not have to be atomic, and no software threads will ever race to write to it, so it is an easier problem to solve. Since the time base is monotonically increasing and the high 32 bits will not change within the span of a few instructions you can always detect when you have mismatching hi/lo parts. With arbitrary 64 bit values however you cannot swap hi/lo and detect if your hi/lo gets mixed up with the lo/hi of another thread atomically, there will be a quantum of time where the hi/lo of two threads can be in memory and it will probably lead to rare but painful bugs. Perhaps it could be done with nested lwarx/stwcx, but the architecture does not allow that. You can try to come up with an algorithm if you wish, I would be happy to look at it because it might be more convenient than my solution, but I'm not hopeful it can be done.

* In practice ldarx/stdcx are the only "safe" 64-bit instructions that can be used in 32-bit mode because kernels may not preserve the upper 32 bits of registers in 32-bit mode, but proper kernels will also force l*arx/st*cx to fail on any interrupt, so you can use them and also place other 64-bit instructions between them and be protected from context switches.

lmajewski commented 6 years ago

Anyway, the patches I pointed you to should solve the problem, they use lwarx/stwcx and a dedicated 32-bit lock word to synchronize on and will not call J9CAS8Helper. See here:

It seems like I wrongly defined OMR_ARCH_POWER in my build so the OMRCAS8Helper() (or J9CAS8Helper) is called.

From the code: https://github.com/eclipse/omr/blob/e3972b55a2235a3e04b90083523e2242ff38e4aa/include_core/AtomicSupport.hpp#L414-L430

This solution seems to be reusing the already available 32 bit functions. I will give it a try.

However, on line 436: return __sync_val_compare_and_swap(address, oldValue, newValue); This is a gcc 4.2 built-in function. Unfortunately, it has been replaced in 4.8.2 with the __atomic* versions. The code to implement J9CAS8Helper with built-ins:

static inline uint64_t
J9CAS8Helper(volatile uint64_t *addr, uint32_t compareLo, uint32_t compareHi, uint32_t swapLo, uint32_t swapHi)
{
    uint64_t exp = (((uint64_t)compareHi) << 32) | compareLo;
    uint64_t des = (((uint64_t)swapHi) << 32) | swapLo;
    uint64_t val;
    bool ret;

    do {
        __atomic_load (addr, &val, __ATOMIC_SEQ_CST);
        ret = __atomic_compare_exchange (addr, &exp, &des, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    } while (!ret);

    return val;
}

The problem above is that __atomic_compare_exchange is not "returning" the old *addr value (it does it only on failure - this is the difference from __sync_val_compare_and_swap available on gcc 4.2). There may be a race between __atomic_load() and __atomic_compare_exchange(). Moreover, one needs to add -latomic switch to gcc.
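One way around the separate load, sketched here under assumptions rather than taken from OMR or OpenJ9, relies on the documented behaviour of `__atomic_compare_exchange_n`: on failure it writes the value it observed into the `expected` variable, and on success the old value is by definition equal to `expected`. Returning `expected` therefore yields the previous contents in both cases, without a loop. The name `cas8_builtin_sketch` is made up for this example, and on 32-bit PowerPC the call would still need libatomic as noted above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * 64-bit compare-and-swap built on the GCC __atomic builtins.
 * Returns the value that was in *addr before the operation, whether or
 * not the swap took place (the swap happened if the result equals the
 * compare value).
 */
static inline uint64_t
cas8_builtin_sketch(volatile uint64_t *addr, uint32_t compareLo, uint32_t compareHi,
                    uint32_t swapLo, uint32_t swapHi)
{
    uint64_t expected = (((uint64_t)compareHi) << 32) | compareLo;
    uint64_t desired = (((uint64_t)swapHi) << 32) | swapLo;

    /* On failure the builtin stores the observed value into `expected`. */
    __atomic_compare_exchange_n(addr, &expected, desired, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

This gives single-attempt semantics similar to `__sync_val_compare_and_swap`; whether the result is truly atomic on a given 32-bit target depends on how libatomic implements 8-byte operations there.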

However, I will try your patches again - so I could avoid adding -mppc64 where possible.

lmajewski commented 6 years ago

It took me some time, but I've managed to compile natively the J9 for e500_v2:

root@qoriq:/mnt/openj9-openjdk-jdk8# ./build/linux-ppc-normal-zero-release/images/j2re-image/bin/java -version
openjdk version "1.8.0_181-internal"
OpenJDK Runtime Environment (build 1.8.0_181-internal-b14)
Eclipse OpenJ9 VM (build openj9-ppc32-fixes-c5a6251, JRE 1.8.0 Linux ppc-32-Bit 20180912_000000 (JIT enabled, AOT enabled)
OpenJ9 - c5a6251
OMR - 59e927e
JCL - c186542

However, I do need to run some validation tests for it. Any recommendations (besides compiling some Java code and checking that it does not crash)?

Please find the updated repositories:
https://github.com/lmajewski/ppc32_j9_omr
https://github.com/lmajewski/ppc32_j9_openj9
https://github.com/lmajewski/ppc32_j9_openj9-openjdk-jdk8

The trick was to properly use patches from @ymanton :-)

As mentioned above - I had to export VMDEBUG="-DOMR_NO_64BIT_LCE" and also rebuild some files manually with -mppc64 (e.g. CAS8 helper). More info is in the top directory lukma_* files (as I've been using OpenJDK8 zero for compilation).

There is, however, room for improvement: I've blindly replaced lwsync with sync, which is painfully slow. The e500 does support 'msync', which should probably be used instead.
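Rather than hard-coding lwsync, sync, or msync, one portable option is to let the compiler pick the barrier instruction for the selected -mcpu. A minimal sketch, assuming GCC's __atomic built-ins are available and that the call sites only need release/acquire ordering (both are assumptions; the names below are hypothetical and this is not how OMR currently does it):

```c
#include <stdint.h>

/* Hypothetical shared state used only for this illustration. */
static uint32_t payload_sketch;
static uint32_t ready_sketch;

/* Writer: store the data, then a release fence, then raise the flag.
 * The compiler chooses the barrier instruction for the target CPU. */
static inline void
publish_sketch(uint32_t value)
{
    payload_sketch = value;
    __atomic_thread_fence(__ATOMIC_RELEASE);
    __atomic_store_n(&ready_sketch, 1u, __ATOMIC_RELAXED);
}

/* Reader: returns 1 and fills *out once the flag is visible, else 0. */
static inline int
try_consume_sketch(uint32_t *out)
{
    if (__atomic_load_n(&ready_sketch, __ATOMIC_RELAXED) == 0u) {
        return 0;
    }
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
    *out = payload_sketch;
    return 1;
}
```

Whether this is faster than a plain sync on the e500 would have to be measured; the sketch only removes the need to name the instruction by hand.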

Moreover, during compilation, I saw some warnings regarding 32 bit shifts, which doesn't look good on a 32 bit machine.

JamesKingdon commented 6 years ago

There are a couple of 32 shift out of range warnings that happen during 32 bit builds, which I'm assured are on a code path that doesn't execute on 32 bit platforms. I'd be happier if we cleaned those up :)
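For context, this kind of warning typically appears when a pointer-sized value is shifted by 32 in code that is also compiled for 32 bit targets. A minimal sketch of the pattern and one common way to silence it, using a hypothetical helper name (the real warning sites in OpenJ9/OMR may look different):

```c
#include <stdint.h>

/* Hypothetical helper: upper 32 bits of a pointer-sized value.
 * Writing "value >> 32" directly triggers a shift-count warning when
 * uintptr_t is 32 bits wide, and the shift is undefined there.
 * Widening to uint64_t first keeps the shift well defined on both
 * 32 bit and 64 bit builds. */
static inline uint32_t
high_word_sketch(uintptr_t value)
{
    return (uint32_t)(((uint64_t)value) >> 32);
}
```

On a 32 bit build the widened shift simply yields 0, so the result stays harmless even if the path is never executed.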

ymanton commented 6 years ago

Good to hear that you got it built. If you find that OMRCAS8Helper is still being called it's likely because some files from OMR are built without VMDEBUG included in the command line so the #ifdef guarded code is removed, but you can hack around that.

lmajewski commented 6 years ago

@ymanton Any hints on how to validate the built J9?

PTamis commented 6 years ago

I guess that running https://www.spec.org/jvm2008/ would be a really good test sample. It has lots of tests that can be performed.

ymanton commented 6 years ago

@lmajewski If you want to run the OpenJ9 regression tests you can try these instructions:

export JAVA_BIN=/path/to/build/images/j2sdk-image/jre/bin
export SPEC=linux_ppc
export JAVA_VERSION=SE80

cd openj9/test/TestConfig
make -f run_configure.mk
make test

Building DDR_Test failed for me so I just disabled it via:

--- a/test/functional/build.xml
+++ b/test/functional/build.xml
@@ -63,6 +63,7 @@
                                        <fileset dir="." includes="*/build.xml" >
                                                <exclude name="Panama/build.xml" />
                                                <exclude name="Valhalla/build.xml" />
+                                               <exclude name="DDR_Test/build.xml" />
                                        </fileset>
                                </subant>
                        </else>

Documentation on testing is here if you want to know more: https://github.com/eclipse/openj9/blob/master/test/docs/OpenJ9TestUserGuide.md

I've managed to get a ppc32 JVM built with -m32 -mcpu=G4 on a ppc64 server and tested it in QEMU on a Debian 8 + G4 combination. I ran a simple program with the JIT disabled and it worked. With the JIT enabled it crashed while the JIT was compiling a method. That's probably a bug in the JIT, so I'll look at that shortly. The regression tests are currently running on ppc64, so they won't catch any illegal instructions on real 32-bit chips, but they should test functionality.

I've updated https://github.com/eclipse/omr/pull/2930 and https://github.com/eclipse/openj9/pull/2764 and the patches I used for openj9-openjdk-jdk8 are here: https://github.com/ibmruntimes/openj9-openjdk-jdk8/pull/113. Still has lots of rough edges but I'll do a bit more later, feel free to use what's there in your efforts.

PTamis commented 6 years ago

I also compiled openj9 natively on an e500v2 core with the instructions given. The compile finished OK, but when I try to run java -version I get the following crash in the JIT library.

bt
#0 0x0f1a0d60 in OMR::CodeGenerator::addAllocatedRegisterPair(TR::RegisterPair*) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#1 0x0f1a1284 in OMR::CodeGenerator::allocateRegisterPair(TR::Register, TR::Register) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#2 0x0f0505dc in TR::PPCPrivateLinkage::buildDirectDispatch(TR::Node*) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#3 0x0f034f10 in J9::Power::TreeEvaluator::directCallEvaluator(TR::Node, TR::CodeGenerator) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#4 0x0f198ca8 in OMR::CodeGenerator::evaluate(TR::Node*) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#5 0x0f57cbe8 in OMR::Power::TreeEvaluator::treetopEvaluator(TR::Node, TR::CodeGenerator) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#6 0x0f198ca8 in OMR::CodeGenerator::evaluate(TR::Node*) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#7 0x0eea5570 in J9::CodeGenerator::doInstructionSelection() () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#8 0x0f1a9a68 in OMR::CodeGenPhase::performInstructionSelectionPhase(TR::CodeGenerator, TR::CodeGenPhase) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#9 0x0f1a570c in OMR::CodeGenPhase::performAll() () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#10 0x0f1a33d4 in OMR::CodeGenerator::generateCode() () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#11 0x0f1c2498 in OMR::Compilation::compile() () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#12 0x0eed8fc0 in TR::CompilationInfoPerThreadBase::compile(J9VMThread, TR::Compilation, TR_ResolvedMethod, TR_J9VMBase&, TR_OptimizationPlan, TR::SegmentAllocator const&) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#13 0x0eed9e00 in TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary, void) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so
#14 0x0fa0a4c8 in omrsig_protect () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9prt29.so
#15 0x0eedb440 in TR::CompilationInfoPerThreadBase::compile(J9VMThread, TR_MethodToBeCompiled, J9::J9SegmentProvider&) () from /mnt/persistent/tamis-openj9/openj9-openjdk-jdk8-lucasz/build/linux-ppc-normal-zero-fastdebug/images/j2re-image/lib/ppc/default/libj9jit29.so

@ymanton is the stack the same as yours? Also, if I disable the JIT with -Xnojit, ./java -version never returns :(