eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.27k stars 720 forks source link

Remove old transactional memory support in JIT Power codegen #13324

Open ymanton opened 3 years ago

ymanton commented 3 years ago

Transactional memory is no longer supported on IBM Power processors (which were the only PPC processors to ever support it) as of P10. TM has also been disabled by default in OpenJ9 for a long time and some of the functionality relied on classes that don't/won't exist in OpenJ9 so this code is effectively dead and untested and should be removed unless there's good reason not to.

See below for context:

This looks like the right thing to do, I'm just wondering how the existing code survived this long without it. There's clearly internal control flow going on in this routine without even considering what VMnonNullSrcWrtBarCardCheckEvaluator does.

Putting on my technical debt reduction hat for a minute, I think it's because this code only runs when transactional memory is available on the processor, enabled in the JVM/JIT, and then only with a specific version of the ConcurrentLinkedQueue class (that has extra methods related to TM) that doesn't exist in OpenJ9. Long story short I think this code is dead and we should delete it. @fjeremic is also working on changes to how we do register dependencies to free us from having to count how many we deps we need so hopefully we can clean all of this up one day soon.

_Originally posted by @ymanton in https://github.com/eclipse-openj9/openj9/pull/13307#discussion_r687924452_

0xdaryl commented 2 years ago

Intel is in the process of deprecating their TSX support as well due to memory ordering problems causing issues. The feature has been removed in some Ice Lake processors and Intel microcode updates are beginning to appear to disable TSX function and force-abort RTM transactions on older processors (e.g., back to Skylake) [1][2]. This doesn't affect AMD as it never supported it.

I propose removing TSX support in the x86 backend as a result (no customer impact is expected). We never realized any known benefit from it.

However, I wonder if this issue should be generalized to remove transactional memory support altogether from the JIT. Does Z realize any benefit with TM that we know of? @joransiu

FYI @zl-wang @mstoodle @vijaysun-omr

[1] https://cdrdv2.intel.com/v1/dl/getContent/604224 [2] https://www.phoronix.com/scan.php?page=news_item&px=Intel-TSX-Off-New-Microcode

fjeremic commented 2 years ago

However, I wonder if this issue should be generalized to remove transactional memory support altogether from the JIT. Does Z realize any benefit with TM that we know of? @joransiu

Personally I think we should remove it. We do use TM though to accelerate ConcurrentLinkedQueue but only at a recognized method level. I have yet to see a tbegin / tend IL using in the wild. It seems prohibitively limited (in what IL can be part of a TM sequence) that we can almost never find a sequence of trees to actually perform TM reductions on.

joransiu commented 2 years ago

Somewhat timely discussion. We collected a set of Liberty DayTrader results specifically to measure with and without TX support on z/OS. From profiles, we do spent 1% of execution time within Liberty inside TX transactions. However, the throughput, unsurprisingly, was pretty within 1% of each other -- so par performance in practical terms.

The TX support does come into play for smaller multi-threaded applications and benchmarks. A recent example where it showed up was from crypto.perf, where we do see TX usage in both the bumblebench related methods (I believe doBatch()`, as well as, in some of the crypto implementations (i.e. AESCipher.engineUpdate).

I do wonder if we have other key application scenarios (i.e. some of the contention observed in class loader) whether TX may help or not. It's one of those technologies where generally it may not help, but in the right situation, it can buy us much better scalability.

0xdaryl commented 2 years ago

I agree the potential is (and has always been) there, but I'm not aware of a benefit running a real workload. It may be useful to survey the landscape as Joran suggests (I know, for example, there have been thoughts around using TM to help with contended lock scenarios).

However, given the direction of the other hardware vendors it would seem that TM exploitation will be a Z-only feature for the immediate future.

joransiu commented 2 years ago

A statement of direction was given on IBM Z:

Removal of support of the transactional execution and constrained transactional execution facility: In a future IBM Z hardware system family, the transactional execution and constrained transactional execution facility will no longer be supported. Users of the facility on current servers should always check the facility indications before use.