twitter / hadoop-lzo

Refactored version of code.google.com/hadoop-gpl-compression for hadoop 0.20
GNU General Public License v3.0
546 stars 329 forks source link

Remove LZO global lock "hack" that was for an old JVM bug #128

Open skwslide opened 6 years ago

skwslide commented 6 years ago

LzoCompressor and LzoDecompressor have a global lock protecting all GetDirectBufferAddress calls. This PR removes the global lock.

Apparently, the global lock's origin is HADOOP-3604, to work around JDK-6852404, but which has been fixed for several years in all Java versions.

The global lock imposes a severe performance penalty in a highly multithreaded environment. I have a Java process with 64 threads on a 64-core machine to decompress LZO files concurrently, and around half of the threads are blocked on the global lock at any moment. The problem is not observed in a 16-way concurrency setup.

sjlee commented 6 years ago

Thanks for reporting the issue and also providing the fix. It's a good find, and the fix LGTM.

I am no longer at Twitter, and ideally a Twitter engineer should also approve this and merge it.

cc @jrottinghuis

jrottinghuis commented 6 years ago

Looks reasonable indeed. Builds locally. I'll try to test this on a cluster and see if I can tell any performance difference, report back and merge (assuming positive results).

CLAassistant commented 5 years ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.