ixy-languages / ixy.java

The Ixy user space network driver ported to Java 12
GNU General Public License v2.0
20 stars 3 forks source link

Unsafe instances should reside in `static final` fields #5

Open shipilev opened 4 years ago

shipilev commented 4 years ago

This is better captured by the microbenchmark. Unsafe performance relies heavily on the JIT compiler being able to nuke down everything to the raw memory accesses.

Part of that story is null-checking the Unsafe instance itself. When compiler is not sure about it (and it does not right now, as the field is just final right now in UnsafeMemoryManager), it would emit null-checks and then barrier coalescing would break.

Copy-pasting the results from the microbenchmark test above:

Benchmark                         Mode  Cnt   Score   Error  Units
UnsafeProblem1.test               avgt    9  19.489 ± 0.288  ns/op
UnsafeProblem1.test_final         avgt    9  19.397 ± 0.189  ns/op <--- this is what ixy.java does now
UnsafeProblem1.test_static        avgt    9  19.492 ± 0.118  ns/op
UnsafeProblem1.test_static_final  avgt    9   6.355 ± 0.036  ns/op <--- this is what it should be doing

This seems to be easy to do:

diff --git a/ixy/src/main/java/de/tum/in/net/ixy/memory/UnsafeMemoryManager.java b/ixy/src/main/java/de/tum/in/net/ixy/memory/UnsafeMemoryManager.java
index b339d88..deab3bb 100644
--- a/ixy/src/main/java/de/tum/in/net/ixy/memory/UnsafeMemoryManager.java
+++ b/ixy/src/main/java/de/tum/in/net/ixy/memory/UnsafeMemoryManager.java
@@ -70,13 +70,13 @@ class UnsafeMemoryManager implements MemoryManager {
        @EqualsAndHashCode.Include
        @ToString.Include(name = "unsafe", rank = 1)
        @SuppressWarnings({"UseOfSunClasses", "PackageVisibleField"})
-       final @Nullable Unsafe unsafe;
+       static final @Nullable Unsafe unsafe = initUnsafe();

        ////////////////////////////////////////////////// MEMBER METHODS //////////////////////////////////////////////////

-       /** Package-private constructor that sets the {@link #unsafe} field. */
+       /** Private method to initialize the {@link #unsafe} instance. */
        @SuppressWarnings({"NestedTryStatement", "UseOfSunClasses"})
-       UnsafeMemoryManager() {
+       static Unsafe initUnsafe() {
                Unsafe tmp = null;
                try {
                        if (DEBUG >= LOG_TRACE) log.trace("Getting declared field 'theUnsafe' from 'sun.misc.Unsafe'.");
@@ -85,7 +85,7 @@ class UnsafeMemoryManager implements MemoryManager {
                        theUnsafeField.setAccessible(true);
                        try {
                                if (DEBUG >= LOG_TRACE) log.trace("Getting the 'sun.misc.Unsafe' instance from the declared field.");
-                               tmp = (Unsafe) theUnsafeField.get(null);
+                               return (Unsafe) theUnsafeField.get(null);
                        } catch (final IllegalArgumentException | IllegalAccessException e) {
                                if (DEBUG >= LOG_ERROR) log.error("Error getting field value.", e);
                        } finally {
@@ -94,9 +94,8 @@ class UnsafeMemoryManager implements MemoryManager {
                        }
                } catch (final NoSuchFieldException | SecurityException e) {
                        if (DEBUG >= LOG_ERROR) log.error("Error getting declared field.", e);
-               } finally {
-                       unsafe = tmp;
                }
+               return null;
        }

        /////////////////////////////////////////////// UNSUPPORTED METHODS ////////////////////////////////////////////////

(Unfortunately, I don't have the hardware to test it... yet).

emmericp commented 4 years ago

Interesting, I'd never have thought of that, I guess I'm too used to optimize code for tracing JITs (LuaJIT in particular).

Changing the field to static indeed boosts performance by 3% when applied on top of https://github.com/ixy-languages/ixy.java/issues/6

Thanks! I'd love to update the performance graphs, but we'll figure out why it doesn't work on the reference system with the 3.3 GHz CPU used by all the other tests first :/

shipilev commented 4 years ago

Right. I would probably apply this much simpler fix first -- if there are consecutive Unsafe acceses, just static final-ing the Unsafe holder field might give a significant boost, even without fixing #6. That is, it might be even more than 3% that way around.