enso-org / enso

Hybrid visual and textual functional programming.
https://enso.org
Apache License 2.0
7.34k stars 321 forks source link

Eliminate `scala.Some` 10% overhead in Enso `IR` by inlining objects together #10973

Open JaroslavTulach opened 1 week ago

JaroslavTulach commented 1 week ago

IR is a deep tree structure that is generated by TreeToIr and processed in subsequent steps by IRPasses producing new copy of the IR in every run. While the copying has been improved by #10718 all the IR structure objects need to be allocated on the heap at the end to be passed to IrToTruffle and held in the memory indefinitely.

Many of the IR (case) classes are using scala.Option which adds overhead of at least 16 bytes (empty object takes 8 bytes, object with a field at least 16 bytes), yet another pointer indirection when accessing the object wrapped by Some(object) and defies cache locality as Some and object needn't be allocated next to each other and may cause processor cache misses.

enso$ javap -cp runtime.jar -private org/enso/compiler/core/ir/Expression.Block
public class org.enso.compiler.core.ir.Expression$Block implements  ... {
  private final scala.Option<org.enso.compiler.core.ir.IdentifiedLocation> location;

Whenever there is an IdentifiedLocation associated with an IR (and there are thousands), the pointer isn't from the IR to the IdentifiedLocation object, but to scala.Some and then to the IdentifiedLocation object. That's wasting 16 bytes of memory per instance. Rather than that we should change the location field to point directly to IdentifiedLocation or null.

There is no need to change the IR API. We only need to change the internal implementation. Only that enough to effectively use the allocated memory. Moreover, because of JVM's escape analysis (see this paper for more details), there is no overhead in wrapping/unwrapping such a location & co. fields by scala.Option when needed - the allocation will be virtualized when it is known the allocated scala.Option doesn't escape:

All we need to do is to find a way to rewrite our IR subclasses to avoid these scala.Option fields. An example how such API should look like is the IdentifiedLocation record (already written in Java) and its id field:

enso$ javap -cp runtime.jar -private org.enso.compiler.core.ir.IdentifiedLocation
public final class org.enso.compiler.core.ir.IdentifiedLocation extends java.lang.Record {
  private final java.util.UUID uuid;

  public org.enso.compiler.core.ir.IdentifiedLocation(org.enso.compiler.core.ir.Location, java.util.UUID);
  public static org.enso.compiler.core.ir.IdentifiedLocation create(org.enso.compiler.core.ir.Location, scala.Option<java.util.UUID>);

  public scala.Option<java.util.UUID> id();
}

The constructor still accepts scala.Option<java.util.UUID> and the getter returns scala.Option<java.util.UUID>. The id() getter does the wrapping when needed, but the reference to uuid is stored effectively in the memory.

JaroslavTulach commented 1 week ago

The measure the potential benefit, let's run following experiment:

diff --git engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
index 249151bec5..0f54b5380a 100644
--- engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
+++ engine/runtime-parser/src/test/java/org/enso/compiler/core/EnsoParserTest.java
@@ -41,6 +41,26 @@ public class EnsoParserTest {
     if (ensoCompiler != null) ensoCompiler.close();
   }

+  @Test
+  public void parseHugeFile() throws Exception {
+    var f = new File("../../distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso");
+    assertTrue("File should exist " + f.getCanonicalPath(), f.exists());
+    var src = Files.readString(f.getCanonicalFile().toPath());
+    var ir = ensoCompiler.compile(src);
+    var cnt = new int[2];
+    ir.preorder().foreach(e -> {
+      if (e.location().isDefined()) {
+        cnt[0]++;
+      } else {
+        cnt[1]++;
+      }
+      return null;
+    });
+    System.err.println("Take heap dump now!");
+    Thread.sleep(10000);
+    fail("With location " + cnt[0] + " without location " + cnt[1]);
+  }
+
   @Test
   public void testParseMain7Foo() {
     parseTest("""

execute as

sbt:enso> runtime-parser/testOnly *EnsoParserTest -- -z org.enso.compiler.core.EnsoParserTest.parseHugeFile

the test parses the largest Enso file in our distribution, creates IR for it and stops for 10s - take a heap dump and look for scala.Option in it:

There is a single org.enso.compiler.core.ir.Module instance (as expected) and its retained size (e.g. the whole tree) occupies 1 890 808 bytes. E.g. scala.Some adds 10% overhead to the whole IR size!

JaroslavTulach commented 1 week ago

Based on previous investigation I run a little inlining experiment trying to eliminate overhead of Location reference in IdentifiedLocation. With following change:

enso$ git diff engine/runtime-parser/src/main/java/org/enso/compiler/core/ir/IdentifiedLocation.java
diff --git engine/runtime-parser/src/main/java/org/enso/compiler/core/ir/IdentifiedLocation.java engine/runtime-parser/src/main/java/org/enso/compiler/core/ir/IdentifiedLocation.java
index 14d790adba..5a2e79d779 100644
--- engine/runtime-parser/src/main/java/org/enso/compiler/core/ir/IdentifiedLocation.java
+++ engine/runtime-parser/src/main/java/org/enso/compiler/core/ir/IdentifiedLocation.java
@@ -3,35 +3,46 @@ package org.enso.compiler.core.ir;
 import java.util.UUID;
 import scala.Option;

-public record IdentifiedLocation(Location location, UUID uuid) {
+public record IdentifiedLocation(int start, int end, UUID uuid) {
   public IdentifiedLocation(Location location) {
-    this(location, (UUID) null);
+    this(location.start(), location.end(), null);
+  }
+
+  public IdentifiedLocation(Location location, UUID uuid) {
+    this(location.start(), location.end(), uuid);
   }

   /** Creates new location from an optional UUID. */
   public static IdentifiedLocation create(Location location, Option<UUID> uuid) {
-    return new IdentifiedLocation(location, uuid.isEmpty() ? null : uuid.get());
+    return new IdentifiedLocation(location.start(), location.end(), uuid.isEmpty() ? null : uuid.get());
+  }
+
+  /**
+   * @return location of this identified location
+   */
+  public Location location() {
+    return new Location(start, end);
   }

   /**
    * @return the character index of the start of this source location.
    */
   public int start() {
-    return location().start();
+    return start;
   }

   /**
    * @return the character index of the end of this source location.
    */
   public int end() {
-    return location().end();
+    return end;
   }

   /**
    * @return the length in characters of this location.
    */
   public int length() {
-    return location().length();
+    return end - start;
   }

   /**

the retained size of org.enso.compiler.core.ir.Module goes down by 15% to 1 614 664 bytes

No Location instances

as the Location instances completely disappear from the heap and the IdentifiedLocation instances keep occupying the same amount of space.

Looks like there is potential in inlining objects into each other - and not only of scala.Some!