sbt / zinc

Scala incremental compiler library, used by sbt and other build tools
Apache License 2.0

CompileAnalysis contains many redundant strings #547

Closed stuhood closed 3 weeks ago

stuhood commented 6 years ago

While upgrading our repository from zinc 0.13.9 to 1.1.7, we experienced an increase in memory usage. After investigation, the increase was likely caused by our switch to name hashing during the upgrade (by virtue of the previous implementation having been removed). We've since found other solutions to this increase, so it isn't currently a blocker.

But while inspecting heap dumps during the investigation, I noticed that the object graph below CompileAnalysis contains highly redundant strings. For one module, we saw 2708 distinct occurrences of the string "java" (representing ~130KB on the heap), 789 instances of "equals", 287 instances of "findByName", etc.

The distribution of occurrences and their redundancy for that module was:

redundancy   count
1             8210
2              809
4              550
8              839
16             423
32              68
64              24
128             26
256              2
512             33
1024            17
2048             4
4096             2

(i.e., there were 17 distinct strings that each occurred between 512 and 1024 times in this module alone)
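As an illustration of how a redundancy distribution like the one above can be computed, here is a hedged sketch: count occurrences per distinct string, then bucket the counts into powers of two. The names and bucketing rule are my own illustration, not the tooling actually used for the heap-dump analysis.

```scala
object RedundancyHistogram {
  // Smallest power of two >= n: the "redundancy" bucket, so that a string
  // occurring between 513 and 1024 times lands in the 1024 bucket.
  def bucket(n: Int): Int =
    if (n <= 1) 1 else 1 << (32 - Integer.numberOfLeadingZeros(n - 1))

  // Map each bucket to the number of distinct strings whose occurrence
  // count falls into it.
  def histogram(strings: Seq[String]): Map[Int, Int] =
    strings
      .groupBy(identity)
      .valuesIterator
      .map(_.size)
      .toSeq
      .groupBy(bucket)
      .map { case (b, counts) => b -> counts.size }
}
```

For example, `histogram(Seq("java", "java", "java", "equals", "equals", "once"))` yields one string in each of the 1, 2, and 4 buckets.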


I started in on a patch to intern strings during protobuf deserialization, which I expect would significantly reduce this redundancy.
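The interning idea can be sketched as a small canonicalizing map applied to each string as it is deserialized, so repeated names like "java" or "equals" share one heap object. This is a minimal illustration, not zinc's actual deserialization code; the class name is hypothetical.

```scala
import scala.collection.mutable

// Hypothetical interner: returns a canonical instance for each distinct
// string value, so equal strings deserialized many times share one object.
final class StringInterner {
  private val pool = mutable.HashMap.empty[String, String]
  def intern(s: String): String = pool.getOrElseUpdate(s, s)
}
```

A deserializer would route every decoded string through `intern`, trading a hash lookup per string for deduplicated long-lived storage.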

But while doing that, it occurred to me that a more efficient and typesafe solution would be to change the protobuf schema to store each interned string value once (i.e., by replacing string with message InternKey {int32}). This would be similar to the Java classfile constant pool.
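The constant-pool idea amounts to this: the writer assigns each distinct string an integer id and stores the string once in a table; all other fields carry only the id. The sketch below is illustrative only (the names are made up) and elides the actual protobuf wiring.

```scala
import scala.collection.mutable

// Writer side: first occurrence of a string gets the next id; the table
// is serialized once, like a classfile constant pool.
final class StringPoolWriter {
  private val index = mutable.LinkedHashMap.empty[String, Int]
  def idOf(s: String): Int = index.getOrElseUpdate(s, index.size)
  def table: Vector[String] = index.keys.toVector
}

// Reader side: resolve an id back to its (single, shared) string.
final class StringPoolReader(table: Vector[String]) {
  def resolve(id: Int): String = table(id)
}
```

Besides deduplicating the in-memory strings, this would also shrink the serialized analysis file, since each repeated string is written once plus a varint per reference.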

jvican commented 6 years ago

Funny that you report this now; I was profiling this a week ago and also saw the brutal number of objects that Zinc was allocating. Aside from the strings, I also saw a lot of name hashes and other intermediary objects being created that could be removed.

I don't know to what extent the protobuf deserialization logic is involved in the creation of these objects. I would expect that the majority of them are created from the bridge when compilation happens. I think the best way to solve this problem is via pooling (as we do in the compiler with term and type names).
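Pooling in this sense generalizes string interning to arbitrary values: structurally equal objects are canonicalized to one shared instance. A hedged sketch, where `NameHash` is a made-up stand-in for zinc's intermediary objects, not an actual zinc type:

```scala
import scala.collection.mutable

// Illustrative value type; stands in for the intermediary objects
// (name hashes etc.) that get allocated repeatedly during compilation.
final case class NameHash(name: String, hash: Int)

// Generic pool in the spirit of the compiler's term/type name pools:
// structurally equal values share a single canonical instance.
final class Pool[A] {
  private val cache = mutable.HashMap.empty[A, A]
  def apply(a: A): A = cache.getOrElseUpdate(a, a)
}
```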

> But while doing that it occurred to me that a more efficient and typesafe solution would be to change the protobuf schema to store the interned string values once (ie, by replacing string with message InternKey {int32}). This would be similar to the Java classfile constant pool.

I agree this is a solution superior to String.intern, which can have unexpected results if the JVM string table gets too full. Are you going to submit a patch with this improvement?

I wanted to take a shot at reducing the high consumption of name hashes during compilation this weekend (the other main source of allocation). Let me know if we're not stepping on each other's toes! I'm happy to defer these tasks to you if you're already looking into all of them.

stuhood commented 6 years ago

> protobuf deserialization logic

To be clear, I'm not blaming the deserialization logic: since most of the objects created there are short-lived, they should get cleaned up quite easily. I was just editing there because it was the easiest place to introduce the interning.

Rather, I'm concerned with the memory held by a CompileAnalysis object over longer periods. Before switching from loose classfile inputs to jars, we were unable to hold all of the CompileAnalysis for the dependencies of our largest targets in a 6GB heap at once, which meant that the upgrade was causing OOMs (again, "fixed" by switching to jar classpaths).

So it's not about throughput, but rather about total memory usage.

stuhood commented 6 years ago

> I agree this is a superior solution than String.intern, which can have unexpected results if the JVM string table gets too full. Are you going to submit a patch with this improvement?

I could, but it would be a big protobuf schema change... is there appetite for that?

jvican commented 6 years ago

I would welcome that change. As long as we can keep the old reader and add a new reader/writer for the same version, it's fine by me (so that we can reuse previously generated analysis files).

The problem we're trying to solve here is pretty fundamental, so I see value in it. This would also reduce the size of the analysis file considerably.

borkaehw commented 6 years ago

We are interested in this issue. Do you have any updates, and what is the preferred approach to fixing it? Thanks.

jvican commented 6 years ago

I have some staged changes to fix this, so I'll submit them soon. After the PR, any help regarding further profiling would be welcome.

Friendseeker commented 1 month ago

I was looking at #1326 and it looks like the intern optimization is gone.

We should first enable benchmarks for analysis serialization/deserialization performance (#1400) and then add the optimization back if it does not affect performance (which I doubt it would).


Never mind: #1326 replaced the intern optimization with a constant pool of strings in the serialized format.