[ECJ] Increased Heap Usage for 1.8 Source Option

volosied commented 11 months ago

Originally sent as an email here: https://www.eclipse.org/lists/jdt-dev/msg02297.html

Version: ECJ 4.20 (We cannot test any newer ECJ releases since our application and server run on Java 8)

Problem: Compiling translated JSPs with the 1.8 source/target compiler option uses far more memory than compiling them with the 1.7 option.

1.7 requires 1 GB of heap, while 1.8 requires 5 GB. Here's a snippet of the heap dump:

The stacktrace for Thread-173 is:

CPU usage total: 212.539956241 secs, current category="Application"
Heap bytes allocated since last GC cycle=0 (0x0)
Java callstack:
    at org/eclipse/jdt/internal/compiler/lookup/InferenceContext18.substitute(InferenceContext18.java:245(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/InferenceContext18.inferInvocationType(InferenceContext18.java:398(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/ParameterizedGenericMethodBinding.computeCompatibleMethod18(ParameterizedGenericMethodBinding.java:271(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/ParameterizedGenericMethodBinding.computeCompatibleMethod(ParameterizedGenericMethodBinding.java:92(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/Scope.computeCompatibleMethod(Scope.java:844(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/Scope.computeCompatibleMethod(Scope.java:801(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/Scope.findMethod0(Scope.java:1759(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/Scope.findMethod(Scope.java:1660(Compiled Code))
    at org/eclipse/jdt/internal/compiler/lookup/Scope.getMethod(Scope.java:3048(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/MessageSend.findMethodBinding(MessageSend.java:1018(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/MessageSend.resolveType(MessageSend.java:839(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/_expression_.resolve(_expression_.java:1113(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/Block.resolve(Block.java:131(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/IfStatement.resolveIfStatement(IfStatement.java:291(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/IfStatement.resolve(IfStatement.java:317(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/AbstractMethodDeclaration.resolveStatements(AbstractMethodDeclaration.java:661(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/MethodDeclaration.resolveStatements(MethodDeclaration.java:362(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/AbstractMethodDeclaration.resolve(AbstractMethodDeclaration.java:570(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/TypeDeclaration.resolve(TypeDeclaration.java:1512(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/TypeDeclaration.resolve(TypeDeclaration.java:1637(Compiled Code))
    at org/eclipse/jdt/internal/compiler/ast/CompilationUnitDeclaration.resolve(CompilationUnitDeclaration.java:667(Compiled Code))
    at org/eclipse/jdt/internal/compiler/Compiler.process(Compiler.java:902)
    at org/eclipse/jdt/internal/compiler/Compiler.processCompiledUnits(Compiler.java:575)
    at org/eclipse/jdt/internal/compiler/Compiler.compile(Compiler.java:475)
    at org/eclipse/jdt/internal/compiler/Compiler.compile(Compiler.java:426)
    at com/ibm/ws/jsp/translator/compiler/JDTCompiler.compile(JDTCompiler.java:178)

I can provide more information if needed. This behavior doesn't seem right, but perhaps it is normal given the new Java 8 language features? I would appreciate it if someone could provide an explanation; I don't have the knowledge to figure this out myself.

Thank you!

jarthana commented 11 months ago

I see that quite a bit changed with respect to inference in Java 1.8. So it's possible we create a few more objects since 1.8. But three times the heap consumption doesn't sound right.

Although the stack doesn't include it, I suspect the extra type bindings are created by this code: org.eclipse.jdt.internal.compiler.lookup.Scope.Substitutor.substitute(Substitution, TypeBinding)

Copying @stephan-herrmann and @srikanth-sankaran who may have some idea about this part of the compiler.

mpalat commented 11 months ago

@volosied Could you please share a reproducible code snippet here?

volosied commented 8 months ago

Unfortunately, I can't gather more information at this time. However, we found a workaround: compiling in batches rather than all at once.

Feel free to close this ticket, and thanks for your assistance!
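
The batching workaround mentioned above could look roughly like the following. This is only a minimal sketch: the thread does not show how the batches were actually formed, so `BatchCompileSketch`, `partition`, `compileInBatches`, and the `compileOneBatch` callback are all hypothetical names. The assumption is that each batch is handed to a separate compiler invocation, so that per-invocation compiler state can be garbage collected before the next batch starts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper; not part of ECJ or of the JSP translator. */
final class BatchCompileSketch {

    /** Splits the units into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> units, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < units.size(); i += batchSize) {
            int end = Math.min(i + batchSize, units.size());
            batches.add(new ArrayList<>(units.subList(i, end)));
        }
        return batches;
    }

    /**
     * Calls compileOneBatch once per batch and returns the number of calls.
     * The callback is assumed to create and discard its own compiler instance.
     */
    static <T> int compileInBatches(List<T> units, int batchSize,
                                    Consumer<List<T>> compileOneBatch) {
        int invocations = 0;
        for (List<T> batch : partition(units, batchSize)) {
            compileOneBatch.accept(batch);
            invocations++;
        }
        return invocations;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            files.add("_page" + i + ".java"); // made-up file names
        }
        int calls = compileInBatches(files, 2,
                batch -> System.out.println("compiling " + batch));
        System.out.println(calls + " compiler invocations");
    }
}
```

The trade-off is that types shared between batches have to be resolved again in every invocation, so a smaller batch size presumably costs more CPU time in exchange for a lower peak heap.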

stephan-herrmann commented 8 months ago

What do we know?

- Sure, type inference in 1.8 cannot really be compared to 1.7, but this alone doesn't directly identify any suspect.
- The TypeSystem instance is the root of the memory problem (it's not an AnnotatedTypeSystem, so this is not about annotations).
- Type inference does seem to have a finger in the pie.

Here's a theory: during type inference we may need to create intermediate ParameterizedTypeBindings, all of which are stored in TypeSystem. Normally, storing them is the correct thing to do, to ensure identity of type bindings. But when, during type inference, some type arguments are InferenceVariables or contain such variables at any nesting level, the resulting ParameterizedTypeBinding may cause some form of memory leak: semantically, the life span of any inference variable is just the time it takes to resolve the outermost expression involved. But by being stored in TypeSystem, all those inference variables will survive for the entire compiler invocation (i.e., until the pair of LookupEnvironment and TypeSystem is freed). In fact, not just the inference variable but the enclosing parameterized type binding is wasted, too. And they might transitively keep more objects alive.
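
To make the theory concrete, here is a toy model of the suspected retention pattern. It is not JDT code: `ToyType`, `ToyTypeSystem`, and `LeakDemo` are invented for illustration, and the string-keyed map only stands in for whatever structure the real TypeSystem uses. The sketch assumes that every inference session creates a fresh inference variable, so a parameterized type mentioning that variable can never be looked up again once the session ends, yet it stays in the cache.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a type binding; not JDT's TypeBinding. */
final class ToyType {
    final String name;
    final boolean isInferenceVariable;
    final List<ToyType> arguments;

    ToyType(String name, boolean isInferenceVariable, List<ToyType> arguments) {
        this.name = name;
        this.isInferenceVariable = isInferenceVariable;
        this.arguments = arguments;
    }

    static ToyType simple(String name) {
        return new ToyType(name, false, Collections.<ToyType>emptyList());
    }

    static ToyType inferenceVariable(int id) {
        return new ToyType("iv#" + id, true, Collections.<ToyType>emptyList());
    }

    /** A type is proper if no inference variable occurs at any nesting level. */
    boolean isProperType() {
        if (isInferenceVariable) return false;
        for (ToyType argument : arguments) {
            if (!argument.isProperType()) return false;
        }
        return true;
    }

    String key() {
        if (arguments.isEmpty()) return name;
        StringBuilder sb = new StringBuilder(name).append('<');
        for (int i = 0; i < arguments.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(arguments.get(i).key());
        }
        return sb.append('>').toString();
    }
}

/** Interning cache that lives as long as one (toy) compiler invocation. */
final class ToyTypeSystem {
    private final Map<String, ToyType> cache = new HashMap<>();

    ToyType parameterized(String generic, List<ToyType> args) {
        ToyType candidate = new ToyType(generic, false, new ArrayList<>(args));
        ToyType existing = cache.get(candidate.key());
        if (existing != null) return existing; // identity is preserved
        cache.put(candidate.key(), candidate);
        return candidate;
    }

    int size() { return cache.size(); }

    int nonProperCount() {
        int count = 0;
        for (ToyType type : cache.values()) {
            if (!type.isProperType()) count++;
        }
        return count;
    }
}

final class LeakDemo {
    /** Each loop iteration plays the role of one inference session. */
    static ToyTypeSystem run(int sessions) {
        ToyTypeSystem typeSystem = new ToyTypeSystem();
        for (int i = 0; i < sessions; i++) {
            ToyType alpha = ToyType.inferenceVariable(i); // fresh per session
            typeSystem.parameterized("List", Collections.singletonList(alpha));
            // The inferred result is proper and is shared across sessions.
            typeSystem.parameterized("List",
                    Collections.singletonList(ToyType.simple("String")));
        }
        return typeSystem;
    }

    public static void main(String[] args) {
        ToyTypeSystem typeSystem = run(1000);
        System.out.println(typeSystem.size() + " cached, "
                + typeSystem.nonProperCount() + " non-proper");
    }
}
```

In this model the single proper type `List<String>` is reused by every session, while each session leaves behind one non-proper entry that nothing will ever request again. Whether the real heap growth follows this pattern is exactly what the theory still needs a reproducer to confirm.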

@srikanth-sankaran, does the master of TypeSystem see a way to clean up the cache at the end of the (outermost) type inference without creating a new, disproportionate performance penalty (CPU-wise)? During experiments, I found that in huge generic types even asking isProperType() may have a significant cost, oops.

Should inference itself remember all non-proper parameterized type bindings created during its course? Perhaps each inference variable could keep track of those derived bindings, for easy deallocation?

Or would this create new risks in TypeSystem when after freeing we re-use a previously used slot?
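
One way to picture the cleanup idea is the sketch below. It is a toy under stated assumptions, not a patch for JDT: `SketchType`, `EvictingTypeCache`, `beginInference`, and `endInference` are hypothetical names, and the cache is a plain map rather than the real TypeSystem. Properness is computed once when a type is created, from flags already stored on its arguments, so no later isProperType() traversal is needed. Non-proper keys are recorded while inference is running and evicted when the outermost inference ends.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy type with a properness flag fixed at construction; not a JDT class. */
final class SketchType {
    final String key;
    final boolean proper;

    SketchType(String key, boolean proper) {
        this.key = key;
        this.proper = proper;
    }

    static SketchType properType(String name) {
        return new SketchType(name, true);
    }

    static SketchType inferenceVariable(int id) {
        return new SketchType("iv#" + id, false);
    }
}

/** Hypothetical cache that forgets non-proper types after inference. */
final class EvictingTypeCache {
    private final Map<String, SketchType> cache = new HashMap<>();
    private final List<String> nonProperKeys = new ArrayList<>();
    private int inferenceDepth = 0;

    void beginInference() {
        inferenceDepth++;
    }

    SketchType parameterized(String generic, SketchType... args) {
        StringBuilder key = new StringBuilder(generic).append('<');
        boolean proper = true;
        for (int i = 0; i < args.length; i++) {
            if (i > 0) key.append(',');
            key.append(args[i].key);
            proper &= args[i].proper; // nested non-proper arguments propagate
        }
        String k = key.append('>').toString();
        SketchType existing = cache.get(k);
        if (existing != null) return existing;
        SketchType created = new SketchType(k, proper);
        cache.put(k, created);
        if (!proper && inferenceDepth > 0) {
            nonProperKeys.add(k); // remember for eviction
        }
        return created;
    }

    /** Evicts recorded entries only when the outermost inference ends. */
    int endInference() {
        if (inferenceDepth == 0) {
            throw new IllegalStateException("no inference in progress");
        }
        if (--inferenceDepth > 0) return 0;
        int evicted = nonProperKeys.size();
        for (String k : nonProperKeys) {
            cache.remove(k);
        }
        nonProperKeys.clear();
        return evicted;
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        EvictingTypeCache cache = new EvictingTypeCache();
        cache.beginInference();
        SketchType alpha = SketchType.inferenceVariable(0);
        cache.parameterized("Map", alpha, cache.parameterized("List", alpha));
        cache.parameterized("List", SketchType.properType("String"));
        System.out.println("evicted: " + cache.endInference());
        System.out.println("remaining: " + cache.size());
    }
}
```

Because the sketch keys its cache by string, it does not model the slot-reuse risk raised in the question above; that concern would need to be checked against the real TypeSystem's bookkeeping.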

stephan-herrmann commented 8 months ago

Btw, this stack frame is not JDT:

    at org/eclipse/jdt/internal/compiler/ast/_expression_.resolve(_expression_.java:1113(Compiled Code))

Should we worry about that?

I wonder, how exactly com/ibm/ws/jsp/translator/compiler/JDTCompiler relates to ecj.

stephan-herrmann commented 8 months ago

@jukzi closed this as not planned on Feb 14, 2024

Before this goes into neverland, let me ask @srikanth-sankaran: have you seen my theory about TypeSystem holding types that are strictly obsolete after the inference session that created them? I believe we could experiment with those mechanisms even without a strict reproducer plus heap measurements.

srikanth-sankaran commented 8 months ago

I'll take a look.

eclipse-jdt / eclipse.jdt.core

[ECJ] Increased Heap Usage for 1.8 Source Option #1608