We ran into heap utilization issues when packaging an application with ~100K files. Heap dump analysis with YourKit indicates several problems.
First, and seemingly the worst offender, JDeb generates the checksum manifest in memory, in a massive StringBuilder instance, rather than writing to a file:
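For comparison, here is a minimal sketch of what writing the manifest incrementally could look like, so that only one line is in memory at a time. This is not JDeb's code: the class `StreamingChecksums` and its methods are hypothetical names, and the `<md5>  <relative path>` line layout is assumed from the usual Debian `md5sums` convention.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Iterator;
import java.util.stream.Stream;

// Hypothetical illustration; not part of JDeb or the Gradle plugin.
public class StreamingChecksums {

    /** Computes the MD5 of a file in fixed-size chunks and returns it as lowercase hex. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder(32);
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Walks root lazily and appends one manifest line per regular file to the
     * manifest file, instead of accumulating the whole manifest in a StringBuilder.
     */
    static void writeManifest(Path root, Path manifest)
            throws IOException, NoSuchAlgorithmException {
        try (BufferedWriter out = Files.newBufferedWriter(manifest, StandardCharsets.UTF_8);
             Stream<Path> walk = Files.walk(root)) {
            Iterator<Path> it = walk.filter(Files::isRegularFile).iterator();
            while (it.hasNext()) {
                Path file = it.next();
                out.write(md5Hex(file));
                out.write("  "); // two spaces, as assumed for the md5sums layout
                out.write(root.relativize(file).toString());
                out.write('\n');
            }
        }
    }
}
```

With this shape, heap use stays roughly constant regardless of file count; the manifest file must live outside `root` so the walk does not pick it up.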
The remaining issues are due to the impedance mismatch between Gradle and JDeb. They don't add up to a huge problem, but they do surface quite a lot of inefficiency.
We allocate a separate DataProducer for each directory and file, rather than having a single producer provide all of the files and directories, so DebCopyAction ends up holding a very large number of producers:
dataProducers of com.netflix.gradle.plugins.deb.DebCopyAction [Stack Local ← action, copyAction, object, receiver, this] 74460584 88
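To illustrate the single-producer alternative, here is a rough sketch in which one producer walks the tree lazily and pushes each entry to a consumer, so no per-file producer objects are retained. The `EntryProducer` and `EntryConsumer` interfaces are hypothetical stand-ins defined only for this example; they are loosely modelled on the producer/consumer idea and are not JDeb's actual interfaces.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.stream.Stream;

// Hypothetical illustration; the interfaces below are NOT JDeb's API.
public class SingleProducerSketch {

    /** Hypothetical consumer of package entries. */
    interface EntryConsumer {
        void onDir(String relativePath) throws IOException;
        void onFile(String relativePath, Path source) throws IOException;
    }

    /** Hypothetical producer that emits every entry in one pass. */
    interface EntryProducer {
        void produce(EntryConsumer consumer) throws IOException;
    }

    /**
     * Returns one producer for the whole tree. Entries are discovered lazily
     * during produce(), so memory use does not grow with the number of files.
     */
    static EntryProducer treeProducer(Path root) {
        return consumer -> {
            try (Stream<Path> walk = Files.walk(root)) {
                Iterator<Path> it = walk.iterator();
                while (it.hasNext()) {
                    Path p = it.next();
                    if (p.equals(root)) {
                        continue; // skip the root itself
                    }
                    String rel = root.relativize(p).toString();
                    if (Files.isDirectory(p)) {
                        consumer.onDir(rel);
                    } else {
                        consumer.onFile(rel, p);
                    }
                }
            }
        };
    }
}
```

Whether this fits depends on how Gradle hands files to the copy action; if entries only arrive one at a time through a visitor callback, the producer would need to wrap that callback rather than walk the file system itself.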
Further, that means we have stack locals holding all of the visited files and directories. We couldn't find where these were referenced, and assume they are also Gradle-related:
java.util.HashSet [Stack Local ← visitedFiles] size = 94583 26989336
java.util.HashSet [Stack Local ← visitedDirs] size = 12661 1596936
Every visited file is also held for the purposes of duplicate handling:
val$visitedFiles of org.gradle.api.internal.file.copy.DuplicateHandlingCopyActionDecorator$1 [Stack Local ← stream] 24 24
Gradle's tracking of the task's up-to-date state is also somewhat expensive: