Guardsquare / proguard

ProGuard, Java optimizer and obfuscator
https://www.guardsquare.com/en/products/proguard
GNU General Public License v2.0

Poor performance when overrunning a large obfuscation dictionary #413

Open quat1024 opened 3 months ago

quat1024 commented 3 months ago

In one project, we use an obfuscation dictionary with 10,000 entries. We found that most CPU time was spent in proguard.obfuscate.DictionaryNameFactory.nextName.

It turns out we had more than 10,000 symbols to obfuscate in the project, so we exhausted the dictionary. When that happens, DictionaryNameFactory falls back to generating fresh names and checks each one against the dictionary to avoid producing a duplicate. That duplicate check is a linear scan (List#contains, line 255), so with our large dictionary this code path became very slow.

https://github.com/Guardsquare/proguard/blob/3a9b11bb3c24d45cff5ee2215e818fe1574acb6f/base/src/main/java/proguard/obfuscate/DictionaryNameFactory.java#L250-L255
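
To illustrate the cost described above, here is a minimal sketch that is not ProGuard's actual code. The class `FallbackNameSketch` and its methods `generatedName` and `freshNames` are hypothetical names made up for this example. It runs the same "generate a name, skip it if the dictionary contains it" loop twice, once against an `ArrayList` (linear scan per lookup) and once against a `HashSet` (expected constant-time lookup).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch of the fallback path; not taken from ProGuard.
class FallbackNameSketch {

    // Maps an index to a generated name: a, b, ..., z, aa, ab, ...
    static String generatedName(int index) {
        StringBuilder sb = new StringBuilder();
        int i = index;
        do {
            sb.append((char) ('a' + i % 26));
            i = i / 26 - 1;
        } while (i >= 0);
        return sb.reverse().toString();
    }

    // Returns `count` generated names that do not appear in `used`.
    // Each `used.contains` call is O(n) for an ArrayList and
    // expected O(1) for a HashSet.
    static List<String> freshNames(Collection<String> used, int count) {
        List<String> result = new ArrayList<>(count);
        int index = 0;
        while (result.size() < count) {
            String candidate = generatedName(index++);
            if (!used.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed worst case: every dictionary word collides with an
        // early generated name, so the fallback has to skip all of them.
        List<String> dictionaryList = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            dictionaryList.add(generatedName(i));
        }

        long start = System.nanoTime();
        List<String> viaList = freshNames(dictionaryList, 2_000);
        long listNanos = System.nanoTime() - start;

        start = System.nanoTime();
        List<String> viaSet = freshNames(new HashSet<>(dictionaryList), 2_000);
        long setNanos = System.nanoTime() - start;

        System.out.println("same result:     " + viaList.equals(viaSet));
        System.out.println("List#contains ms: " + listNanos / 1_000_000);
        System.out.println("Set#contains ms:  " + setNanos / 1_000_000);
    }
}
```

Both variants return the same names; only the lookup cost differs. Whether the actual fix takes this shape is not something this sketch claims.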

phase commented 3 months ago

here's a profiler snapshot from VisualVM: image

expanding our dictionary from 10k to 50k entries made the problem go away and cut the time taken by proguard in half.
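
for anyone wanting to try the same workaround, a larger dictionary can be generated with something like the sketch below. This is only an illustration: the class `DictionaryGeneratorSketch`, the file name `dictionary.txt`, the word length and the seed are all assumptions, not what we actually used. It writes one unique lowercase word per line, which should be usable with `-obfuscationdictionary` as far as I understand the dictionary format.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper for producing a larger obfuscation dictionary.
class DictionaryGeneratorSketch {

    // Generates `count` distinct lowercase words of the given length.
    // The set discards duplicates, so the loop runs until enough
    // unique words have been collected.
    static Set<String> generateWords(int count, int length, long seed) {
        Random random = new Random(seed);
        Set<String> words = new LinkedHashSet<>();
        while (words.size() < count) {
            StringBuilder sb = new StringBuilder(length);
            for (int i = 0; i < length; i++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            words.add(sb.toString());
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Output path is an assumption; pass a different one as args[0].
        Path out = Paths.get(args.length > 0 ? args[0] : "dictionary.txt");
        // 50,000 words of 6 letters each, written one per line.
        Files.write(out, generateWords(50_000, 6, 42L));
    }
}
```

the dictionary only needs to be larger than the number of symbols being renamed, so the 50,000 here is just the size that happened to work for us.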

after this was solved, a second profile shows another hot loop taking up a significant portion of the time:

https://github.com/Guardsquare/proguard-core/blob/9ae93fd664f8b92830e9ea402a65c0ffd6f73dce/base/src/main/java/proguard/classfile/util/MethodLinker.java#L150-L158

image

mrjameshamilton commented 1 day ago

The dictionary performance should be improved in 7.6 with the following commit: https://github.com/Guardsquare/proguard/commit/03d7effdd2be72db980814a44b745f99bbff4d2d