rpau / javalang-compiler

Java compiler elements (symbol and type tables) to perform code semantic analysis
GNU Lesser General Public License v3.0
10 stars 4 forks source link

scan jars only once #27

Closed cal101 closed 7 years ago

cal101 commented 7 years ago

The last change to the pull request fixed the functional issues but re-introduces some duplicate jar scannings. With this change each jar is only scanned once. With rawclasspath loader class caching being active in addition that makes a difference of

before: Elapsed (wall clock) time (h:mm:ss or m:ss): 5:50.59 after: Elapsed (wall clock) time (h:mm:ss or m:ss): 2:59.23

with late JarFile creation: Elapsed (wall clock) time (h:mm:ss or m:ss): 2:30.30

Another 15 seconds could be saved by caching File.isDirectory and File.canRead calls. (And this is on a linux system on a SSD!) I did not commit this change because a Map<String,Integer> feels too ugly.

The possibly big static cache in TypesLoadingVisitor is a code smell. Is there a way to extract the jar file scanning into some service class that is spring loaded and injected, passed or looked up by the visitor?

rpau commented 7 years ago

Good performance numbers! 👍

Yes, I do not like too much neither having a cache as an static member. The project walkmod-javalang-plugin allows to configure spring components.

TypesLoadingVisitor is created in several places to use it without walkmod (and then, without spring). It is necessary to check the usages and see if there is a way to inject that cache from the walkmod-javalang-plugin.

rpau commented 7 years ago

About the File.isDirectory and File.canRead calls, we can maybe use a bloom filter (guava has one implementation).

cal101 commented 7 years ago

Bloom filters will error in one direction or the other for these boolean flags and we need exact results for both states. The number of entries (for each jar and each directory one entry) in the map will be small and so the overhead of the map would be acceptable. Conceptually its a Map<File.absolutePath,Enum{isDirectory,canRead}>. I would like to use an StringIntMap on this but as said memory should not be a problem here. Since you already merged the current state I try to make the code more pretty and show it to you.