binaryeq / jcompile

scripts to compile Java projects with different compilers to create a data set of comparable binaries
Apache License 2.0

Run detect-bytecode-features.sh on each .class file #51

Closed: wtwhite closed this 11 months ago

wtwhite commented 11 months ago

Resolves #50.

wtwhite commented 11 months ago

Unfortunately this is extremely slow -- 13 minutes for a single project:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time make jars/openjdk-11.0.16/checkstyle-10.12.3.jar.done 
--snip--
SUCCESS! - copying /target/checkstyle-10.12.3.jar  into jars/openjdk-11.0.16
openjdk-11.0.16__pid74668
openjdk-11.0.16__pid74668

================================================

real    13m19.821s
user    29m23.801s
sys     2m47.454s

Previously the time was just 21s!

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time make jars/openjdk-11.0.16/checkstyle-10.12.3.jar.done 
--snip--
SUCCESS! - copying /target/checkstyle-10.12.3.jar  into jars/openjdk-11.0.16
openjdk-11.0.16__pid164149
openjdk-11.0.16__pid164149

================================================

real    0m21.052s
user    0m0.283s
sys     0m0.268s

wtwhite commented 11 months ago

There are 1760 .class files in this project, meaning each takes about 0.4s to check. This seems about right based on timing the first 10:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ for f in `find worktrees/pid164574-checkstyle-10.12.3/target/classes -type f -name '*.class'|head`; do echo $f; time javap -c -v $f > /dev/null ; done
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ConfigurationLoader.class

real    0m0.374s
user    0m0.720s
sys     0m0.089s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$ScopeConverter.class

real    0m0.223s
user    0m0.472s
sys     0m0.045s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask$Property.class

real    0m0.212s
user    0m0.498s
sys     0m0.014s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask$Formatter.class

real    0m0.211s
user    0m0.476s
sys     0m0.025s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask.class

real    0m0.297s
user    0m0.697s
sys     0m0.067s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ant/package-info.class

real    0m0.185s
user    0m0.401s
sys     0m0.038s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask$FormatterType.class

real    0m0.193s
user    0m0.441s
sys     0m0.030s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$SeverityLevelConverter.class

real    0m0.191s
user    0m0.421s
sys     0m0.040s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/PackageObjectFactory$ModuleLoadOption.class

real    0m0.180s
user    0m0.436s
sys     0m0.024s
worktrees/pid164574-checkstyle-10.12.3/target/classes/com/puppycrawl/tools/checkstyle/XpathFileGeneratorAuditListener.class

real    0m0.216s
user    0m0.494s
sys     0m0.026s
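
As a rough cross-check of the 0.4s estimate, the per-class cost can be backed out of the two make wall-clock times quoted above (13m19.821s = 799.821s with the per-class check, 21.052s without it) and the count of 1760 .class files. The helper name per_class_seconds is made up for this sketch:

```shell
# Hypothetical helper: extra seconds spent per class, from two wall-clock times.
#   $1 = wall time with the per-class check (seconds)
#   $2 = wall time without it (seconds)
#   $3 = number of .class files
per_class_seconds() {
    awk -v slow="$1" -v fast="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", (slow - fast) / n }'
}

# 13m19.821s = 799.821s; 0m21.052s = 21.052s; 1760 classes.
per_class_seconds 799.821 21.052 1760   # prints 0.44
```

That is in the same ballpark as the 0.18s to 0.37s measured per file above.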

Fortunately, javap accepts multiple classes on its command line. The results are (1) identical and (2) easily separated by looking for lines starting with Classfile, and we get a massive speed improvement (~6x on 10 inputs going by the per-file timings above, about 2.3s in total versus 0.37s, so it looks to be mostly startup time):

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ for f in `find worktrees/pid164574-checkstyle-10.12.3/target/classes -type f -name '*.class'|head`; do echo $f; time javap -c -v $f >> all_results_separately.txt ; done
--snip--
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time javap -c -v `find worktrees/pid164574-checkstyle-10.12.3/target/classes -type f -name '*.class'|head` > all_results.txt

real    0m0.374s
user    0m0.961s
sys     0m0.097s
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ md5sum all_results*
9a31fa7afeb9dfe2371e2775f2498483  all_results_separately.txt
9a31fa7afeb9dfe2371e2775f2498483  all_results.txt

So the solution is just to run javap on as many classes as possible in one go, then separate out the results.
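
A minimal sketch of the separation step, assuming only what is stated above: each class's section of javap -c -v output starts with a line beginning with Classfile, followed by the pathname. The function name split_javap_sections and the tab-separated summary it prints are illustrative, not taken from the repository:

```shell
# Reads combined javap -c -v output on stdin and prints, for each class,
# the pathname from its "Classfile " line and the number of lines in its section.
split_javap_sections() {
    awk '
        /^Classfile / {
            if (path != "") printf "%s\t%d\n", path, n   # close the previous section
            path = substr($0, 11)                        # text after "Classfile "
            n = 0
        }
        path != "" { n++ }                               # count lines in the current section
        END { if (path != "") printf "%s\t%d\n", path, n }
    '
}

# Possible usage (requires javap on the PATH); xargs keeps the argument
# list within system limits by batching if necessary:
#   find target/classes -type f -name '*.class' -print0 \
#       | xargs -0 javap -c -v | split_javap_sections
```

A real detector would replace the line count with whatever per-section checks are needed; the section boundaries are found the same way.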

wtwhite commented 11 months ago

The new approach (yes, Perl... this is where Perl shines) gives the same result and is much faster:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time for d in bytecode_features/*/; do ./detect-bytecode-features.pl JEP181 ${d%%/}/JEP181.class; done 
bytecode_features/ecj-4.29-JDK11/JEP181.class
bytecode_features/ecj-4.29-JDK1.2/JEP181.class  JEP181
bytecode_features/ecj-4.29-JDK17/JEP181.class
bytecode_features/ecj-4.29-JDK1.8/JEP181.class  JEP181
bytecode_features/JDK11/JEP181.class
bytecode_features/JDK1.2/JEP181.class   JEP181
bytecode_features/JDK17/JEP181.class
bytecode_features/JDK8/JEP181.class JEP181

real    0m1.284s
user    0m2.649s
sys     0m0.298s
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time ./detect-bytecode-features.pl JEP181 bytecode_features/*/JEP181.class
bytecode_features/ecj-4.29-JDK11/JEP181.class
bytecode_features/ecj-4.29-JDK1.2/JEP181.class  JEP181
bytecode_features/ecj-4.29-JDK17/JEP181.class
bytecode_features/ecj-4.29-JDK1.8/JEP181.class  JEP181
bytecode_features/JDK11/JEP181.class
bytecode_features/JDK1.2/JEP181.class   JEP181
bytecode_features/JDK17/JEP181.class
bytecode_features/JDK8/JEP181.class JEP181

real    0m0.258s
user    0m0.471s
sys     0m0.064s

Likewise for JEP280:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time for d in bytecode_features/*/; do ./detect-bytecode-features.pl JEP280 ${d%%/}/JEP280.class; done 
bytecode_features/ecj-4.29-JDK11/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.2/JEP280.class
bytecode_features/ecj-4.29-JDK17/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.8/JEP280.class
bytecode_features/JDK11/JEP280.class    JEP280
bytecode_features/JDK1.2/JEP280.class
bytecode_features/JDK17/JEP280.class    JEP280
bytecode_features/JDK8/JEP280.class

real    0m1.297s
user    0m2.750s
sys     0m0.276s
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time ./detect-bytecode-features.pl JEP280 bytecode_features/*/JEP280.class
bytecode_features/ecj-4.29-JDK11/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.2/JEP280.class
bytecode_features/ecj-4.29-JDK17/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.8/JEP280.class
bytecode_features/JDK11/JEP280.class    JEP280
bytecode_features/JDK1.2/JEP280.class
bytecode_features/JDK17/JEP280.class    JEP280
bytecode_features/JDK8/JEP280.class

real    0m0.218s
user    0m0.424s
sys     0m0.049s

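The pathnames given on the command line must be suffixes of the pathnames appearing inside javap output. The glob bytecode_features/*/ leaves a trailing slash on $d, so ${d%%/} is needed above to strip it and avoid a double slash in the constructed path.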
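
The suffix requirement can be illustrated with a small shell check. The function name is_path_suffix and the absolute path below are hypothetical; the point is only that a path containing a doubled slash, as produced when the trailing slash on $d is not stripped, no longer matches:

```shell
# Hypothetical check: succeeds if $1 is a suffix of $2.
is_path_suffix() {
    case "$2" in
        *"$1") return 0 ;;
        *)     return 1 ;;
    esac
}

d='bytecode_features/JDK11/'    # what the glob bytecode_features/*/ yields
# Made-up absolute path standing in for what javap prints after "Classfile ".
javap_path='/home/user/jcompile/bytecode_features/JDK11/JEP181.class'

is_path_suffix "${d%%/}/JEP181.class" "$javap_path" && echo "stripped: match"
is_path_suffix "${d}/JEP181.class"    "$javap_path" || echo "unstripped: no match"
```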

wtwhite commented 11 months ago

Simplified to always detect all features instead of passing a feature on the command line. This avoids multiple javap runs:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time ./detect-bytecode-features.pl bytecode_features/*/JEP*.class
bytecode_features/ecj-4.29-JDK11/JEP181$Inner.class
bytecode_features/ecj-4.29-JDK11/JEP181.class
bytecode_features/ecj-4.29-JDK11/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.2/JEP181$Inner.class    JEP181
bytecode_features/ecj-4.29-JDK1.2/JEP181.class  JEP181
bytecode_features/ecj-4.29-JDK1.2/JEP280.class
bytecode_features/ecj-4.29-JDK17/JEP181$Inner.class
bytecode_features/ecj-4.29-JDK17/JEP181.class
bytecode_features/ecj-4.29-JDK17/JEP280.class   JEP280
bytecode_features/ecj-4.29-JDK1.8/JEP181$Inner.class    JEP181
bytecode_features/ecj-4.29-JDK1.8/JEP181.class  JEP181
bytecode_features/ecj-4.29-JDK1.8/JEP280.class
bytecode_features/JDK11/JEP181$Inner.class
bytecode_features/JDK11/JEP181.class
bytecode_features/JDK11/JEP280.class    JEP280
bytecode_features/JDK1.2/JEP181$Inner.class JEP181
bytecode_features/JDK1.2/JEP181.class   JEP181
bytecode_features/JDK1.2/JEP280.class
bytecode_features/JDK17/JEP181$Inner.class
bytecode_features/JDK17/JEP181.class
bytecode_features/JDK17/JEP280.class    JEP280
bytecode_features/JDK8/JEP181$Inner.class   JEP181
bytecode_features/JDK8/JEP181.class JEP181
bytecode_features/JDK8/JEP280.class

real    0m0.276s
user    0m0.559s
sys     0m0.113s
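
One way to detect every feature in a single pass over the combined javap output is sketched below. The detection rules are placeholders and are not taken from detect-bytecode-features.pl: the sketch assumes JEP280 can be recognised by a reference to makeConcatWithConstants (the StringConcatFactory bootstrap method that JEP 280 introduced), and it guesses that the JEP181 label marks the compiler-generated access$NNN accessor methods, which would fit the label appearing only for the pre-JDK 11 targets above. The function name detect_features is also illustrative:

```shell
# Reads javap -c -v output on stdin and prints one line per class:
# the Classfile pathname, followed by a tab-separated list of detected labels.
detect_features() {
    awk '
        function emit(   out) {
            if (path == "") return
            out = path
            if (jep181) out = out "\tJEP181"
            if (jep280) out = out "\tJEP280"
            print out
        }
        # A new section starts: flush the previous class and reset the flags.
        /^Classfile / { emit(); path = substr($0, 11); jep181 = 0; jep280 = 0; next }
        /access\$[0-9]+/          { jep181 = 1 }   # placeholder rule (assumption)
        /makeConcatWithConstants/ { jep280 = 1 }   # placeholder rule (assumption)
        END { emit() }
    '
}

# Possible usage, with javap on the PATH:
#   javap -c -v bytecode_features/*/JEP*.class | detect_features
```

Because all rules run against the same stream, adding a feature means adding one flag and one pattern rather than another javap run.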

wtwhite commented 11 months ago

Much faster!

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ time make jars/openjdk-11.0.16/checkstyle-10.12.3.jar.done 
--snip--
SUCCESS! - copying /target/checkstyle-10.12.3.jar  into jars/openjdk-11.0.16
openjdk-11.0.16__pid173160
openjdk-11.0.16__pid173160

================================================

real    0m20.409s
user    0m10.942s
sys     0m1.454s

The results look sensible:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -ls
 27833802      4 drwxrwxr-x   3 wtwhite  wtwhite      4096 Nov 28 18:46 jars
 27833811      4 drwxrwxr-x   2 wtwhite  wtwhite      4096 Nov 28 18:46 jars/openjdk-11.0.16
 27833828   2036 -rw-rw-r--   1 wtwhite  wtwhite   2076902 Nov 28 18:46 jars/openjdk-11.0.16/checkstyle-10.12.3.jar
 27833829      4 -rw-rw-r--   1 wtwhite  wtwhite      1959 Nov 28 18:46 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.generated-sources
 27833830    176 -rw-rw-r--   1 wtwhite  wtwhite    179326 Nov 28 18:46 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features
 27833831      0 -rw-rw-r--   1 wtwhite  wtwhite         0 Nov 28 18:46 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.done
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ wc -l jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features
1760 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ head !$
head jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$OutputStreamOptions.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$PatternConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$RelaxedAccessModifierArrayConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$RelaxedStringArrayConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$ScopeConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$SeverityLevelConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean$UriConverter.class
target/classes/com/puppycrawl/tools/checkstyle/AbstractAutomaticBean.class  JEP280
target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask$Formatter.class
target/classes/com/puppycrawl/tools/checkstyle/ant/CheckstyleAntTask$FormatterType.class
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep JEP181 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features|wc -l
0
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep JEP280 jars/openjdk-11.0.16/checkstyle-10.12.3.jar.bytecode-features|wc -l
200

(No JEP181 features since we're compiling with JDK 11.)