binaryeq / jcompile

scripts to compile Java projects with different compilers to create a data set of comparable binaries
Apache License 2.0

Force target bytecode version to match compiler major JDK version with `-Dmaven.compiler.target=...` #93

Closed wtwhite closed 4 months ago

wtwhite commented 4 months ago

Resolves https://github.com/binaryeq/msr24/issues/16 (hopefully). I.e., hopefully enables us to see JEP181 NestMembers when compiling with JDK >= 11.
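
For context, a class file's major version is the Java release number plus 44 (52 for Java 8, 55 for Java 11), and javac only emits the JEP 181 NestMembers attribute when targeting 11 or later. A minimal sketch for checking which version a build actually produced; both function names are made up for illustration and are not part of jcompile:

```shell
# Expected class-file major version for a Java release (8 -> 52, 11 -> 55).
expected_class_major() {
  echo $(( $1 + 44 ))
}

# Read the major version of a .class file: a big-endian 2-byte value at
# byte offset 6, after the 4-byte magic number and the 2-byte minor version.
class_major_of() {
  # od prints the two bytes as unsigned decimals, e.g. "0 55".
  set -- $(od -An -j6 -N2 -tu1 "$1")
  echo $(( $1 * 256 + $2 ))
}
```

Comparing `class_major_of Foo.class` against `expected_class_major 11` for a class unpacked from one of the JDK 11 jars would show whether the forced target took effect.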

wtwhite commented 4 months ago

I've been running this for about half an hour and there are a lot of errors so far, many of which seem to be from an Enforcer plugin complaining about the versions:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.error'|wc -l
166
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.jar'|wc -l
30
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep RequireJavaVersion jars/*/*.error|wc -l
124
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep -A1 RequireJavaVersion jars/*/*.error|head
jars/openjdk-8.0.302/bcel-6.4.0.jar.error:[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
jars/openjdk-8.0.302/bcel-6.4.0.jar.error-Detected JDK Version: 1.8.0-302 is not in the allowed range 8.
--
jars/openjdk-8.0.302/bcel-6.4.1.jar.error:[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
jars/openjdk-8.0.302/bcel-6.4.1.jar.error-Detected JDK Version: 1.8.0-302 is not in the allowed range 8.
--
jars/openjdk-8.0.302/bcel-6.5.0.jar.error:[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
jars/openjdk-8.0.302/bcel-6.5.0.jar.error-Detected JDK Version: 1.8.0-302 is not in the allowed range 8.
--
jars/openjdk-8.0.302/bcel-6.6.0.jar.error:[ERROR] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:

I'll try changing 8 to 1.8, etc.
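
A minimal sketch of that mapping, assuming the value passed to Maven is derived from the JDK's major version. The Enforcer message above presumably arises because the rule treats the bare value 8 as a minimum version and 1.8.0-302 sorts below it. `maven_java_version` and `maven_target_flag` are hypothetical helper names, not taken from the jcompile scripts:

```shell
# Hypothetical helper: turn a JDK major version into the version string
# handed to Maven. Releases up to 8 use the legacy "1.x" form.
maven_java_version() {
  if [ "$1" -le 8 ]; then
    echo "1.$1"
  else
    echo "$1"
  fi
}

# Build the command-line flag for a given JDK major version.
maven_target_flag() {
  echo "-Dmaven.compiler.target=$(maven_java_version "$1")"
}
```

For example, `maven_target_flag 8` yields `-Dmaven.compiler.target=1.8`, while `maven_target_flag 11` yields `-Dmaven.compiler.target=11`.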

wtwhite commented 4 months ago

Yes, that fixed it for building commons-codec-1.12 under JDK 8.0.372. Will change the rest.

wtwhite commented 4 months ago

A few minutes into rerunning and this seems to be working so far:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ ls -ltra jars/openjdk-8.0.372/*.jar
-rw-rw-r-- 1 wtwhite wtwhite 276342 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.7.jar
-rw-rw-r-- 1 wtwhite wtwhite 465722 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.7-tests.jar
-rw-rw-r-- 1 wtwhite wtwhite 285424 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.8.0.jar
-rw-rw-r-- 1 wtwhite wtwhite 488700 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.8.0-tests.jar
-rw-rw-r-- 1 wtwhite wtwhite 325260 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.9.0.jar
-rw-rw-r-- 1 wtwhite wtwhite 631219 Jul  2 14:29 jars/openjdk-8.0.372/commons-io-2.9.0-tests.jar

These failed earlier:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ ls -ltra runs/35_try_forcing_compiler_version_aborted/jars/openjdk-8.0.372/commons-io-2.[789]*jar.error
-rw-rw-r-- 1 wtwhite wtwhite 2602 Jul  2 12:49 runs/35_try_forcing_compiler_version_aborted/jars/openjdk-8.0.372/commons-io-2.7.jar.error
-rw-rw-r-- 1 wtwhite wtwhite 2604 Jul  2 12:49 runs/35_try_forcing_compiler_version_aborted/jars/openjdk-8.0.372/commons-io-2.8.0.jar.error
-rw-rw-r-- 1 wtwhite wtwhite 2604 Jul  2 12:49 runs/35_try_forcing_compiler_version_aborted/jars/openjdk-8.0.372/commons-io-2.9.0.jar.error

wtwhite commented 4 months ago

Still going strong:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.error'|wc -l
37
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.jar'|wc -l
279

wtwhite commented 4 months ago

It's very slow now:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.done'|wc -l
1434

I stopped the run around 7pm last evening, restarted it around 8:30pm and let it run overnight, but it still had not finished by 8:30am this morning: only 1434 of 1792 projects had been processed (either to a jar or to an error)! This used to take only 6 hours. Is all of the additional runtime from the 2 new calls to find ... | xargs detect-bytecode-features-OLDJEP181.pl ...? That seems unlikely, but it's the only change besides forcing the compiler version with -Dmaven.compiler.target=....
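
To keep an eye on a long run like this, the find ... | wc -l checks can be wrapped in a small progress helper. This is only a sketch: `count_files` and `run_progress` are illustrative names, and it assumes one *.done marker per processed project, as in the counts above:

```shell
# Count files below a directory whose names match a glob pattern.
count_files() {
  find "$1" -name "$2" | wc -l | tr -d ' '
}

# Print "processed/total (remaining left)" for a run directory.
# $1 = directory containing the *.done markers, $2 = total number of projects.
run_progress() {
  done_count=$(count_files "$1" '*.done')
  echo "$done_count/$2 ($(( $2 - done_count )) left)"
}
```

Called as `run_progress jars 1792`, it reports how many of the 1792 projects are still outstanding.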

Animal Sniffer problems with JDK11

Even though the run still has 1792-1434=358 projects to process, we already have far more errors than there were in the previous run:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.jar'|wc -l
806
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.error'|wc -l
935
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find runs/34_with_eq_from_dot_and_new_jep181/jars/EQ -name '*.error'|wc -l
406

After restarting the run again with time make -j 3, I noticed that at least some of these errors come from the Animal Sniffer Maven plugin looking for a signature with a nonsensical version number (java110): Failed to execute goal org.codehaus.mojo:animal-sniffer-maven-plugin:1.22:check (checkAPIcompatibility) on project bcel: Failed to obtain signature: org.codehaus.mojo.signature:java110:1.0:

[INFO] --- animal-sniffer:1.22:check (checkAPIcompatibility) @ bcel ---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java110:1.0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:25 min
[INFO] Finished at: 2024-07-02T22:20:59Z
[INFO] ------------------------------------------------------------------------
[WARNING] 
[WARNING] Plugin validation issues were detected in 10 plugin(s)
[WARNING] 
[WARNING]  * org.apache.maven.plugins:maven-site-plugin:3.12.1
[WARNING]  * org.apache.maven.plugins:maven-resources-plugin:3.2.0
[WARNING]  * org.apache.felix:maven-bundle-plugin:5.1.8
[WARNING]  * org.codehaus.mojo:build-helper-maven-plugin:3.3.0
[WARNING]  * org.codehaus.mojo:buildnumber-maven-plugin:3.0.0
[WARNING]  * org.apache.maven.plugins:maven-compiler-plugin:3.10.1
[WARNING]  * org.codehaus.mojo:animal-sniffer-maven-plugin:1.22
[WARNING]  * org.apache.maven.plugins:maven-enforcer-plugin:3.1.0
[WARNING]  * org.apache.rat:apache-rat-plugin:0.15
[WARNING]  * org.apache.maven.plugins:maven-remote-resources-plugin:1.7.0
[WARNING] 
[WARNING] For more or less details, use 'maven.plugin.validation' property with one of the values (case insensitive): [BRIEF, DEFAULT, VERBOSE]
[WARNING] 
[ERROR] Failed to execute goal org.codehaus.mojo:animal-sniffer-maven-plugin:1.22:check (checkAPIcompatibility) on project bcel: Failed to obtain signature: org.codehaus.mojo.signature:java110:1.0: The following artifacts could not be resolved: org.codehaus.mojo.signature:java110:signature:1.0 (absent): org.codehaus.mojo.signature:java110:signature:1.0 was not found in https://repo.maven.apache.org/maven2 during a previous attempt. This failure was cached in the local repository and resolution is not reattempted until the update interval of central has elapsed or updates are forced
[ERROR] 
[ERROR] Try downloading the file manually from the project website.
[ERROR] 
[ERROR] Then, install it using the command: 
[ERROR]     mvn install:install-file -DgroupId=org.codehaus.mojo.signature -DartifactId=java110 -Dversion=1.0 -Dpackaging=signature -Dfile=/path/to/file
[ERROR] 
[ERROR] Alternatively, if you host your own repository you can deploy the file there: 
[ERROR]     mvn deploy:deploy-file -DgroupId=org.codehaus.mojo.signature -DartifactId=java110 -Dversion=1.0 -Dpackaging=signature -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR] 
[ERROR] 
[ERROR]   org.codehaus.mojo.signature:java110:signature:1.0
[ERROR] 
[ERROR] from the specified remote repositories:
[ERROR]   apache.snapshots (https://repository.apache.org/snapshots, releases=false, snapshots=true),
[ERROR]   central (https://repo.maven.apac;34mINFO] Using 'UTF-8' encoding to copy filtered properties files.
[INFO] Copying 17 resources
[INFO] Copying 2 resources to META-INF
[INFO] 

Indeed, this error looks to be responsible for most of the build failures (and was not responsible for any in the previous run):

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep 'animal-sniffer.*Failed to obtain signature' jars/*/*.error|wc -l
435
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ grep 'animal-sniffer.*Failed to obtain signature' runs/34_with_eq_from_dot_and_new_jep181/jars/EQ/*/*.error|wc -l
0
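
Note that grep ... | wc -l counts matching lines rather than files. If the aim is to count failed builds, a per-file count may be slightly more robust, since a log that repeats the message is then counted only once. A sketch, with `count_error_files_matching` as an illustrative name:

```shell
# Count *.error files under a directory that contain a pattern at least once.
# $1 = grep pattern, $2 = directory to search.
# grep -l prints each matching file name once, however often the pattern occurs.
count_error_files_matching() {
  find "$2" -name '*.error' -exec grep -l -e "$1" {} + | wc -l | tr -d ' '
}
```

For example: `count_error_files_matching 'animal-sniffer.*Failed to obtain signature' jars`.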

Next question: Should we just turn Animal Sniffer off? @jensdietrich is this a good idea? I think so as it's basically a check, similar to running tests, which we already turn off. I'll look into how to do this next in any case.

wtwhite commented 4 months ago

Plan: Disable Animal Sniffer with -Danimal.sniffer.skip=true, get rid of AS-caused failures and continue the current run (time pressure). Also mention this in the paper.
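
A rough sketch of the "get rid of AS-caused failures" step, so that the continued run retries those builds. It rests on an assumption about file naming that should be checked against the Makefile: each foo.jar.error is taken to sit next to a marker named foo.jar.done. The function name is illustrative, and the run directory should be given without a trailing slash:

```shell
# Move the *.error logs caused by Animal Sniffer, plus their *.done markers,
# from run directory $1 into a parallel tree under $2, so that a later build
# treats those projects as not yet processed.
# ASSUMPTION: foo.jar.error sits next to a marker named foo.jar.done.
set_aside_sniffer_failures() {
  src="$1"
  dest="$2"
  find "$src" -name '*.error' \
      -exec grep -l 'animal-sniffer.*Failed to obtain signature' {} + |
  while IFS= read -r err; do
    rel="${err#"$src"/}"               # path relative to the run directory
    mkdir -p "$dest/$(dirname "$rel")"
    mv "$err" "$dest/$rel"
    marker="${err%.error}.done"
    if [ -e "$marker" ]; then
      mv "$marker" "$dest/${rel%.error}.done"
    fi
  done
}
```

After something like `set_aside_sniffer_failures jars animal_sniffer_failures_removed_from_run35`, the run can continue with -Danimal.sniffer.skip=true added to the Maven invocation.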

Removing just the AS-caused failures eliminates about half of the errors:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.done'|wc -l
1025
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.error'|wc -l
514
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find animal_sniffer_failures_removed_from_run35 -name '*.done'|wc -l
448
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find animal_sniffer_failures_removed_from_run35 -name '*.error'|wc -l
448

jensdietrich commented 4 months ago

As just discussed, we can disable animal-sniffer, similar to how we deal with tests, jrat, etc.

wtwhite commented 4 months ago

This marathon run finally completed. There are a lot more errors:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.done'|wc -l
1792
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.jar'|wc -l
2120
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find jars -name '*.error'|wc -l
601

Compared to last time:

wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find runs/34_with_eq_from_dot_and_new_jep181/jars/EQ -name '*.done'|wc -l
1792
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find runs/34_with_eq_from_dot_and_new_jep181/jars/EQ -name '*.jar'|wc -l
2492
wtwhite@wtwhite-vuw-vm:~/code/jcompile$ find runs/34_with_eq_from_dot_and_new_jep181/jars/EQ -name '*.error'|wc -l
406

wtwhite commented 4 months ago

Will merge this, and look at the issue of increased errors in a separate issue.

jensdietrich commented 4 months ago

Should this add up (#done = #jar + #error)? I guess the difference is multiple jars produced during the build?

I think if we go from 2,492 to 2,120 jars this is fine. We should have a look at new errors though. I assume that this is projects requiring a later JDK version (e.g. they use 11 or even 17 language features) and we are forcing them to compile with an earlier compiler version.

It would be good to confirm this, and perhaps do a light-weight analysis of the error messages.

For the paper, we will need to update those numbers in various places, and it is a bit of a mission to keep them consistent. Perhaps we can break this down into tasks:

  1. recreate EQ-TSV
  2. recreate NEQ1
  3. recreate NEQ2
  4. push all updates to datasets
  5. run scripts to produce data reported in tables 3 and 4
  6. update paper with new data
  7. upload dataset to pcloud
  8. update pcloud link in paper

@wtwhite ping me if you want to jump on zoom to discuss further, I am free

wtwhite commented 4 months ago

> Should this add up (#done = #jar + #error)? I guess the difference is multiple jars produced during the build?

Exactly @jensdietrich, #jar includes test jars. The "right" way to calculate the number of successful builds is #done - #error. So:
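
Spelled out as a small sketch with the counts reported above (the function name is only illustrative):

```shell
# Successful builds = finished projects minus projects that ended in an error.
# $1 = number of *.done markers, $2 = number of *.error files.
successful_builds() {
  echo $(( $1 - $2 ))
}

successful_builds 1792 601   # this run
successful_builds 1792 406   # run 34
```

That gives 1191 successful builds in this run against 1386 in run 34, a drop of 195.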

> I think if we go from 2,492 to 2,120 jars this is fine. We should have a look at new errors though. I assume that this is projects requiring a later JDK version (e.g. they use 11 or even 17 language features) and we are forcing them to compile with an earlier compiler version.
>
> It would be good to confirm this, and perhaps do a light-weight analysis of the error messages.

Agreed -- I'll look into this shortly. I'll also check for class files that "transition" from having no JEP181 NestMembers when built with JDK < 11 to having it when built with JDK >= 11 (the main reason for this latest run).
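
One possible shape for the light-weight error analysis, as a sketch only: tally the first [ERROR] line of every *.error file, with digits collapsed so that messages differing only in version numbers fall into the same bucket. The function name and the digit-collapsing heuristic are my own assumptions, not something already in jcompile:

```shell
# Print a frequency table of the first [ERROR] line of each *.error file
# under directory $1, most frequent first.
# Runs of digits are replaced by N, a rough heuristic for grouping messages
# that differ only in version numbers.
summarize_errors() {
  find "$1" -name '*.error' | while IFS= read -r f; do
    grep -m 1 -F '[ERROR]' "$f"
  done | sed 's/[0-9][0-9]*/N/g' | sort | uniq -c | sort -rn
}
```

Running `summarize_errors jars | head` should surface the dominant failure causes without reading individual logs.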

> For the paper, we will need to update those numbers in various places, this is a bit of a mission to keep this consistent. Perhaps we can break this down into tasks:
>
>   1. recreate EQ-TSV
>   2. recreate NEQ1
>   3. recreate NEQ2
>   4. push all updates to datasets
>   5. run scripts to produce data reported in tables 3 and 4
>   6. update paper with new data
>   7. upload dataset to pcloud
>   8. update pcloud link in paper

Will do. I have parts of this automated via make already, will try to automate remaining parts where it makes sense.

I had an idea for how to make paper updates less painful: what if, whenever we have a number that might change, instead of specifying it in place in experiments.tex or wherever, we write, say, \input{results/foo_count} there, and then have a script that we can run to (re)generate a tiny file results/foo_count.tex containing just that plain number from the actual data? This would even play nicely with make.
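
A minimal sketch of such a generator script. `write_result_tex` is a hypothetical name and foo_count is just the placeholder from above:

```shell
# Write a single value to <dir>/<name>.tex so that the paper can \input it.
# $1 = results directory, $2 = name (e.g. foo_count), $3 = value.
# The trailing % is there so that LaTeX does not turn the end of the line
# into a space after the number.
write_result_tex() {
  mkdir -p "$1"
  printf '%s%%\n' "$3" > "$1/$2.tex"
}
```

For example, `write_result_tex results foo_count "$(find jars -name '*.jar' | wc -l | tr -d ' ')"` would regenerate results/foo_count.tex from the current jars directory, and a make rule could call it whenever the data changes.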