wtwhite closed this 11 months ago
Generating the .tsv summaries of the JSON revapi outputs doesn't take long:
wtwhite@wtwhite-vuw-vm:~/code/jcompile/runs/31_run_30_with_test_jars_stripped$ time make MINSEVERITY=POTENTIALLY_BREAKING -f ../../Makefile.revapi
/home/wtwhite/code/jcompile/summarise-revapi-json-to-tsv.sh < jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.11__vs__commons-codec-1.12.revapi.POTENTIALLY_BREAKING.json > jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.11__vs__commons-codec-1.12.revapi.POTENTIALLY_BREAKING.tsv
/home/wtwhite/code/jcompile/summarise-revapi-json-to-tsv.sh < jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.12__vs__commons-codec-1.13.revapi.POTENTIALLY_BREAKING.json > jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.12__vs__commons-codec-1.13.revapi.POTENTIALLY_BREAKING.tsv
/home/wtwhite/code/jcompile/summarise-revapi-json-to-tsv.sh < jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.13__vs__commons-codec-1.14.revapi.POTENTIALLY_BREAKING.json > jars/ecj-3.11.1.v20150902-1521_openjdk-11.0.19/commons-codec-1.13__vs__commons-codec-1.14.revapi.POTENTIALLY_BREAKING.tsv
--snip--
real 0m41.686s
user 0m38.286s
sys 0m3.070s
wtwhite@wtwhite-vuw-vm:~/code/jcompile/runs/31_run_30_with_test_jars_stripped$ find . -name '*.revapi.POTENTIALLY_BREAKING.tsv'|wc -l
1414
They look reasonable. Lines can be duplicated many times, but this is not an issue:
wtwhite@wtwhite-vuw-vm:~/code/jcompile/runs/31_run_30_with_test_jars_stripped$ wc -l ./jars/openjdk-11.0.12/commons-configuration2-2.8.0-tests__vs__commons-configuration2-2.9.0-tests.revapi.POTENTIALLY_BREAKING.tsv
6273 ./jars/openjdk-11.0.12/commons-configuration2-2.8.0-tests__vs__commons-configuration2-2.9.0-tests.revapi.POTENTIALLY_BREAKING.tsv
wtwhite@wtwhite-vuw-vm:~/code/jcompile/runs/31_run_30_with_test_jars_stripped$ head !$
head ./jars/openjdk-11.0.12/commons-configuration2-2.8.0-tests__vs__commons-configuration2-2.9.0-tests.revapi.POTENTIALLY_BREAKING.tsv
org.apache.commons.configuration2.AbstractConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.AbstractConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.BaseConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.BaseConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.BaseHierarchicalConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.BaseHierarchicalConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.CompositeConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.CompositeConfiguration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.Configuration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
org.apache.commons.configuration2.Configuration POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
wtwhite@wtwhite-vuw-vm:~/code/jcompile/runs/31_run_30_with_test_jars_stripped$ uniq ./jars/openjdk-11.0.12/commons-configuration2-2.8.0-tests__vs__commons-configuration2-2.9.0-tests.revapi.POTENTIALLY_BREAKING.tsv|wc -l
257
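Note that plain `uniq` only collapses adjacent duplicate lines, so the count above relies on the rows already being grouped. A minimal sketch of an order-independent count follows; the function name `count_distinct_rows` is hypothetical and not part of the repo:

```shell
# Hypothetical helper (not part of jcompile): count the distinct rows in a
# TSV summary regardless of row order. `sort -u` removes all duplicates,
# whereas plain `uniq` only removes duplicates that are adjacent.
count_distinct_rows() {
    # tr strips the leading padding that some wc implementations emit.
    sort -u -- "$1" | wc -l | tr -d ' '
}
```

If the file is already grouped by class, as the `head` output suggests, this should give the same number as `uniq | wc -l`.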
Next step: Change PreprocessedJsonRevApiJarComparer to actually read these results.
What does POTENTIALLY_BREAKING mean here? AFAIK revapi reports look like this:
Old API: easycrud-1.0.0.jar
New API: easycrud-1.1.0.jar
old: field nz.ac.vuw.jenz.easycrud.PersistencyService.VERSION
new: field nz.ac.vuw.jenz.easycrud.PersistencyService.VERSION
java.field.constantValueChanged: Constant field changed value from '1.0.0' to '1.1.0'.
SEMANTIC: BREAKING, BINARY: NON_BREAKING, SOURCE: NON_BREAKING
So we could add columns as follows:
- old_location (example: field nz.ac.vuw.jenz.easycrud.PersistencyService.VERSION)
- new_location (example: field nz.ac.vuw.jenz.easycrud.PersistencyService.VERSION)
- change (example: java.field.constantValueChanged: Constant field changed value from '1.0.0' to '1.1.0'.)
- SEMANTIC_COMPATIBLE (boolean -- BREAKING means no, otherwise yes)
- BINARY_COMPATIBLE (boolean -- BREAKING means no, otherwise yes)
- SOURCE_COMPATIBLE (boolean -- BREAKING means no, otherwise yes)
SEMANTIC does not appear by default in results; if it is absent, set SEMANTIC_COMPATIBLE to true.
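The BREAKING-means-no rule can be sketched roughly as below. This assumes the 4-column layout seen in the .tsv summaries (class name followed by three severity columns); the function name `to_boolean_columns` is hypothetical.

```shell
# Hypothetical sketch: rewrite the three severity columns of a
# "class<TAB>sev<TAB>sev<TAB>sev" row as booleans.
# Rule from the discussion: BREAKING means not compatible ("false");
# anything else, including "-" or a missing column, means compatible ("true").
to_boolean_columns() {
    awk 'BEGIN { FS = OFS = "\t" }
         {
             # Assigning to a missing field creates it, so absent columns
             # default to "true".
             for (i = 2; i <= 4; i++) $i = ($i == "BREAKING") ? "false" : "true"
             print
         }'
}
```

Usage would be something like `to_boolean_columns < some.revapi.POTENTIALLY_BREAKING.tsv`. The same rule is applied to every severity column, so the sketch does not depend on the order of the source, binary and semantic columns.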
What does POTENTIALLY_BREAKING mean here?
@jensdietrich AFAICT that is just a (maybe new?) severity level that revapi gives to changes that are very unlikely to cause breakage, but potentially could. Their docs don't say much:
POTENTIALLY_BREAKING - the difference may break the API compatibility (of given type) under some specific circumstances
AFAIK revapi reports look like this:
I'm using their JSON report format, which contains the same info but is easier to parse.
change (example: java.field.constantValueChanged: Constant field changed value from '1.0.0' to '1.1.0'.)
Question: We can get multiple results per class (even more than one per method) -- how to combine them? Maybe just (deterministically) choose a single representative one? Or skip this column altogether?
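One possible way to combine them deterministically, for the severity columns at least, is to keep the worst severity seen per class in each column. This is only a sketch of one option, not a decision. The ranking EQUIVALENT < NON_BREAKING < POTENTIALLY_BREAKING < BREAKING is my assumption about how revapi orders severities, with "-" (no data) ranked lowest, and `worst_severity_per_class` is a hypothetical name.

```shell
# Hypothetical sketch: collapse multiple "class<TAB>sev<TAB>sev<TAB>sev" rows
# per class into one row holding the worst severity seen in each column.
# ASSUMPTION: severity ranking as below; "-" means no data and ranks lowest.
worst_severity_per_class() {
    awk 'BEGIN {
             FS = OFS = "\t"
             rank["-"] = 0; rank["EQUIVALENT"] = 1; rank["NON_BREAKING"] = 2
             rank["POTENTIALLY_BREAKING"] = 3; rank["BREAKING"] = 4
         }
         {
             # Remember first-seen order so the output is deterministic.
             if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
             for (i = 2; i <= 4; i++) {
                 key = $1 SUBSEP i
                 if (!(key in worst) || rank[$i] > rank[worst[key]]) worst[key] = $i
             }
         }
         END {
             for (j = 1; j <= n; j++) {
                 c = order[j]
                 print c, worst[c SUBSEP 2], worst[c SUBSEP 3], worst[c SUBSEP 4]
             }
         }'
}
```

This does not help with the free-text change column, where picking a single representative or dropping the column would still need to be decided.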
SEMANTIC_COMPATIBLE (boolean -- BREAKING means no, otherwise yes) BINARY_COMPATIBLE (boolean -- BREAKING means no, otherwise yes) SOURCE_COMPATIBLE (boolean -- BREAKING means no, otherwise yes)
SEMANTIC does not appear by default in results; if it is absent, set SEMANTIC_COMPATIBLE to true.
Sounds good, will do.
Ran on a small test dataset:
wtwhite@wtwhite-vuw-vm:~/code/jcompile/oracle-construction$ time java -cp target/jcompile.jar nz.ac.wgtn.shadedetector.jcompile.oracles.AdjacentVersionSameArtifactAndCompilerClassOracle fixed_small_jars_with_tests > fixed_small_jars_with_tests_AdjacentVersionSameArtifactAndCompiler_perpairrealdata.txt
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.0.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.1.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.1.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.5.0.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.5.0.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.6.0.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.6.0.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.6.1.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.6.1.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.7.0.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.0-tests.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.1-tests.jar
analysing: fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.4.1-tests.jar vs fixed_small_jars_with_tests/openjdk-11.0.19/bcel-6.5.0-tests.jar
--snip--
real 0m53.633s
user 1m3.346s
sys 0m5.960s
Results look good, but are dominated by classes for which revapi reported no information:
wtwhite@wtwhite-vuw-vm:~/code/jcompile/oracle-construction$ cut -f27- < fixed_small_jars_with_tests_AdjacentVersionSameArtifactAndCompiler_perpairrealdata.txt | sort|uniq -c
18570 - - -
386 BREAKING BREAKING -
3 BREAKING BREAKING EQUIVALENT
421 BREAKING BREAKING POTENTIALLY_BREAKING
323 BREAKING NON_BREAKING -
345 BREAKING NON_BREAKING POTENTIALLY_BREAKING
13 BREAKING POTENTIALLY_BREAKING -
18 BREAKING POTENTIALLY_BREAKING POTENTIALLY_BREAKING
6 EQUIVALENT EQUIVALENT BREAKING
1531 EQUIVALENT EQUIVALENT POTENTIALLY_BREAKING
27 NON_BREAKING BREAKING -
17 NON_BREAKING NON_BREAKING BREAKING
11 NON_BREAKING NON_BREAKING POTENTIALLY_BREAKING
6 POTENTIALLY_BREAKING BREAKING -
28 POTENTIALLY_BREAKING EQUIVALENT -
1 POTENTIALLY_BREAKING EQUIVALENT POTENTIALLY_BREAKING
42 POTENTIALLY_BREAKING NON_BREAKING -
141 POTENTIALLY_BREAKING POTENTIALLY_BREAKING -
2 POTENTIALLY_BREAKING POTENTIALLY_BREAKING BREAKING
100 POTENTIALLY_BREAKING POTENTIALLY_BREAKING POTENTIALLY_BREAKING
1 source_compatibility binary_compatibility semantic_compatibility
Showing compatibility data as boolean BREAKING-or-not columns, as suggested by @jensdietrich, on the test dataset:
wtwhite@wtwhite-vuw-vm:~/code/jcompile/oracle-construction$ cut -f27- < fixed_small_jars_with_tests_AdjacentVersionSameArtifactAndCompiler_perpairrealdata_boolcompat.txt | sort|uniq -c
810 false false true
699 false true true
1 source_compatible binary_compatible semantic_compatible
33 true false true
25 true true false
20424 true true true
Idea: Rather than eagerly discard rows from NEQ1 for which revapi reports no breaking change, initially let's just add extra columns describing the revapi results. This allows end-users to decide which rows they are interested in using.
Thoughts @jensdietrich?
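As a rough illustration of how an end-user could then select rows themselves, here is a sketch that keeps the header plus any row where at least one of the boolean columns is false. It assumes the three columns sit at positions 27 to 29 (as the `cut -f27-` output above suggests) and that the header is the first line of the file; `rows_with_breaking_change` is a hypothetical name.

```shell
# Hypothetical sketch: print the header line plus every row where revapi
# flagged at least one kind of incompatibility.
# ASSUMPTIONS: tab-separated input, header on line 1, and the boolean
# source/binary/semantic columns at positions 27, 28 and 29.
rows_with_breaking_change() {
    awk 'BEGIN { FS = "\t" }
         NR == 1 || $27 == "false" || $28 == "false" || $29 == "false"' "$1"
}
```

Example usage: `rows_with_breaking_change fixed_small_jars_with_tests_AdjacentVersionSameArtifactAndCompiler_perpairrealdata_boolcompat.txt`. Rows that are compatible in all three columns stay in the file itself, so nothing is discarded up front.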