leibnitz27 / cfr

This is the public repository for the CFR Java decompiler
https://www.benf.org/other/cfr
MIT License
1.93k stars 249 forks source link

Add initial decompilation test proof of concept #288

Closed Marcono1234 closed 2 years ago

Marcono1234 commented 2 years ago

Proof of concept for performing automatic decompilation tests. This should hopefully help with performing regression tests and to get some (indirect) test coverage. The code for the test could probably be improved and cleaned up a bit more, but it seems to work for this proof of concept.

Overview:

There are however still a few open questions / issues:

Any feedback is appreciated! (even if you don't want to integrate this or something similar)

Marcono1234 commented 2 years ago

@leibnitz27, what do you think about this approach? Do you have any requests for improvements or changes? The current test failure is 'intentional' to showcase how unexpected decompilation results would look like.

Before I address any of the open points in the description it would be good to get your general opinion on this idea.

leibnitz27 commented 2 years ago

Hey - sorry not been able to focus much on this project for a bit, trying to get back into it now; will go over this at the weekend - cheers!

leibnitz27 commented 2 years ago

first blush - this seems very cool \o/ - I will probably be able to answer these questions myself after reading your changes/commentary more thoroughly

why is expected output in the main repo - doesn't that naturally belong better in the submodule?

expected output seems to only have 1 version - it's pretty important to compile the increasing subsets of the tests on increasing versions of java (so you handle both the stringbuilder and the j9+ way of doing strings, for example).

Most environment data can be elided - for example the version number is currently dropped from output in existing crappy way of doing tests.. ;)

Mostly happy with the idea of checking binary class files in to avoid making tests more painful, but it does mean that a lot of spurious changes will occur frequently - I may have missed if there's a step to create these as part of a workflow? If so, how to do it once for each version of javac? (also not generally feasible I suspect as github is unlikely to expose old/crazy versions of javac).

more anon....

Marcono1234 commented 2 years ago

why is expected output in the main repo - doesn't that naturally belong better in the submodule?

There are the following reasons:

expected output seems to only have 1 version - it's pretty important to compile the increasing subsets of the tests on increasing versions of java

With this pull request you would probably implement this by having version specific subdirectories in the test resources repository, e.g. classes/9/StringConcat.class. However, these .class files would have to be compiled manually, and should probably only created where they are interesting. E.g. when the way javac compiles specific code has not changed much or at all between versions, then there is probably no point in having the .class files for different versions because that would only slow down tests without adding much value.

I may have missed if there's a step to create these as part of a workflow?

No, currently I have no plans for creating the .class files in an automated way. However, the .class and .jar files would probably only be a subset of https://github.com/leibnitz27/cfr_tests (because there is some duplication in cfr_tests). This decompilation test approach is mainly intended to give some confidence that ones changes aren't breaking major parts of CFR, and to see their effects. It won't be a complete replacement for cfr_tests.

leibnitz27 commented 2 years ago

BTW I'm definitely sold on this, it's great.

  • Changes to CFR might cause test failures due to different decompiled output. Having to first adapt it in the separate test resources repository, wait for it to be merged and then update the CFR pull request would slow down integration of changes.
    • The expected output is specific to CFR. If the separate test resources repository is intended to be decompiler agnostic, then that specific output should probably not be contained there.

Yeah, both really good reasons.

However, these .class files would have to be compiled manually

At this point it feels like the cfr-tests repo should just get a compiled version of all the tests (pretty much the result of running the batch file in there atm); leaves the minor downside of churn as I said but... eh ;) \

No, currently I have no plans for creating the .class files in an automated way. ... It won't be a complete replacement for cfr_tests.

I'm a bit nervous about having a curated set of tests as the 'real' tests tend to end up being these, perhaps a curated exclusion list to ignore known (semi) pointless ones, or the horrifically expensive ones (the horrible exception test is nice, but always annoys me). (or maybe an inclusion list to start with, and flip to exclusion later).


I'll shove a full set of the compiled stuff onto cfr-tests master, then we can decide where we want to go - either you tweak this PR, or I cherry pick and tweak - up to you since you've done all the work ;)

Marcono1234 commented 2 years ago

Thanks for the feedback! I have implemented my TODO features listed in the description of this pull request and extended the documentation. It takes up the majority of the README now, so maybe it should be moved to a separate Markdown file instead?

For StructuredFakeDecompFailure I added a CFR option specifying whether to dump exception stack traces (dumpexceptionstacktrace); which is enabled by default but is disabled for these tests. Is that alright for you? Side note: This option might also be useful for different use cases, e.g. for projects which compile multiple versions of a JAR and diff the changes. For those projects avoiding diffs in the stack trace for decompilation failures when they change the way they invoke the CFR API (or when the call stack in CFR itself changes between versions) would be useful as well.

@ClassFileTestDataSource and @JarTestDataSource which are used in DecompilationTest now both support specifying glob patterns for class files or JARs to ignore. Hopefully that works fine for your intended usage. If not I can also try implementing an 'inclusion list' you proposed. Though in the end, an 'exclusion list' like the current one might be a bit less error-prone.

It looks like the Maven Surefire Plugin does not use the name value for the tests in the console output, which makes it difficult (or impossible) to tell for which class file or JAR a test failed. Not sure if that is a bug with the plugin, or if it has to be configured in some way. IDEs (at least Eclipse) show the test name just fine. Alternatively JUnit dynamic tests could be used, but I am not sure if changing the implementation is worth it at the moment.

Also note that the tests are compiled with Java 8. For Maven that seems to work without issues but IDEs might show spurious compilation errors because they think the test sources are compiled with Java 6 as well.

leibnitz27 commented 2 years ago

Hey - I've cherry picked your stuff here onto another branch and merged it with some small (and some less small) tweaks.

Because (you're right) ideally cfr-tests shouldn't be tied to cfr specifically ( in a wonderful ideal world, hah ), rather than drive which things get tested from the presence of a file, I've done it backwards, from a bit of (pretty badly implemented ;) ) xml config. (couldn't bring myself to add yaml here).

I've removed the jar stuff (for now) but will add back in, want to think about that a bit more in the presence of testing different module configurations.

Sorry for hacking your lovely code with grossness, at least with cherry picks it's clear which bits I mauled ;)

Marcono1234 commented 2 years ago

Sorry for hacking your lovely code with grossness, at least with cherry picks it's clear which bits I mauled ;)

It is fine; thank you for including it in the first place. Your XML test specification approach definitely works better for picking only specific classes to decompile. You might have to update the README though, it currently still refers to the separate .options file and does not mention the XML test specification file yet.