typetools / checker-framework

Pluggable type-checking for Java
http://checkerframework.org/

wpi.sh is broken at multiple levels #4693

Open joachim-durchholz-six opened 3 years ago

joachim-durchholz-six commented 3 years ago

When trying to run wpi.sh, I hit the following problems, in order:

- The .plume-scripts subdirectory is missing.
- It requires Python 3 now, not Python 2.7 as documented.
- It fails with a .mvnw-based build, at least if there's no mvn command installed. I'm not sure what exactly happens; it seems that wpi.sh detects the .mvnw but dljc does not accept that, or vice versa.
- wpi.sh does not emit an error message if the build command uses ./.mvnw. All you get is the usage information. This seems to apply to wrong parameter values in general (I ran out of time diagnosing it).
- After aliasing mvn to ./.mvnw, it worked but I got the last traceback logged below.

It seems like WPI is in a state of bitrot, and not currently usable; I tried to make it work but ran out of time, and cannot even do the analysis required to make a repeatable error report. (Sorry for that, feel free to close this report if you cannot work with it.)

Please find below two of the commands I tried and the associated error messages.

$ (export JAVA_HOME=target/downloads/zulu11.41.24-sa-jdk11.0.8-win_i686; export CHECKERFRAMEWORK=checker-framework-3.13.0; $CHECKERFRAMEWORK/checker/bin/wpi.sh -- --checker guieffect)
Starting wpi.sh. The output of this script is purely informational.
checker-framework-3.13.0/checker/bin/wpi.sh: line 217: ./gradlew: No such file or directory
In directory /cygdrive/c/Users/tkd6u/projects/id-gui :
(cd /cygdrive/c/Users/tkd6u/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac && git pull -q)
Done in directory /cygdrive/c/Users/tkd6u/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac : 9bc789deeea23183f69688bb9ee74403ad366355
Finished configuring wpi.sh.
=== DLJC standard out/err follows: ===
WORKING DIR: /cygdrive/c/Users/tkd6u/projects/id-gui
JAVA_HOME: target/downloads/zulu11.41.24-sa-jdk11.0.8-win_i686
PATH: target/downloads/zulu11.41.24-sa-jdk11.0.8-win_i686/bin:/cygdrive/c/Program Files/Zulu/zulu-11/bin:/cygdrive/c/develop/software:/usr/local/bin:/usr/bin
DLJC_CMD: /cygdrive/c/Users/tkd6u/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/dljc -t wpi --jdkVersion 11 '--checker' 'guieffect' -- ./mvnw clean compile -Djava.home=target/downloads/zulu11.41.24-sa-jdk11.0.8-win_i686
usage: dljc [-o <directory>] [--log_to_stderr] [-t <tool>]
            [--timeout <seconds>] [--guess] [--quiet] [--cache] [-c <checker>]
            [--stubs <stubs>] [-l <lib_dir>] [--jdkVersion <jdkVersion>]
            [--quals <quals>] [--extraJavacArgs <extraJavacArgs>]
            [-s <solver>] [-afud <afud>] [-m <mode>]
            [-solverArgs <solverArgs>] [-cfArgs <cfArgs>]
            [--graph-jar <graphtool-jar>] [-X] [-- <cmd>]

global arguments:
  -o <directory>, --out <directory>
                        The directory to log results.
  --log_to_stderr       Redirect log messages to stderr instead of log file
  -t <tool>, --tool <tool>
                        A comma separated list of tools to run. Valid tools:
                        checker, wpi, inference, print, randoop, randoop_old,
                        bixie, graphtool, chicory, dyntrace, dyntracecounts
  --timeout <seconds>   The maximum time to run any subcommand.
  --guess               Guess source files if not present in build output.
  --quiet               Suppress output from subcommands.
  --cache               Use the dljc cache (if available)
  -c <checker>, --checker <checker>
                        A checker to check (for checker/inference tools)
  --stubs <stubs>       Location of stub files to use for the Checker
                        Framework
  -l <lib_dir>, --lib <lib_dir>
                        Library directory with JARs for tools that need them.
  --jdkVersion <jdkVersion>
                        Version of the JDK to use with the Checker Framework.
  --quals <quals>       Path to custom annotations to put on the classpath
                        when using the Checker Framework.
  --extraJavacArgs <extraJavacArgs>
                        List of extra arguments to pass to javac when running
                        a Checker Framework checker. Use this for arguments
                        that are only needed when running a checker, such as
                        -AassumeSideEffectFree.

inference tool arguments:
  -s <solver>, --solver <solver>
                        solver to use on constraints
  -afud <afud>, --afuOutputDir <afud>
                        Annotation File Utilities output directory
  -m <mode>, --mode <mode>
                        Modes of operation: TYPECHECK, INFER,
                        ROUNDTRIP,ROUNDTRIP_TYPECHECK
  -solverArgs <solverArgs>, --solverArgs <solverArgs>
                        arguments for solver
  -cfArgs <cfArgs>, --cfArgs <cfArgs>
                        arguments for checker framework

graphtool arguments:
  --graph-jar <graphtool-jar>
                        Path to prog2dfg.jar or apilearner.jar

dyntrace arguments:
  -X, --daikon-xml      Have Daikon emit XML

supported compiler/build-system commands:
  -- <cmd>              Command to run the compiler/build-system. Supported
                        build commands: ant, gradle, gradlew, javac, mvn
=== End of DLJC standard out/err.  ===
dljc failed
dljc output is in /cygdrive/c/Users/tkd6u/projects/id-gui/dljc-out/
stdout is in /cygdrive/c/Users/tkd6u/projects/id-gui/dljc-out/dljc-stdout-20210601-131700-kGG
dljc could not run the build successfully.
Check the log files in /cygdrive/c/Users/tkd6u/projects/id-gui/dljc-out/ for diagnostics.
Exiting wpi.sh.

(2)

$  /cygdrive/c/Users/tkd6u/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/dljc -t wpi --jdkVersion 11 '--checker' 'guieffect' -- mvn compile -Djava.home=target/downloads/zulu11.41.24-sa-jdk11.0.8-win_i686
Traceback (most recent call last):
  File "/cygdrive/c/Users/tkd6u/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/dljc", line 5, in <module>
    do_like_javac.run()
  File "/cygdrive/c/develop/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/do_like_javac/command.py", line 20, in main
    result = cache.retrieve(cmd, args, capturer)
  File "/cygdrive/c/develop/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/do_like_javac/cache.py", line 11, in retrieve
    result = capturer.gen_instance(cmd, args).capture()
  File "/cygdrive/c/develop/projects/id-gui/checker-framework-3.13.0/checker/bin/.do-like-javac/do_like_javac/capture/generic.py", line 68, in capture
    if result['return_code'] != 0:
KeyError: 'return_code'
kelloggm commented 3 years ago

Hi @joachim-durchholz-six, I've been working on wpi.sh and dljc for the past couple of weeks to try to make them more reliable, and some of the problems you encountered were caused by my work being in an incomplete state. Sorry for the inconvenience.

The .plume-scripts subdirectory is missing.

Can you elaborate on this problem a bit more? wpi.sh should create this directory automatically if it doesn't already exist.

It requires Python 3 now, not Python 2.7 as documented.

This is my fault - I switched DLJC to python3 in https://github.com/kelloggm/do-like-javac/pull/22, which was merged 5 days ago, and forgot to update the documentation in the CF itself. I'll open a PR to fix this. Sorry about that!

It fails with a .mvnw-based build, at least if there's no mvn command installed. I'm not sure what exactly happens, it seems that wpi.sh detects the .mvnw but dljc does not accept that, or vice versa.

I'll look into this. wpi.sh is supposed to handle builds with mvnw, but I don't think we have tests for that (yet).

wpi.sh does not emit an error message if the build command uses ./.mvnw. All you get is the usage information. This seems to apply to wrong parameter values in general (I ran out of time diagnosing it).

I've been trying to improve this, but yes, that's the general failure case. Sorry it's annoying - I agree!

After aliasing mvn to ./.mvnw, it worked but I got the last traceback logged below.

This traceback is concerning. If the target program is open-source, I can try debugging this myself if you provide a link to it.

It seems like WPI is in a state of bitrot, and not currently usable; I tried to make it work but ran out of time, and cannot even do the analysis required to make a repeatable error report.

Thank you for the effort you did put in - it's more than I would have expected. As for WPI's current state: yes, it is still somewhat bitrotted. We've been trying to improve it gradually for the past year or so; the wpi.sh script exists precisely because of this effort, as a replacement for infer-and-annotate.sh (which I noticed you also tried and ran into problems with). But, as you've seen, it's still finicky. Thanks for helping us improve it.

joachim-durchholz-six commented 3 years ago

It's great to see people tending to the reports :-)

Maybe I complained too early about .plume-scripts. I had trouble initially, so I started reading the shell scripts; maybe I was too proactive. What about creating a .plume-scripts.created-by-tool file that explains the status of that directory, so people have a chance to catch on? The file could reference the code that creates the directory, so people can pick up the background; or maybe just a blanket "This directory is created by a script, run grep -r plume-scripts . in the project's root directory to find out which", so you don't have to remember to update the file contents whenever you refactor your code ;-)
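The marker-file suggestion above could look something like this (a hypothetical sketch only; wpi.sh has no such marker file today, and the filename and wording are illustrative):

```shell
# Create the directory and drop a marker file next to it explaining
# where it came from (hypothetical; not part of the actual wpi.sh).
mkdir -p .plume-scripts
cat > .plume-scripts.created-by-tool <<'EOF'
This directory is created by a script. Run
    grep -r plume-scripts .
in the project's root directory to find out which one.
EOF
```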

The program is unfortunately of the very, very proprietary, trade-secret type, so I can't give anything out. It's also around 180 kLoC, so I'll have a hard time providing a good example. One thing you can do, though, is to catch exceptions and wrap them. E.g., what I typically do in Java (I'm aware it's Python, but my Python is too rusty to just write down an example):

public void do_something_with_file(String filename) {
    try {
        ....
    } catch (Exception ex) {
        throw new RuntimeException("Error with file: " + filename, ex);
    }
}
Whenever I see a log with too little context in the exception stack trace, I look at the functions in the stack, and whenever one of those functions has context I wish I'd see in the trace, I apply the above try-catch block. (In Python, it would even be possible to re-raise the same type of exception as the one you're catching, but that's an extra hoop to jump through even in Python, so you may decide that just throwing a generic exception is fine for your scripts.)
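For reference, a rough Python equivalent of the Java pattern above (all names here are hypothetical; process_file stands in for the real work, and "raise ... from" is what chains the original exception into the traceback):

```python
def process_file(filename):
    # Stand-in for the real work; fails to demonstrate the wrapping.
    raise ValueError("bad header")

def do_something_with_file(filename):
    try:
        process_file(filename)
    except Exception as ex:
        # "from ex" chains the original exception, so the traceback
        # shows both the added context and the root cause.
        raise RuntimeError(f"Error with file: {filename}") from ex
```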

Just FYI: Right now, I'm spending roughly 2 weeks of company time budget on making CF work for that 180 kLoC application. Afterwards, I'll have roughly 3 months in which I can use only private time, then (maybe) 1-2 more weeks to continue on CF. Just so you don't get caught off-guard when I suddenly can't help much anymore ;-)

kelloggm commented 3 years ago

Maybe I was complaining too early about .plume-scripts. I had trouble initially, so I started reading shell scripts, maybe I was too proactive.

No worries. The .plume-scripts directory is created by the getPlumeScripts gradle task, which wpi.sh calls. Several parts of the Checker Framework's build scripts depend on these scripts, so everything goes through that task. If you encountered a situation where one of those scripts was called by wpi.sh before that directory had been created, I'd want to know the error message and where in the script the failure occurred (that gradle task should already have been invoked at that point, but apparently it wasn't).

The program is unfortunately of the very, very proprietary, trade-secret type, so I can't give anything out, unfortunately.

That's too bad, but not too surprising :) We'll try to debug with what you can share in the open.

One thing you can do though is to catch exceptions and wrap them

Agreed. Part of the issue with the python code is that it hasn't been field-tested the way that the Checker Framework itself has, so running on a new program commonly leads us to a new place where there ought to be error-handling code.
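For instance, the KeyError: 'return_code' traceback earlier in this thread could be turned into an actionable message with a guard along these lines (a sketch only, not the actual dljc code; check_result and its message are hypothetical):

```python
def check_result(result):
    # result is the dict built from the captured build run. If the build
    # never produced a return code, fail with context instead of a bare
    # KeyError deep inside the capture logic.
    if 'return_code' not in result:
        raise RuntimeError(
            f"capture produced no 'return_code'; keys present: {sorted(result)}")
    return result['return_code'] != 0
```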

Just FYI: Right now, I'm spending roughly 2 weeks of company time budget on making CF work for that 180 kLoC application.

Given the status of WPI, I'd recommend annotating this program by hand rather than trying to run inference. In my experience, WPI is most useful when you have many programs that you need to annotate quickly, but it's okay if some of them are just thrown away. For example, I've found it very useful when analyzing a lot of open-source software to see how a new checker performs, or when a checker is being used as a replacement for an existing unsound heuristic static analysis on a large set of code bases at a company. When I need to analyze a single project, I always prefer to write the annotations by hand.

kelloggm commented 3 years ago

As an update on this, I've investigated and fixed one of the issues described herein: