KeYProject / key

KeY Theorem Prover for Deductive Java Verification
https://key-project.org
Other
42 stars 24 forks source link

Lukas tracing #3491

Open lks9 opened 2 weeks ago

lks9 commented 2 weeks ago

Related Issue

This pull request addresses #.

Intended Change

This adds functionality to check a single control-flow trace in KeY, so only one path in the symbolic execution path.

The idea is to record the control flow while running normally. And then we use the recorded path as a precondition for any proof obligation.

For experimenting and debugging method contracts and similar, this might be very useful:

Backgound

I already had a previous approach using JML assume, see https://github.com/lks9/retroACexampleKeY/

The current implementation uses new taclet rules traceIf, traceElse (and the ones without else-branch).

Plan

Type of pull request

Ensuring quality

TODO:

Additional information and contact(s)

@unp1

The contributions within this pull request are licensed under GPLv2 (only) for inclusion in KeY.

wadoon commented 2 weeks ago

Do you really need to record the trace of the Java program? I guess it would be enough to capture the state of the input parameters (and class etc...) on the method entry. Then, the control flow should be deterministic. (And then you would be on track with our data minimization paper at ISOLA'22).

lks9 commented 2 weeks ago

@wadoon Thank you very much for the literature! Yes, I think that could work, too! I have not read the paper yet but I suppose it is not that easy to capture the state of all input parameters in Java? (For binary programs, something similar is done by the RR debugger, but this debugger is very system dependent and will not work at Java source code level.)

On the other hand, the instrumentation for Java is doable (see https://github.com/ProRunVis/ProRunVis based on the JavaParser). We only need to change the instrumentation that it only records a bit-trace of if/else-decisions (for C, see https://github.com/lks9/src-tracer). At least for the KeY Symposium I will have a working demo of every step, I promise!

What I also find interesting is to look how you achieve data minimization... I think control-flow tracing could also adhere that principle. But I still need to read the paper, thanks!

lks9 commented 2 weeks ago

In the meantime, you could just start KeY with: key.ui/examples/heap/pathValidation/quicksort.key

wadoon commented 2 weeks ago

I suppose it is not that easy to capture the state of all input parameters in Java?

Relatively easy. For POJO things, you can just use serialization or do it with reflection on your own. Of course, capturing and reproducing external resources (Files, ...) might be impossible.

On the other, what you need is a list of observer functions (path conditions), which evaluate the state of expression on the spot.

For the generation of Java code from JML code, we have functionality in JJBMC upon the jmltoolkit.

If you need to rewrite JML/Java program code I would recommend to look into jmltoolkit.

mi-ki commented 2 weeks ago

I think the two approaches are somewhat complementary, and I am very much looking forward to the talk at this year's KeY Symposium! For specific information-flow/fairness use-cases, @samysweb and I thought about extracting something along the lines of path conditions from information-flow proofs in KeY and this trace idea sounds very promising to me in that respect.

lks9 commented 2 weeks ago

Thank you very much for all the feedback! To be honest, I only set-up this DRAFT-PR to run the tests...

I suppose it is not that easy to capture the state of all input parameters in Java?

Relatively easy. For POJO things, you can just use serialization or do it with reflection on your own. Of course, capturing and reproducing external resources (Files, ...) might be impossible.

Well, only that all of this is already possible with the RR debugger!

Three arguments in favor of my approach against the POJO thing with reflection:

First, I think POJO would be to much of a restriction. KeY goes beyond POJO with modular specifications and there are some other approaches (e.g. context aware trace contracts or dependent assertions) that go in the opposite direction, but also beyond POJO.

Second, even with POJO this could be nice as it allows KeY's symbolic execution to follow a single path. What could be an alternative is to capture the inputs and than do that data minimization from your paper on the inputs, if this would be possible. But this needs to be implemented somehow, your original approach works by refactoring the code instead (if I grasped it correctly).

Third, retracing with KeY (or other symbolic execution tools) using recorded input constraints could be slower, because we still need to check for each if/else branching which branch is satisfiable (although we know that only one is satisfiable, because it is deterministic). With this DRAFT-PR, you get non-branching rules for if/else. It is also easier to navigate in KeY's proof tree when there is less branching (I also observed in comparison to my previous approach which used JML assume() instead of single-branch taclet rules).

On the other, what you need is a list of observer functions (path conditions), which evaluate the state of expression on the spot.

For every single if/else, while or for we need additional code (observer functions) to record either a 0 or a 1. But we do not record what is usually called path condition in symbolic execution: Perhaps you see the 0/1 bitstring as some sort of path condition, I suppose that would be somehow correct, but I would not call it that to avoid confusion. There is no (expensive) evaluation needed during recording.

If you need to rewrite JML/Java program code I would recommend to look into jmltoolkit.

Thanks!