K semantic REPL - Githubissues

Baltoli commented 1 year ago

Current Situation

The debugging tools available for K projects are limited. Currently, the following options are available to developers of K-based projects:

GDB integration with the LLVM backend:
- + Allows for fine-grained introspection of what the generated code is doing while a program executes.
- + Supports C++ hooks.
- - Prone to breaking when package versions change
- - Sensitive to compiler options
- - Doesn't support macOS
The KORE repl:
- + Allows proof state to be worked through interactively
- - Specific to proofs and the Haskell backend
Ad-hoc tracing & printing:
- + Flexible, easy to use.
- - Requires additional imports & recompilation
- - Inconsistent backend support

Proposal

We would like to implement a REPL (or a toolkit for building a REPL) for languages implemented in K. Doing so would allow developers to step through their code at the language (or semantics) level.

Requirements / Features

Initial dump of requirements per @sskeirik:

Ideally for a first cut we have a semantics level repl that lets us step through program execution as well as a program level repl (skips internal semantics rules).
A next step would be to make it easy to visualize only the interesting parts of the semantics state (EDIT: esp. useful for the program-level repl, which may not need to visualize the entire semantics state, though also useful for semantics-level debugging of particular semantics features).
A third step would be to make it easy to implement "break on line number" kind of functionality by letting the repl understand some generic location information: either semantics rule location or interpreter program location, depending on whether you are debugging the semantics or a program in the semantics. (EDIT: For the AlgoClarity project, break on program location functionality is what the client wants from us for the next engagement).

ehildenb commented 1 year ago

Let's just get the first version of k-repl implemented and merged which allows single K steps for now, then we can figure out the rest.

ehildenb commented 1 year ago

More features it should have:

From a current configuration, try applying some specific rule.
Pretty-printing modules able to be plugged into k-repl for displaying state (minimize which cells are displayed, omitting some cells, etc...)

ehildenb commented 1 year ago

We should also store the nodes (or allow storing the nodes) in a pyk.kcfg.KCFG.

Maybe we add a command store NODE_NAME which allows storing the current node in the KCFG, and defines an alias NODE_NAME for it?

Then the user can use show-cfg to display a textual CFG, and can select another node to jump to with select NODE_NAME.
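As a rough illustration of how store, select, and show-cfg could fit together, here is a minimal sketch. It deliberately does not use the real pyk.kcfg.KCFG API; the `ExplorationGraph` class and all of its methods are hypothetical stand-ins, and configurations are represented as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ExplorationGraph:
    """Hypothetical stand-in for an execution graph such as a KCFG."""

    nodes: Dict[int, str] = field(default_factory=dict)    # node id -> configuration text
    aliases: Dict[str, int] = field(default_factory=dict)  # alias -> node id
    current: Optional[int] = None

    def add_node(self, config: str) -> int:
        """Record a new configuration and make it the current node."""
        node_id = len(self.nodes)
        self.nodes[node_id] = config
        self.current = node_id
        return node_id

    def store(self, name: str) -> int:
        """`store NODE_NAME`: define an alias for the current node."""
        if self.current is None:
            raise ValueError("no current node to store")
        self.aliases[name] = self.current
        return self.current

    def select(self, name_or_id: str) -> int:
        """`select NODE_NAME`: jump to an aliased node or a numeric node id."""
        if name_or_id in self.aliases:
            node_id = self.aliases[name_or_id]
        elif name_or_id.isdigit() and int(name_or_id) in self.nodes:
            node_id = int(name_or_id)
        else:
            raise KeyError(f"unknown node: {name_or_id}")
        self.current = node_id
        return node_id

    def show_cfg(self) -> str:
        """`show-cfg`: list nodes, marking the current one with `*`."""
        by_node = {node_id: name for name, node_id in self.aliases.items()}
        lines = []
        for node_id in sorted(self.nodes):
            marker = "*" if node_id == self.current else " "
            alias = f" ({by_node[node_id]})" if node_id in by_node else ""
            lines.append(f"{marker} {node_id}{alias}")
        return "\n".join(lines)
```

Keeping aliases in a separate map from the nodes means a node can be renamed or re-aliased without touching the stored configurations.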

dwightguth commented 1 year ago

Some questions I want to answer. Some are probably very easy, others will take more time.

Are we set on python as the language to build this in?
If the idea is to interactively enter commands, what commands should we target to start with?
How can we engineer the code for handling commands so that it's extensible to new commands, both user supplied and developer written?
Do we want a command language and if so, what should its features be?
Do we want auto complete and other line editing support?
For each command we are planning initially, what should the parameters and output of the command be?
Do the tools already exist to implement each command? If not, what is missing?
How can we support scripting and composition of commands?
How does potentially executing multiple branches in parallel interact with the interactivity of entering commands?

I would feel better about getting started on implementing this if we had answers to these questions written down somewhere in a structured manner, whether on Google or in a single issue body.

ehildenb commented 1 year ago

The kit REPL that we used for the summarizer had these commands: https://github.com/runtimeverification/ksummarize/blob/master/kit-shell

We also have the commands the kore-repl uses. We should probably pick (roughly) a subset of the intersection of the commands of both.

I would propose:

show-cfg: Display an ascii rendering of the current exploration.
step [n] [NODE_ID]: single K step (symbolic or concrete, depending if there are variables in the configuration).
show|print [NODE_ID]: Display the current configuration, by default as pretty K, and as minimized as possible (with ... wherever possible). Add options for showing specific cells.
case-split CONDITION [NODE_ID]: From a given node, add the supplied constraint, and also add the node with the negation of the supplied constraint, to the graph.
check-implication NODE_ID_1 NODE_ID_2: Check if NODE_ID_1 is subsumed in NODE_ID_2, and give back a substitution and constraint that witnesses the subsumption if so.

I recommend these commands because they are what has been useful for the kit-shell for exploring proofs. From the kore-repl, the various things people have used are (i) stepping, (ii) selecting nodes, (iii) displaying the current explored graph structure, and (iv) displaying nodes. All of these functionalities are covered by the above commands.

I recommend pyk for implementation, because it already implements all of these functionalities, speaks both KAST and Kore, and has direct and fast communication with the backends. It also already has the KCFG data structure for storing and manipulating execution graphs, and it has the CTerm abstraction, which defines the operations that must be done quickly over states in order to not cause delay on the Python side. We can make sure that the various manipulations exposed by CTerm (such as CTerm.add_constraint) have fast and direct operations in the backends, so that we can do our executions and manipulations as much as possible purely in the backends.
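One possible way to make such a command set extensible is Python's standard `cmd` module, where each `do_<name>` method becomes a command. The sketch below is only an illustration under that assumption: the `KRepl` class, its `step_fn` backend callback, and the string-valued configurations are all hypothetical and are not part of pyk.

```python
import cmd
import io
from typing import Callable, List


class KRepl(cmd.Cmd):
    """Toy command loop; `step_fn` is a hypothetical placeholder for a backend."""

    prompt = "k-repl> "

    def __init__(self, initial: str, step_fn: Callable[[str], str], **kwargs):
        super().__init__(**kwargs)
        self.history: List[str] = [initial]  # configurations seen so far
        self.step_fn = step_fn

    def do_step(self, arg: str) -> None:
        """step [N]: take N steps from the current configuration (default 1)."""
        count = int(arg) if arg.strip() else 1
        for _ in range(count):
            self.history.append(self.step_fn(self.history[-1]))

    def do_show(self, arg: str) -> None:
        """show [NODE_ID]: print a configuration (default: the current one)."""
        index = int(arg) if arg.strip() else len(self.history) - 1
        self.stdout.write(self.history[index] + "\n")

    def do_EOF(self, arg: str) -> bool:
        """Exit the loop on end of input."""
        return True


class DepthRepl(KRepl):
    """A user-supplied extension: new commands are just new do_* methods."""

    def do_depth(self, arg: str) -> None:
        """depth: print how many steps have been taken."""
        self.stdout.write(f"{len(self.history) - 1}\n")


if __name__ == "__main__":
    # Toy backend: a "configuration" is a counter rendered as a string.
    out = io.StringIO()
    repl = DepthRepl("0", lambda c: str(int(c) + 1), stdout=out)
    for line in ["step 3", "show", "depth"]:
        repl.onecmd(line)
    print(out.getvalue(), end="")
```

Calling `repl.cmdloop()` instead of `onecmd` would give an interactive prompt with the module's built-in `help` command, driven by the method docstrings.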

sskeirik commented 1 year ago

@ehildenb's answer seems to be focused on symbolic execution; concrete debuggers typically have different priorities and I think it is worth highlighting those.

If the idea is to interactively enter commands, what commands should we target to start with?

I think the most important commands are (in order of most important to least important):

show [id] - pretty-print configuration of state id (defaults to current configuration)
step [step-count] - executes step-count steps
break <location> - execute semantic steps until location is reached (could be either an input program location or a K rule location --- depending on which one is being debugged)

(Click-to-expand) The commands in this drop-down are all non-essential, nice-to-haves

conditional-break [location] <predicate> - execute semantic steps until predicate holds at location (location defaults to anywhere)
rewind [step-count] - rewinds step-count steps (defaults to 1)
goto <id> - sets current configuration to the one with identifier id
update <cell-name> <cell-value> - updates the configuration to one identical to the previous configuration but with cell-name now with value cell-value

How can we engineer the code for handling commands so that it's extensible to new commands, both user supplied and developer written?

How can we support scripting and composition of commands?

I think that this requires developing the correct primitive operations that can be easily composed to produce higher level operations. There are a couple of primitives that would be really useful to have:

(a) given a configuration, pretty-print it (ideally can apply filter to display partial configurations) - needed for show
(b) given a configuration, perform n steps of semantics evaluation and return a new configuration - needed for step
(c) given a configuration, check if a #location K term corresponding to a particular line appears in a particular cell - needed for break on input program AST location
(d) given the current debug state, return the last applied semantics rule - needed for break on K rule
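To show how primitives (b), (c), and (d) might compose into break, here is a small sketch. Everything in it is an assumption made for illustration: `DebugState`, the dictionary representation of a configuration, the `step` callback, and the `at_rule` / `at_line` helpers are hypothetical and do not correspond to an existing K or pyk API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class DebugState:
    """Hypothetical debug state: a configuration plus the last applied rule."""

    config: dict                     # cell name -> contents (toy representation)
    last_rule: Optional[str] = None  # label of the rule that produced this state


StepFn = Callable[[DebugState], Optional[DebugState]]  # returns None when stuck
StopFn = Callable[[DebugState], bool]


def run_until(
    state: DebugState, step: StepFn, should_stop: StopFn, max_steps: int = 10_000
) -> Tuple[DebugState, int]:
    """Single-step until `should_stop` holds, execution is stuck, or the budget runs out."""
    taken = 0
    while taken < max_steps:
        nxt = step(state)
        if nxt is None:
            break
        state = nxt
        taken += 1
        if should_stop(state):
            break
    return state, taken


def at_rule(label: str) -> StopFn:
    """Breakpoint on a K rule: stop once the last applied rule has this label."""
    return lambda s: s.last_rule == label


def at_line(cell: str, line: int) -> StopFn:
    """Breakpoint on a program location: stop once `cell` holds this line number."""
    return lambda s: s.config.get(cell) == line


def toy_step(state: DebugState) -> Optional[DebugState]:
    """Toy semantics: increment `pc` until it reaches 5, tracking a fake line number."""
    pc = state.config["pc"]
    if pc >= 5:
        return None
    return DebugState({"pc": pc + 1, "loc": pc + 11}, last_rule=f"rule-{pc}")
```

For example, `run_until(DebugState({"pc": 0, "loc": 10}), toy_step, at_line("loc", 12))` stops after two steps. Single-stepping like this is simple but potentially slow, so it is best read as a first cut.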

(Click-to-expand) Implementation details for non-essential commands

(e) serializing a configuration to disk and deserializing it - useful for rewind/goto
(f) given a map of configuration variables, generate the corresponding initial configuration (or) given a map of cell values, generate the corresponding configuration - needed for update plus general programmatic use of debugger
(g) given a configuration and a generic predicate over the configuration, evaluate whether the predicate holds - needed for conditional-break

Of the items above, implementing the primitives needed for generic conditional-break is the hardest. I can think of two general strategies:

(i) develop a generic functional K interpreter (the subset of K with just function rules) that can: read the functional part of the language spec, dynamically load new functions (over specific sorts, cells, or the entire configuration), and dynamically execute those functions
(ii) a rewrite of (parts of) the K stdlib in the same language the debugger is written in (probably Python) so that we can write predicates in debugger lang to execute over a K configuration as represented in the debugger

While I don't think a generic conditional-break should be a short-term priority, I think this is a goal worth pursuing in the long-term.

How does potentially executing multiple branches in parallel interact with the interactivity of entering commands?

It seems like you either:

block until operations have completed and then return a prompt to the user
have a dynamic prompt (perhaps it indicates how many branches are pending) and allow for operations to work in a partial state, e.g., Ctrl+C to kill pending branches, a show command that displays all currently executed branches or possibly a snapshot of the state of currently executing branches

dwightguth commented 1 year ago

While all of these commands are potentially valuable, I want to remind everyone that what I'm trying to get us to agree on is a basic initial version of the tool. Please moderate your expectations because it's inevitable not everything from these lists will make it into the first version prototype.

sskeirik commented 1 year ago

To make the priorities for a concrete debugger clearer, I have added click-to-expand sections to my previous comment that hide non-essential commands and their implementation details. I also added two other primitives to my previous comment that are needed for break.

tothtamas28 commented 1 year ago

We would like to implement a REPL (or a toolkit for building a REPL) for languages implemented in K.

I think the focus should be on the second option. So the backends would expose a common debugging interface (with additional custom features that the particular backend supports) for which we can implement a client in Python. The simplest K REPL is then a small script that starts and initializes a Python interpreter.

In this setting, the following questions by @dwightguth would (to some extent) be addressed:

Are we set on python as the language to build this in?

For the client code, yes. The server interface has to be implemented by the backends.

Do we want a command language and if so, what should its features be? How can we support scripting and composition of commands? How can we engineer the code for handling commands so that it's extensible to new commands, both user supplied and developer written?

The scripting language is Python, with all the power (and complexity) it comes with.

Do we want auto complete and other line editing support?

The Python interpreter provides this out of the box.

Do the tools already exist to implement each command? If not, what is missing?

For the client, most of what we need for an MVP is already implemented in pyk.

How does potentially executing multiple branches in parallel interact with the interactivity of entering commands?

We can use asyncio or some other Python library to handle async calls in the client.
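A minimal sketch of what the asyncio approach might look like, including a prompt that reports how many branches are still pending. The `explore_branch` coroutine is a hypothetical stand-in for an awaitable backend call, and the prompt format is made up for illustration.

```python
import asyncio
from typing import Dict, List


async def explore_branch(name: str, steps: int) -> str:
    """Hypothetical branch exploration; each loop iteration stands in for one backend step."""
    for _ in range(steps):
        await asyncio.sleep(0)  # yield control, as an awaited RPC call would
    return f"{name}: finished after {steps} steps"


def prompt(tasks: Dict[str, "asyncio.Task[str]"]) -> str:
    """Build a dynamic prompt showing the number of unfinished branches."""
    pending = sum(not task.done() for task in tasks.values())
    return f"k-repl [{pending} pending]> "


async def demo(branches: Dict[str, int]) -> List[str]:
    """Start all branches concurrently and record the prompt before and after they finish."""
    tasks = {
        name: asyncio.create_task(explore_branch(name, steps))
        for name, steps in branches.items()
    }
    log = [prompt(tasks)]                # branches have been scheduled but not run yet
    results = await asyncio.gather(*tasks.values())
    log.append(prompt(tasks))            # every branch has completed
    return log + list(results)


if __name__ == "__main__":
    for line in asyncio.run(demo({"left": 2, "right": 3})):
        print(line)
```

Because each branch is an `asyncio.Task`, a command that kills pending branches could be built on `Task.cancel()`.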

The following questions regarding interface design would still be open:

If the idea is to interactively enter commands, what commands should we target to start with? For each command we are planning initially, what should the parameters and output of the command be?

ehildenb commented 1 year ago

Commands:

load PROGRAM (parse and load a program, initialize it as the first term)
step [N] (take K steps)
show [ID] (display a given configuration)
- kit aliases, uses hashes to identify configurations
- add-alias [alias-name|ID] etc. if we do this; default aliases for things like case splitting
- Minimise configurations - use Pyk to insert ... smartly by collapsing _Unused
- Symbolic has an easy heuristic for how to minimise, concrete perhaps state deltas?
show-diff ID1 ID2 - state delta w/ sensible default arguments (current, last printed)
show-cell [...cells]
select ID
step-to-branch
case-split ID Bool
check-implication
- build derived / automated features like run on top of this core abstraction
- kprove, search etc. are just strategies / scripts of the above commands.
redirect syntax command > file
show-cfg - visual, dot etc implemented by Pyk already
break
- Implement in terms of step-to-branch
- Note ~~possible~~ big inefficiency here - single-stepping
- Rule label OR AST location - ~~latter requires introspection of the K cell to look for #location~~
- We have the tools to implement, even if it's slow it's worth doing as a first cut
- Location-based breakpoints can be built at the semantic level as rules that trigger a branch, then look at a cell for location.
- Not dynamic, but would be faster at run-time
- Possible optimisation is to edit the configuration to add a breakpoint with Pyk, then send it back to the backend
- Custom printing - "show me a cell" "show me a minimal config with ..." - send to Pyk and use that implementation
- Slow, default impl: backend -> kore -> kast -> printed
- Arguments to show
Dynamic configuration update (e.g. change the value of a variable)
- Location breakpoint is basically this specialised for one particular cell w/ different UI
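As an illustration of the show-diff idea (a state delta between two configurations), here is a toy sketch. It assumes a configuration can be flattened into a map from cell names to pretty-printed contents, which is a simplification made for this example; the `show_diff` function and its output format are hypothetical.

```python
from typing import Dict

Config = Dict[str, str]  # cell name -> pretty-printed contents (toy representation)


def show_diff(old: Config, new: Config) -> str:
    """Render only the cells whose contents differ between two configurations."""
    lines = []
    for cell in sorted(set(old) | set(new)):
        before, after = old.get(cell), new.get(cell)
        if before == after:
            continue  # unchanged cells are omitted from the delta
        if before is not None:
            lines.append(f"- <{cell}> {before} </{cell}>")
        if after is not None:
            lines.append(f"+ <{cell}> {after} </{cell}>")
    return "\n".join(lines)


if __name__ == "__main__":
    before = {"k": "x = 1 ; y = 2 ;", "state": ".Map", "out": ".List"}
    after = {"k": "y = 2 ;", "state": "x |-> 1", "out": ".List"}
    print(show_diff(before, after))
```

The same "skip unchanged cells" rule could also serve as a minimisation heuristic for show in the concrete case.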

Notes:

Spend time using these tools by hand first to figure out how best we want to compose them. For example, going through the kprove cycle

Baltoli commented 1 year ago

https://github.com/runtimeverification/llvm-backend/blob/pybind/bin/k-repl

Note that the API as installed is slightly different to this demo as well.

Baltoli commented 1 year ago

Steps to take:

@dwightguth investigate REPL libraries, code structure from demo
@tothtamas28 pull in dependencies properly - RPC server + python bindings

ehildenb commented 1 year ago

Notes:

Working on data-layer for storing database of terms, and eventually KCFGs. Looking into MongoDB for this currently. Want data-layer separated from representation layer.
Separate repo for server infrastructure? Makes sense because adding in dependencies like MongoDB to pyk is too heavy. Will update cookiecutter python project template, and then make new repo.

Remember our users of k-repl:

Semantics level debugger (step, step, etc...)
Collaborative proving.
Visual interface which needs to retrieve data from the database, display it.
IDEs: set breakpoints in VSCode and step to those points, display some language specific thing.

Some example use-cases:

Alice sets up a proof on her machine, which spans 9 smart contracts. She begins working on contract C, and verifying the correctness there. Meanwhile, Bob begins on contract D, which calls contract C. Once Bob reaches the point in symbolic execution that C is being called, he wants to use the basic blocks already discovered by Alice (or the full verification results) to do the execution, instead of re-executing that code. They both should be able to store their progress/partial results in the same kserver instance, behind authentication, so they can both incrementally make progress on the proofs, and the server should eventually take care of recognizing when some cached proofs are reusable. For now, let Alice and Bob specify that relationship to the server as "proof D uses proof C, so keep the learned lemmas and axioms of C in context when working on D".
Charlie is writing a program in the IMP programming language, and something is not going right. He wants to set a breakpoint in VSCode debugger, and inspect the program state at that point. Charlie doesn't know about K or configurations or cells, he wants the information displayed in an IMP-intuitive way directly where VSCode normally displays debug information.
Darlene is checking Alice and Charlie's work on the verification project. She wants an overview of which proofs have been completed, and to be able to visualize Solidity-level information about each proof (step through it interactively in browser). She does understand how to interpret K results, and is happy filtering the output herself in browser by controlling which cells she sees. She examines a proof E which has not been started, and suddenly has a realization about a lemma that will immediately go through! She supplies the lemma in browser, and pushes a button, then walks away to get dinner. The next day, she wants to visualize the results of the proof, and when she pushes the button again to run it again, she doesn't want to have to re-enter the same lemma.

runtimeverification / k

K semantic REPL #2925
