Discussion: Interactive debugger interface

firelizzard18 commented 3 years ago

Proposal

Discuss introducing a mechanism for interactively debugging Yaegi scripts. For example, some script.go, I want some interface parallel to interp.EvalPath("script.go") that provides a way to step through the Go statements in script.go.

I am happy to develop the debugger myself, as part of this repo, or another traefik project, or a completely separate project. I would do this as a Debug Adapter Protocol implementation.

Background

I created the Go Notebook Kernel extension for Visual Studio Code. I would like to add the capability to debug Go notebook cells. Since the notebook kernel is driven by Yaegi, this would require interfaces for debugging Interpreter.Eval, so that I could implement a debug adapter for it. I have previously developed the VSCode Byebug Debugger, which relies on my implementation of the debug adapter protocol for Byebug.

Workarounds

N/A. Other than debugging Yaegi itself, there's no way to debug a Yaegi script.

mvertes commented 3 years ago

Thank you for your proposal @firelizzard18, it's a nice one. I think that a debugger can be integrated part of this repository and would be quite useful in REPL, or after a panic before abort, or even upon a signal in embedded mode. At the minimum, the yaegi core should be extended to set breakpoints, to go in step mode and to inspect scopes, and it would draw a clean boundary if necessary.

As long as no external dependencies (outside of standard library) are added, everything could be done directly in this repository. I havent looked at DAP yet, and do not know if it is reasonable to implement it this way, or have it in a separate project. It would be very nice if you could go further on that. I can also provide assistance for the low level part (breakpoints, steps, etc).

Thank you again for tackling this.

firelizzard18 commented 3 years ago

DAP is a set of types and a protocol for sending typed messages between a front-end (e.g. VSCode) and a back-end (aka adapter). There are a variety of ways to implement the back-end; this is what I would do:

internal/dap
- (Un)marshalling to/from the wire protocol (essentially JSON + headers)
- Go types mirroring the specification, made with a generator
internal/yaegi_dap
- The actual debug adapter
- This is mostly a translation layer in between DAP and the interpreter's debug interface
An additional subcommand of cmd/yaegi or a new command such as cmd/yaegi-dap
- Responsible for creating the communication channel between the front-end and back-end (unix/tcp/etc)

In and of itself, a DAP back-end does not constitute a debugger. Additionally, to use VSCode as a front-end, we would need to write an extension in TypeScript and/or JavaScript with a bit of glue code. The one I wrote for Byebug DAP (here) is about 250 LOC, and most of that is dealing with launching the executable. If the adapter supports named/unix pipes or TCP sockets, the extension implementation is very simple. I am happy to do that as another traefik project, or as my own project.

firelizzard18 commented 3 years ago

Here's an implementation of DAP en/decoding: https://github.com/traefik/yaegi/compare/master...firelizzard18:dap.

It's a lot of code because:

I built a tool that can produce types from a JSON schema. The generator uses itself to produce types for the JSON schema.
JSON schemas are meant for validation and for TypeScript, so there were some issues. I solved this by making a system to patch the schema after downloading it.
To patch the schemas, I wrote code that understands standard JSON patch files and can apply them.
Handling DAP involves some rather large case statements. Other than using a map instead of a case statement, there's no way around that if we want to use strongly typed values (and not just interface{}/map[string]interface{}/etc).
The DAP types themselves take ~900 LOC. I tuned to the generator to make the types easier to use, and to omit useless types.

I chose to use code generation because the DAP spec is large and changes relatively often (types and fields are added), so maintenance would be somewhat onerous without a generator.

By itself, this doesn't do anything except encode and decode DAP messages. Now that I have DAP types generated, I'll work on the actual debugger.

firelizzard18 commented 3 years ago

I implemented a very basic debug adapter on my branch https://github.com/traefik/yaegi/compare/master...firelizzard18:dap. The client side is a VSCode extension.

This minimal implementation can launch a Yaegi interpreter instance and capture stdout/stderr. Anything more will require changes to the interpreter. To test it:

Clone the VSCode extension
- Open it in VSCode
- Run yarn install
- Launch the "Run Extension" configuration

In the [Extension Development Host] window

Create a launch configuration:

{
"name": "Launch File",
"type": "yaegi",
"request": "launch",
"mode": "auto",
"program": "${file}",
"cwd": "${fileDirname}",
"showProtocolLog": true, // prints protocol messages to an output channel
}

Open a sample.go
Execute the "Launch File" configuration

firelizzard18 commented 3 years ago

I have something working that could be called debugging:

firelizzard18 commented 3 years ago

@mvertes A couple of notes:

I see three basic options for implementing the debug adapter:

Implement it outside of the interpreter - this requires the interpreter to expose interfaces for debugging
Implement it inside the interpreter, but with a custom interface, so the DAP logic can live outside - this custom interface would need to be as complex as the subset of DAP we want to implement
Implement it completely within the interpreter - this is the simplest solution, but it means the interpreter would have DAP-specific code - is that acceptable?

I chose (1). You can see what this looks like in interp/program.go and interp/interp.go in my branch. If (3) is acceptable, that would probably be better.

As far as how to actually debug code - from my cursory understanding of the interpreter, the simplest way to add debugging seems to be adding a hook to the loop at the end of runCfg. If I understand correctly, every executed statement is executed there, so adding a hook there can intercept everything. The way I added this hook, every statement sends an event to VSCode and then waits on a channel. When the user steps/continues in VSCode, the adapter writes to the channel.

I think we'll want to add debug info to frames and/or nodes, for a couple reasons. Compiling that debug information could be turned on or off by an interpreter option, to avoid the performance impact when not debugging.

I'm not sure what the best way to get location information (what line of code is the program stopped on). I made some rather disruptive changes to run.go and ops.go (via the generator) in order to get access to the node associated with a bltn in runCfg, since the node has position info. This works well for the current frame, but it doesn't provide any info on parent frames. We could add position information as debug info to the frame, but it would need to be updated as nodes are executed to track the current line.

Other debug info we need is variable names. I'm not sure if there is a way today to recover names that correspond to slots in the frame data, but I imagine it would be messy. We could instead add dataName []string or something to the debug info.

Right now, I'm stopping on every single node (except ones with no position). The debug adapter could handle breakpoints by mapping them to nodes, then checking the current node against that map every time the hook is called. But that adds overhead to every single operation. Another way of doing it would be finding a way to insert the debugging hook into the exec chain. If we keep the change to bltn to return the current node, we could attack breakpoints to the nodes themselves. That way, "Should I break here?" becomes a check on a property of the node, instead of some kind of data structure operation (like a map lookup).

If having the current node available in runCfg is desirable, it may be cleaner to change bltn to return only the node, and then call n.exec(f) in runCfg. If I recall correctly, in something like 99% of cases, exec(f) is returning n.exec of some node. The only case I recall as being different is select statements.

mvertes commented 3 years ago

You have already done a lot. That's great.

Regarding your modifications in runCfg, I understand that the access to the*node value which corresponds to the bltn is required. I see you use it to get the corresponding source code token.Pos. But changing the signature of bltn as you have done has a significant impact on performances, even when the debugger is not active. Adding a single indirection like returning a node to pass a node.bltn instead of directly a bltn can add a few % of overhead in compute intensive code.

What we can do instead is to have a function to retrieve the node from the bltn address. The performance cost of this function is not a concern in this case because it is used only in debugging sessions, and we can keep all the compiled code unchanged. We already use a similar trick to retrieve the context in recover, see https://github.com/traefik/yaegi/blob/c7fcfa8534c39f8dee6acf7f8b46b39b94435027/interp/run.go#L127

Regarding data name resolution, the best should be to lookup the scope symbol from the frame address and index, by scanning all symbols in the corresponding scope, in the tree rooted at interp.scope (again performance here is not a concern). I have to look at it, maybe there is some missing field in the existing structs to achieve this.

firelizzard18 commented 3 years ago

What we can do instead is to have a function to retrieve the node from the bltn address.

It didn't occur to me to use reflection to get around the limitations of func values. I reverted my changes to bltn and updated runCfg to use originalExecNode.

The performance cost of this function is not a concern in this case because it is used only in debugging sessions

I disagree that performance while debugging isn't a concern. If we are looking up a node from the bltn for every single statement exec (when debugging), a costly lookup could have a significant negative impact on the usability of debugging. That being said, in the vast majority of cases, I can quickly find the node by checking against n.tnext.exec and n.fnext.exec instead of walking the tree.

If we only need the node for setting breakpoints, we could avoid that by mapping a breakpoint on a node to reflect.ValueOf(n.exec).Pointer(). I don't know if a map lookup would perform better than the quick-path exec lookup I describe above. That may depend on the number of breakpoints.

firelizzard18 commented 3 years ago

In retrospect, I'm liking option (2) best. I've implemented a relatively simple interface in interp/debug.go that handles the low-level stuff, while the protocol itself is handled by internal/dbg/adapter.go. I have most of the basic debugger functionality working. This is the public debug interface in interp/:

type Debugger struct {}
type DebugOptions struct {}
type DebugEvent struct {}
type DebugFrame struct {}
type Breakpoint struct {}
type DebugStopReason int

const (
    debugRun DebugStopReason = iota
    DebugPause
    DebugBreak
    DebugEntry
    DebugStepInto
    DebugStepOver
    DebugStepOut
    DebugTerminate
)

// Debug sets up a debug session. Call Continue to start execution.
func (interp *Interpreter) Debug(ctx context.Context, prog *Program, events func(*DebugEvent), opts *DebugOptions) *Debugger

// Wait blocks until the program has completed executing.
func (dbg *Debugger) Wait() (reflect.Value, error)

// Continue unblocks the program. The program contiues executing until it hits a
// breakpoint, Interrupt is called, or the program exits.
func (dbg *Debugger) Continue()

// Step unblocks the program. The exact behavior depends on the reason.
func (dbg *Debugger) Step(reason DebugStopReason)

// Interrupt asks the program to stop. The exact behavior depends on the reason.
func (dbg *Debugger) Interrupt(reason DebugStopReason)

func (dbg *Debugger) SetBreakpoints(path string, bp []*Breakpoint) []*Breakpoint

// Reason returns the reason for the event
func (evt *DebugEvent) Reason() DebugStopReason

// FrameDepth returns the depth of the current frame
func (evt *DebugEvent) FrameDepth() int

func (evt *DebugEvent) Frames(start, end int) []*DebugFrame

func (evt *DebugEvent) Frame(n int) *DebugFrame

func (f *DebugFrame) Name() string

func (f *DebugFrame) Position()

func (f *DebugFrame) Variables() map[string]reflect.Value

firelizzard18 commented 3 years ago

@mvertes Do you have suggestions for handling breakpoints? VSCode sends a list of requested lines for each file. My current solution is to find the first node that A) has a position, B) has an exec function, and C) the position matches a requested line. That results in some oddities. As a user if I put a breakpoint on the second line of func main() { \n println(foo()) \n }, I would expect that breakpoint to get hit prior to entering foo(). However, my current implementation attaches the breakpoint to the first node in the AST on that line that it sees, which is println. So the breakpoint is hit on the call to println, after the call to foo returns.

mvertes commented 3 years ago

@firelizzard18 , to set the breakpoint correctly, once you get the node you want, which is println, you should use node.start field (also a *node, which correspond to the entry point of the AST subtree where you are located. In this case, node.start should point to call of foo, which is what you want. if I get it.

firelizzard18 commented 3 years ago

@mvertes The basic debugging experience is 90% there. Stepping, breakpoints, and the stack trace work as expected. But I still don't know how to recover variable names. You said, "the best should be to lookup the scope symbol from the frame address and index". It is not at all clear to me what that means. Do you mean the literal address of the frame, as in uintptr(unsafe.Pointer(f))? I've looked at interp.scopes and I don't see an obvious way to identify what scope belongs to a given frame.

mvertes commented 3 years ago

@firelizzard18 recovering the right symbol from a frame entry is indeed an open issue. I'm still pondering if a function to resolve the node pointer from a frame address is possible right now, or if the frame format needs to be changed to include node references per entry. I need to think more about it.

firelizzard18 commented 3 years ago

@mvertes I just finished updating the debugger to track and handle goroutines properly. Proper handling of scopes and variables are the only remaining fundamental feature. The debugger doesn't handle evaluating expressions (e.g. in the debug console), but I think that's a reasonable limitation for a first pass. Another complication, though less important than variable names, is deciding what should be included, because frames include more than just variables:

firelizzard18 commented 3 years ago

I came up with part of a solution, which involves exhaustively scanning all the scopes for a scope where ptrOf(node.types) == ptrOf(sc.types). It works somewhat, but it definitely needs better logic for which scope belongs to which frame.

firelizzard18 commented 3 years ago

setExec is a problem for breakpoints:

CFG
1. In the generator for node A, setExec is called on node B: next := setExec(n.tnext)
2. This call to setExec sets node B's exec to func(f *frame) bltn { return B.exec(f) }
3. Thus node A's next bltn is that function
4. At a later point, node B's exec is set to its final value
Breakpoints
1. The user requests a breakpoint at file.go:123
2. This sets node B's breakpoint field
Execution
1. Node A's exec is called - exec is now the bltn created in setExec
2. The debugger loop in runCfg attempts to locate the node associated with the new exec
3. Because this exec is no longer assigned to node.exec for any node, the node corresponding node cannot be located
4. Without knowing the node associated with the current exec, the breakpoint can't be hit

firelizzard18 commented 3 years ago

Adding a breakpoint after a loop is not working for me.

i := 0
for i := i; i < 10; i++ {
    fmt.Println("in blah", i)
}
fmt.Println(i)

If I add a breakpoint on the second call, it is hit in every loop iteration. So something about loops is breaking my 'get the node for this exec' logic.

00sapo commented 2 years ago

Is it possible to use this from terminal?

firelizzard18 commented 2 years ago

@00sapo No. #1225 adds debugging capability, but it does not add an kind of UI. AFAIK the only debugging UI for Yaegi is my yaegi-vscode extension, which uses my Yaegi DAP server, yaegi-debug-adapter.

Adding built-in support for debugging with the yaegi command is feasible. If that is important to you, please open a new issue. Theoretically you could use yaegi-debug-adapter with other DAP clients, though I have not tested that. AFAIK there is no CLI DAP client.

00sapo commented 2 years ago

What I actually need for now is to start an interpreter in a certain line of code to interact with the context in that point.

firelizzard18 commented 2 years ago

If you mean you want to set a breakpoint at a line and interact with the context, with the VSCode extension you can set a breakpoint and view the context at that point. However, just about all it can do right now is step through the code and view variables and the stack trace. The VSCode extension and DAP server are open to contributions. Unfortunately, I don't expect to be able to work on them much myself until my day job slows down (which may be a while).

traefik / yaegi