microsoft / debug-adapter-protocol

Defines a common protocol for debug adapters.
https://microsoft.github.io/debug-adapter-protocol/
Other
1.43k stars 131 forks source link

Should `memoryReference` reuse be explicitly suggested? #238

Open connor4312 opened 2 years ago

connor4312 commented 2 years ago

In VS Code today, when someone inspects memory, we open the hex editor and never automatically close it. The first question is, should be memory view be automatically closed at any point? This would definitely make sense when the session ends -- it still has the data it's currently displaying, of course, but interacting with it or scrolling around and requesting data outside the loaded bounds would not work.

The next natural question is whether the memory should be closed when the debugger is resumed. At the moment, variable references are assumed tied to the current stackframe and invalid after that point. This is fine, since variables are only displayed when the debugger is paused and the view showing all variables in the editor is renewed on the next pause. However, the memory view displays a single variable, and it could be desirable for the user to keep it open as a "watch" view.

One way to make this work would be to suggest debug adapters reuse memoryReferences between pause events, and suggest that, if an editor is displaying an view corresponding to a previously-seen memoryReference, it should treat any later-seen equal memoryReference and referencing the same memory region. Editors can then have similar behavior to their "variables" view, where the binary contents are shown as available when paused and a reference is present, and unavailable when the program is resumed.

ishche commented 2 years ago

Would it make more sense not to close the view but to indicate that values are out of date?

connor4312 commented 2 years ago

In the VS Code implementation, we load in variable memory gradually and don't cache it outside the editor. If the user scrolls outside the loaded bounds, or moves the editor to the background and back to the foreground, it would fail to load unless there's a valid memoryReference still available. So just having an indicator isn't sufficient to make it a 'good' experience, though the latter point could be solved with some caching...

However, users are still apt to want to keep the view open as a "watch" view and not have to re-opened it each time the debugger pauses, which requires that we have a way to correlate what memory is loaded.

haneefdm commented 2 years ago

I think memory references should stay the way they are in the DAP. If we step back and look at how a memory view comes into existence.

  1. By using something in the variable/watch windows (good feature)
  2. User asking for a memory view at a certain address/expression.

The second one is the more common way as some of those addresses may never appear in a Variable window. Then, they have to add such an expression into the Watch window first and then create a memory view from there. They may have no interest in having that expression in the Watch Window.

For a memory view (like ours for example) we assume all views are persistent. This is through multiple debug sessions and VSCode re-starts. Similar to a watch window. Every time a debug session starts, we try to see if an orphaned window can be adopted by a new session. We are also aware of multiple debug sessions and some views are only associated with certain sessions types (session name and wsFolder have to match).

It is a lot of work for people to recreate the memory views. If they created one, chances are they want it until they decide they don't want it.

If an expression eval fails, we notify the user and still show the stale contents from previous sessions -- with a visual indication, that it is stale. Expressions are great because you can look at packet[index] and it can evaluate to some new address with every pause.

At any given time, we really don't have that much data per memory view. We estimate it to be less than 10KB or far less. It is not like people will create hundreds of these views.

All of this is based on what one can see in Eclipse, Visual Studio, various commercial embedded debuggers. Memory views should be persistent.

haneefdm commented 2 years ago

we load in variable memory gradually and don't cache it outside the editor

So do we. The difference may be master is always the model in the main extension. Not in the webview. I am thinking you do the same. In the main extension though, we never throw away anything. The webview throws away everything because it knows it can always get it from the model. Just like HexEdit, we also generated very few rows of DOM so the DOM is always reasonably small.

Currently, I artificially limited the view to have no more than 1MB of data. If they are looking at that much, there is something wrong and they should be using another view. I may remove that limit but I will wait for user feedback and look at usage scenarios. We can also persist just the pages most recently seen and discard the rest. An optimization I feel is not needed right now.

haneefdm commented 2 years ago

I think I will answer your question in OP one by one

In VS Code today, when someone inspects memory, we open the hex editor and never automatically close it. The first question is, should be memory view be automatically closed at any point? This would definitely make sense when the session ends -- it still has the data it's currently displaying, of course, but interacting with it or scrolling around and requesting data outside the loaded bounds would not work.

No, leave the views open. Users should close when not needed. They may still be looking at it

The next natural question is whether the memory should be closed when the debugger is resumed. At the moment, variable references are assumed tied to the current stackframe and invalid after that point. This is fine, since variables are only displayed when the debugger is paused and the view showing all variables in the editor is renewed on the next pause. However, the memory view displays a single variable, and it could be desirable for the user to keep it open as a "watch" view.

No, they should not be closed. Do what every other IDE that shows memory does. Try to rebind the old memory views to the new session. This is expected of any debugger-related memory viewer. Note that the next session that starts may be unrelated -- and I mitigate that.

One way to make this work would be to suggest debug adapters reuse memory references between pause events, and suggest that, if an editor is displaying an view corresponding to a previously-seen memoryReference, it should treat any later-seen equal memoryReference and referencing the same memory region.

Sure. Kinda. But, debug adapters don't keep track of the thousands of memoryReferences they dole out. They just dole them out when a client does a variable/watch/stack/etc. request. They are canonical for that point in time. For the same variable next time that may generate a different memoryReference. But the old one is not invalid. The question is what does the user want to see? The old memory or the memory pointed to by the current state of the variable. Only the user knows that and my memory viewer gives them that option. Not a DAP conversation but I also want VSCode to provide to memory viewers both memoryReference and the expression/variable that created it.

The deal was that memoryReferences are only understood by the Debug Adapter that created it. They are opaque objects to everyone else. Sometimes within the same session and for sure between sessions. I was in that conversation when memoryReferences were born in the DAP. I have been simply using a pointer (whatever debuggers like gdb gave me). If a debug adapter is using simple addresses then sure they are likely to mean the same memory region in the next session, but not guaranteed to be valid as the program being debugged can change dramatically.

Editors can then have similar behavior to their "variables" view, where the binary contents are shown as available when paused and a reference is present, and unavailable when the program is resumed.

No, this is different. We should not enforce anything on the editors. Not DAPs job to tell people what to do with the data they get. If a certain editor wants to clear content, fine. But mine will not, but make it obvious the contents are stale and disable editing. Most users are accustomed to this behavior for memory and what VSCode does for Variables and Watch windows. They behave a bit differently.

I would say that a memory view should act more like the Watch Panel. This is what users expect with the option that you can also watch constant pointers which the Watch Panel gladly accepts and I use them. I do something like *(unit8_t *)0xe0001f040 because I don't have a variable pointing there but I know what I want. Watch Panel is fine with that. It asks the DA to evaluate that until you delete that Watch expression.

Side note:

It is interesting. I keep saying 'viewer' and some people say 'editor'. Viewing memory is the primary intent, while it may allow editing sometimes. Just like the Variable and Watch Panels. Their 99% use case is viewing.

connor4312 commented 1 year ago

Thanks for the perspective, that's helpful.

In addition to your native use case, managed languages have different usages of memoryReferences. Where a user can view memory of an object, in, say, JavaScript, it's some interpretation of the data in a standard variable. In this case, the lifetime of the memory is tied to the variable itself, which as recently clarified is only valid until the debugger is resumed (and this is also a restriction in many runtimes; you can't keep using a stack variable after that stack disappears).

One thought to address this is a new lifetime parameter in ReadMemoryResponse, something like:

interface ReadMemoryResponse {
  address: string;
  unreadableBytes: number;
  data: string;

  /**
   * How long the memoryReference in the request is valid for. It may be one of:
   * 
   * - 'stopped': the reference is valid until the debugger is next resumed
   * - 'session': the reference may be read again at any future point
   * 
   * If not specified 'stopped' is assumed.
   */
  lifetime?: 'stopped' | 'session';
}

This does leave a gap for 'memory which can be re-viewed in the future, but only when the runtime is stopped.'

I think we also need some known way to say "this session memory is no longer valid", such as on deallocation or if the client tries to use it in a new session where it does not apply. Perhaps a new predefined Response.message. Maybe having such an error is enough that we don't need to specify the lifetime, and UI simply assumes the memory is no good once an error is encountered.

What do you think?

haneefdm commented 1 year ago

@connor4312 I have to think about this a bit more, but.... My thinking (out loud) is that the lifetime of a memoryReference is always 'stopped'. But the expression/variable that generated the memoryReference has a lifetime (global variable vs local variables) -- but not always. Watch expressions are always 'stopped'. So, lifetime is not an attribute of a readMemory request/response, it is a potential attribute of a memoryReference

If it makes sense for a managed use case, then fine by me. I can't find a way of use it for native cases

Practical usage:

While technically memoryReferences are opaque, most of the time a memoryReference is simply an (encoded) constant so it may always work even across sessions. We recommend in our memory viewers that people convert the memoryReference to the source expression to get a better interpret the results and be more predictable. And we handle the memory view by

This is why we always assume the lifetime is permanent -- even beyond the session to the next session. Kind of like how we deal with watch variables. Just that sometimes, the memoryReference/expression/watch-expression is invalid when the debugger stops and that is okay. Users can see that visually and are fine with it as well.

Our memory viewers work almost identically to how the WATCH window works.

connor4312 commented 1 year ago

So, lifetime is not an attribute of a readMemory request/response, it is a potential attribute of a memoryReference

Yea, I was debating between this and having another optional property everywhere memoryReference is returned like memoryReferenceLifetime, to the same effect. If a memory viewer is open, the memory reference would have been read and its lifetime acquired. But it is a little awkward either way.

My thinking (out loud) is that the lifetime of a memoryReference is always 'stopped'... While technically memoryReferences are opaque, most of the time a memoryReference is simply an (encoded) constant so it may always work even across sessions

Did you mean that a memoryReference is always session?

This is why we always assume the lifetime is permanent -- even beyond the session to the next session

DAP is still a toddler and doesn't have a protocol notion of object permanence between sessions. It was my thought that clients may keep 'session' references around and try them on the next session to see if they get an error or not.

haneefdm commented 1 year ago

Did you mean that a memoryReference is always session?

I wear two hats.

DA Author: I did mean a stopped. I cannot guarantee that the lifetime is beyond stopped. Really depends on what the memoryReference was derived from. Even though there is a certain amount of determinism and certain items are session, we do not have the logic right now to determine and track them.

DA Client: While I know that the lifetime is not guaranteed for more than stopped, more often than not, I think of it as permanent and it is okay for it go invalid sometimes or even forever. While I am the author of just one DA, I am a client of multiple DAs.

We also use memoryReferences for stackTraces (call stack) which in turn is used by the client for disassembly. Over there, they are generally good for the session. Disassembly results in breakpoints that could be set on certain addresses which are again memoryReferences. But if that breakpoint goes invalid in a future session, it is okay, it may become approximate.

DAP is still a toddler and doesn't have a protocol notion of object permanence between sessions. It was my thought that clients may keep 'session' references around and try them on the next session to see if they get an error or not.

Yes, I do that as a client. It is a bit more complicated as session IDs are transient but we manage to figure out based on other attributes if a new session could be a re-incarnations of a previous one :-) This logic is not perfect but works 99% of the time.

Yazwh0 commented 6 months ago

To be able to do this, there would have to be a new function call added to DAP, to get the SourceReference from a Source object from a previous session. The SourceReference can then be 're-linked' to the active windows, and debugging can proceed?

Yazwh0 commented 6 months ago

As an aside, breakpoints survive from one debugging session to another, so wouldn't other references that use the SourceReference also be able to somehow?

puremourning commented 6 months ago

Breakpoints in DAP do not survive across sessions. They are recreated by the UI on every new debug session. Clients don't have to do this to be compatible with DAP.

Yazwh0 commented 6 months ago

Using VSCode, if a debugger creates a Source which is only available during the debugging (So SourceReference is set) the breakpoints on that code are resent, after Loaded Source message.

Would a similar event for memory work?

puremourning commented 6 months ago

A memorybreference (or even an actual address) is ephemeral. Even if it represents a real address there are limited circumstances when the same address would even have value in another session (or for that matter, even be mapped)

A source (actually, its file path and line number) is persistent and makes sense across re-starts.

UIs which acquire their memoryRefeeence from an expression in the mem window can always persist the expression and reevaluate it on every stopped event. No DAP change required.

Yazwh0 commented 6 months ago

Thats not true.

The debugger I'm working on for example is for a 65c02 based machine. Memory addresses there do not change between sessions! The CPU cannot relocate code! The same can be said for other 8bit, 16bit, probably even some 32bit machines and one would image all sorts of modern embedded systems.

For my project a memory view is essential to understand the machine. So having to re-open it each session and CTRL-G to the memory location is not a good user experience.

puremourning commented 6 months ago

You mentioned one of the limited circumstances.

But given what I said about expressions, what changes? An expression casted to a pointer can often be used as the source of an (ephemeral) memory reference for that (pointer) value.

Yazwh0 commented 6 months ago

It also doesn't change the fact that having windows based on a memory reference become inactive between debugging sessions a terrible user experience. The window itself doesn't show that has become 'detached'.