microsoft / debug-adapter-protocol

Defines a common protocol for debug adapters.
https://microsoft.github.io/debug-adapter-protocol/
Other
1.44k stars 131 forks source link

Proposal: Allow multiple debug sessions through a single adapter #79

Closed EricCornelson closed 2 years ago

EricCornelson commented 5 years ago

Currently, both Visual Studio and Visual Studio Code both support multiple concurrently running debug sessions, and Visual Studio Code supports hierarchical debug sessions which can represent debugging child processes or multiple "global objects" (example).

In this vein, there are some scenarios where it would be useful to be able to have multiple debug sessions running through a single debug adapter, which would allow for some more advanced orchestration between debug targets.

A particularly interesting example of this is when debugging a web page which has associated web workers and service workers. The Chrome DevTools protocol allows for the concept of a "flat session" debug mode, which allows a debugger to simultaneously debug all related pages and workers at the same time through a single connection using a "targetId" to route messages to the correct debug target. Additionally, it looks like this will soon become the default in Chrome DevTools, and the "non-flat" session mode will be deprecated (see crbug.com/991325).

In the above example, debugging a web page through a single "flat session" confers many benefits (e.g. being able to start each child target in a suspended state to allow the debugger to configure breakpoints and other set-up before the child starts running). However, running in "flat session" mode requires debugging these multiple targets through a single connection, something which, today, is hard to do using built-in functionality in the debug adapter protocol (it can be achieved by running in-process in VS Code and leveraging the DebugAdapterDescriptorFactory to share state between sessions, but that solution is clunky, and specific to the VS Code API, and therefor does not translate well to Visual Studio, for example).

Some other examples of scenarios where a "flat session" mode might be useful are:

I've created a PR which, as a first step, adds an optional "sessionId" parameter to each protocol message. If this sounds reasonable, I think it would also be nice to give adapters a way to notify the host that a new child session has been attached, as right now an adapter has to manually call VSCode or Visual Studio APIs to launch a new debug session and then trap it on entry to achieve this.

Let me know what you think!

roblourens commented 5 years ago

I like this. Does the protocol need to think about how a client knows whether to route a new request to the same adapter vs creating a new one, or is that up to the client?

EricCornelson commented 5 years ago

To start with, I think we could leave it up to the client. On the VS Code side, it's pretty easy to leverage the DebugAdapterDescriptorFactory to trap new launches and re-use the same adapter instance. On VS that's a little harder, since there's no similar built in "factory" interface, but since adapters using the PineZorro extension have the ability to handle launching themselves, it's fairly easy to just use that to route everything back through the same adapter instance (which is what we're doing right now).

If we end up finding this model useful outside of javascript debugging, it would be nice to have a way to tell the host that we want to re-use the same adapter instance in "flat session". For that we could just use the capabilities event and publish that an adapter can support flat session mode.

connor4312 commented 5 years ago

This sounds neat. I like the intention and it'd certainly make some scenarios easier as you mentioned. Stepping between the client and service is super interesting, being able to step through fetch() into an Express route and back out again would be neat. In the (longer term) future it might be useful to have a contribution model in vscode-pwa to give intelligence around frameworks like that, beyond the generic logic we have out of the box...

In implementation we may want to have a new preflight request prior to launching, for the client to confirm whether it is able to attach a session to the existing adapter. Adapters won't all support flat sessions for some time, and then there might(?) be scenarios where one adapter is unable to flatten requested sessions into its existing instance. A graceful rejection or a 'method not found' error would cause a new instance of the adapter to start.

weinand commented 5 years ago

@EricCornelson Thanks for this interesting proposal.

My first reaction: This feature sounds like an optimisation for functionality that already exists. Stepping between client and server code is daily business for VS Code and it is supported for all combinations of client and server languages (e.g. TS in browser and Java and Python in the backend servers).

With this "optimisation" a single debug adapter can be used to handle multiple sessions if these sessions use the same runtime. This makes sense for a CDP-DA but I'm not aware of many other runtimes that would want to support this.

On the other hand I see quite some VS Code restructuring work necessary to support this feature.

So before we want to commit to this effort, you must convince me that there are no alternatives. ;-)

One alternative is what the Java extension is currently doing: they are running a single DA process that serves multiple debug sessions through different socket connections. In the DA server all sessions are using the same underlying runtime and/or debug target connection.

So this approach already "Allows multiple debug sessions through a single adapter". But instead of multiplexing via a new "sessionId" attribute, communication channels of the OS are used.

Would this solve the issue you are trying to address?

int19h commented 5 years ago

The Python adapter is using this model - a single adapter managing several related concurrent debug sessions, with multiple connections to it from VSCode via sockets - for multiprocessing scenarios (although we are still working on that implementation). Having a single channel to talk to the IDE would be marginally simpler on the adapter side, but it's not really a big deal.

One related thing that would be nice to have is some way for a session to tell the client to create another session (i.e. effectively, issue a reverse "attach" request, that would result in creation of a new session with the supplied debug configuration). We're using a custom event for this, and it works... but it's not a client-agnostic solution, and requires re-implementing the client part for every IDE. Since debugging multiple related processes is a generic scenario, not really tied to any particular language or IDE, and it needs this bit for the adapter to orchestrate things on the client, it would be nice to have that supported in the core DAP protocol.

EricCornelson commented 5 years ago

@weinand We actually started our proof of concept using sockets, but the big draw for running everything through a single communication channel is remote debugging in VS. Multiple socket connections are fine for local debugging, and probably even debugging a remote workspace in VS Code, but if the user is debugging a remote process either in a VM or a server, and their workspace is local, it gets trickier.

We currently have a way through VSDebugAdapterHost and msvsmon to launch the debug adapter process on the remote machine, where the adapter can then launch the desired program and start debugging, and we communicate with the adapter process via its stdin/stdout through an SSH tunnel. But we can't open a socket connection to a debug server process on the remote machine without either manually doing some port-forwarding manually or dealing with firewall issues (that is, there's nothing built in to VS that would make this easy for us, as far as I know).

@andrewcrawley pointed out that if we wanted to go the debug server route, we could do something like launch the first adapter as a debug server, and then for each child debug session in VS launch a "proxy" process that would make the connection to the debug server and then pipe the traffic back through its stdin/stdout. While that could theoretically work, we both agreed that this was a less-than-ideal workaround, as it involves a lot more moving parts, and has more potential for problems to arise.

I'm not sure how that particular scenario translates to other adapter projects, but it was the biggest factor for us to go with the "flat session" communication model.

@int19h +1 We are doing the same thing RE: the custom "reverse attach" event on the VS side, and have had the same thoughts. I didn't add that piece to the proposal, but maybe that's a better place to start. This is where we are passing the sessionId for each new child session back to the host as well, and my thought is that we're essentially telling the host "we are already managing this child debug session, just letting you know".

I'm curious, though, if we added a "reverse attach" event to the protocol, how would hosts be expected to handle that event? Right now, what we're doing with vscode-pwa is we manually send a new "start debugging" command and trap the launch to route it through the already-running adapter. But if we consider that "flat session" is not a thing in the protocol, what should the host do if it gets one of these events? We are essentially notifying the host so it can display the session in the UI and start sending commands to it. So then the question becomes, well how does the host start sending commands to this new session? We could give it a port to connect to, but I'd assume we wouldn't want to force a specific implementation (sockets) for new child sessions, and instead leave that up to the adapter implementation. Just firing a new "start debugging" command doesn't seem to make much sense to me either, because the session is actually already running in the current adapter, right, we just want to connect to it? If we fire a new "start debugging" command, then we are still in the business of having to trap the launch and route it through our existing adapter, either in a DebugAdapterDescriptorFactory for VS Code, or some other IDE specific piece, or we are again forced in our implementation to use sockets (which is not always desirable, as in the scenario I mentioned above).

So that leads me back to having a "flat session" option, where you can tell the host, "hey, I've connected to this new child session, please represent it for me in the UI and start sending messages to it over the existing connection with this identifier". 😃

I definitely understand that it doesn't make sense to add something to the protocol if it only fits a very narrow use case (especially if there is a workaround), which is also the main reason I drafted up the proposal. I was curious to see if there was any other adapters out there that might benefit from a "flat session" mode, and it sounds like there is at least one other!

int19h commented 5 years ago

What we do right now is have the adapter send the complete debug configuration in the reverse attach request. The custom event handler in VSCode then starts a new debug session with that configuration. The adapter creates it by remembering the config it was passed in the request for the initial session, and then just changing it to "attach" (if it wasn't already), and adding a property to designate which known debuggee to attach to (by PID in our case).

On the VSCode side, we use DebugAdapterDescriptorFactory to make connections go directly to the socket, without spawning another adapter. If the primary connection was "launch", we assume that the IDE is running on the same machine, and thus those secondary connections can always use 127.0.0.1 as hostname. And for the port, we start a listener in the ephemeral port range, and then retrieve the actual port, and pass it in the reverse attach request. If it was "attach", then the original debug config already has the hostname and the port as properties, and they just get reused as is.

For remote scenarios, this means that the other machine only needs to have that one port opened. This can then be tunneled via SSH for security purposes - for now, we tell the users to do this manually, but it would certainly be nice to have SSH as a built-in DAP transport directly in VSCode, with a single shared config for all debuggers that need it.

So there are really two pieces here that require custom IDE-side code right now. One is the custom event handler to create a new session, and the other one is the adapter factory that directs the DAP connection for that to go over a socket. VSCode kinda sorta has the latter in form of "debugServer" already, but since there's no way to specify the hostname, it's only useful for local debugging, not remote. I forgot about that second part, because we use this approach for all "attach" sessions, not just secondary ones, so we implemented it outside of the multi-session scenario.

And yes, you're right - to standardize this in DAP would require it to be aware of the transport layer, since the client can't just spawn adapter to service the reverse request to start a new debug session. Which means that "debugServer" would have to be added to the spec for "launch" and "attach" requests (and extended to accommodate hostname in addition to port), and clients supporting it would be required to implement DAP-over-TCP specifically.

So I guess it depends on how the DAP team views the protocol - is it meant to be entirely transport-agnostic in all respects, or is it accidental and can change if requirements demand it? I think there would be some value in having something as common as TCP be explicitly standardized as a transport. Especially since it'd be opt-in for the clients anyway - it's more about having them interpret the connection information in the same manner if they choose to support it.

The users would likely also appreciate it, since right now adapters that support remote "attach" over TCP don't really have a single consistent format for debug configuration for that, and each has to document it separately. If it's standardized, it could all go straight into the VSCode generic debugging documentation, where users generally land first when they start looking. And eventually, this would allow for SSH tunneling to be transparently added on top.

mickaelistria commented 3 years ago

As far as I can see, the multiple session approach is documented in the overview and implemented (with custom messages) in vscode-js-debug. However, the specification doesn't specify how multiple sessions are supposed to work: how client can start a session, pass a port to use for the child session, which operations (initialize? launch?) need to be sent to the child session...? As the behavior in vscode-js-debug seems satisfactory, can the related operations and flows be specified?

gfszr commented 2 years ago

Any news regarding this issue? Adding support for multi-session debugging for DAP is rather important, and each debug adapter implemeting its own vision of multi-session support creates a lot of fragmentation.

weinand commented 2 years ago

Today multi-session debugging happens outside of DAP because that use case must be supported anyway: if I want to debug an application that has processes implemented in C++ and Python, then two debug adapter types are needed and the client (e.g. VS Code) establishes independent sessions with these adapters via debug adapter factories that exist per debug type ("cpp" and "python").

The same approach is used when debugging multiple processes implemented in the same language (where only a single debug adapter type is needed): the client uses a single debug adapter factory to create sessions and it is an implementation detail of the factory (and the debug adapter) whether a single server process is used to implement multiple sessions via multiple communication channels or whether an independent debug adapter process is launched for each session.

With this approach the client (e.g. VS Code) does not have to deal with debug adapters that are able to handle multiple sessions. It just creates a DAP session via a factory and does not have to care about multiplexing DAP sessions over a single comm channel.

gfszr commented 2 years ago

@weinand Thank you for the quick reply! 🙂

Today multi-session debugging happens outside of DAP because that use case must be supported anyway: if I want to debug an application that has processes implemented in C++ and Python, then two debug adapter types are needed and the client (e.g. VS Code) establishes independent sessions with these adapters via debug adapter factories that exist per debug type ("cpp" and "python").

Considering DAP as the "generalization" of a debugger API, various debuggers (pydevd and gdb, for instance) allow the concept of "debugging child processes of the same language". However, neither pydevd nor gdb implement a feature of debugging child processes written in arbitrary languages (Well, maybe gdb does :P). If DAP's purpose is to generalize debuggers through a single protocol and session, then one of such features should be supporting debugging same-language child proccesses.

Multi-language debugging might sound similar to multi-session debugging, but the latter is a basic feature of a debugger (Thus, should be represented in DAP in some way), while the other is a fancier feature that is out of scope to a debugger (For which, VSCode can solve by using multiple DAP sessions).

With this approach the client (e.g. VS Code) does not have to deal with debug adapters that are able to handle multiple sessions. It just creates a DAP session via a factory and does not have to care about multiplexing DAP sessions over a single comm channel.

Such approach has disadvantages, which I believe DAP was designed to solve: for each debug adapter, the client would need to implement its notion of "multi-session", if one is multiplexing on the same connection or creating new connections. Such example is shown in both debugpy and vscode-js-debug, which both implement a different approach which is not only non-DAP, but also requires changes in the UI (VScode, or any other editor using DAP for that matter).

I'm a huge fan of DAP, but lacking a feature a lot of debuggers implement (multi-processing), defeats the purpose of not reinventing the wheel and write a compatability layer between a DAP server and its UI.

puremourning commented 2 years ago

The main issue with leaving this to the client is that only the Server knows when a new 'target' has become available and how to connect to it. This can be demonstrated by the status quo of de facto implementation in debugpy and vscode-js-debug. both of which use (different, naturally) custom messages and custom client-side code.

I think the request is to recognise that use case and standardise a mechanism to achieve it rather than each dap sever crating a custom one and each dap client having not have per-server custom code.

In the case of vscode-js-debug, the adapter is not even basically functional without implementing its custom undocumented protocol extensions.

weinand commented 2 years ago

I suggest that someone creates a concrete spec proposal for a new DAP target event.

gfszr commented 2 years ago

@weinand I'd like to thank you again for your quick responses!

I suggest taking a similar approach to vscode-js-debug's flatSessionLauncher.ts plus the reverse attach of debugpy, and formalizing everything our fellow colleagues stated above:

As stated above, such approach would have a lot of benefits:

One downside for such approach that I thought of is the root session transport being congested by many child sessions being multiplexed (locking, etc.). However, DAP doesn't generate a lot of traffic, so I do not believe this would cause issues.

Please feel free to comment and advise on the suggestion above 👍

connor4312 commented 2 years ago

Pavel noted rightly that there are two things here: the "reverse attach" request -- a request from the DA for the client to start a new debug session, and the specifying how the communication happens.

I think the reverse attach is sensible and fairly simple: a new request like startDebugSession that contains a launch configuration that the DA wants the client to start. This can be done without particular knowledge of hierarchy or session reuse.

The notion of hierarchy could be brought in through an child: boolean property on this request. If the hierarchy in this proposal functions as hierarchal sessions do in VS Code today, the effect of child: true is purely cosmetic and could in fact be ignored.

The request could additionally contain a connectionId: number that, if present with child: true, indicates the DA would like the UI to reuse the same connection/debug server. I would suggest using Eric's proposal, but renaming it to connectionId to avoid confusion, as sessionId is already in use.

Why not make it be the real "session ID"? (Which is not a concept of DAP itself, only one from VS Code.) My reasoning is that the session ID should not be assigned by the DA, but also that having it being returned from the DAP call introduces additional ordering complexity, since it must be returned before the new connection is made so that the DA can treat it correctly. We can remove the complexity by having the DA assign an integer, which is also more compact on the protocol than a UUID for instance.

With this, the "root" debug session would not have a connection ID passed to its calls, but children would.

interface StartChildSessionArguments {
    configuration: {
        type: string;
        request: string;
        [key: string]: unknown;
    },

    child?: boolean;

    connectionId?: number;
}

This would also need a corresponding supportsStartChildSession in the InitializeRequestArguments.

gfszr commented 2 years ago

@connor4312 Very nice! I do have a few comments:

mickaelistria commented 2 years ago

I don't think we need to deal with sessionId.connectionId or anything of that form as the spec already tells a bit about how things can work. It just misses some technical aspects to actually implement it. The only thing I think is missing for the spec to properly specify multi-sessions so that all adapters/clients can support it is agreeing on a port for a child session. According to the spec For every debug session, the development tool initiates a new communication session on a specific port, good. Then what we need is a way to know how to initiate a new communication session on a specific port. How to have the client and adapter agree on a port? Should the client pass the port as a parameter it in Initialize, or Launch/Attach? What is the first command the client must sent to the adapter in order to start a new session? I don't think a new custom command to start a session is necessary, do we? @weinand If you agree with this overall idea and can recommend what is best, I'll happily try to turn it into a concrete proposal as a PR.

puremourning commented 2 years ago

The client also needs to know the initialisation options and various other things, as included in @connor4312's proposal. But remember that the purpose of this particular proposal is to allow multiplexing sessions on a single transport, rather than requiring that all adapters working with multiple sessions use IP for transport. This is (presumably) based on the existing (de facto) implementation in vscode-js-debug (which I understand is actually used by Visual Studio (not code)), according to its author.

But I tend to agree that this multiplexing is an additional complexity that's not required for the spec to be functional - it's required for specific clients to be functional. Therefore I tend to agree with @mickaelistria that we only need to specify a ChildDebugSessionLaunched notification, which needs at least:

I believe that this is pretty much exactly what debugpy does.

Example:

Name: childSessionLaunched
Type: Notification
Arguments:
  connection: 
    hostName?: Address to connect to (local host used if not supplied)
    port: port to connect to (per multi-session specification today)
  request:
    type?: "launch" or "attach" (default: "attach"?)
    configuration: [ object containing the launch/attach arguments passed to the adapter  ]

I don't believe that anything needs to be specified wrt to initialise exchange, there's an interesting question about breakpoints (should the client re-send all breakpoints to the new child), but I think that's up to client implementations to decide what works best. Perhaps a future update could include a "file type" or something which might indicate to the client that the child session is in a different language, but that seems rather scope-creepy at this point.

I think if we had that (or similar) in the spec, then there would be no need for these connectionId or sessionId, unless there is a specific use-case where IP is not ideal. I can think of a few reasons where that might be the case:

Well, having written all of that, the multiplex-over-stdio does seem to have some key advantages.

As the existing multi-session support is already in the specification, I'd be tempted to add both approaches and let the client decide which it wants to use by capabilities. The above childSessionLaucnhed request can be extended with a connection type:

Name: childSessionLaunched
Type: Notification
Arguments:
   sessionId?: if supplied, multipllex on the current session. connection block must not be supplied
  connection: 
    hostName?: Address to connect to (local host used if not supplied)
    port?: port to connect to (per multi-session specification today)
  request:
    type?: "launch" or "attach" (default: "attach"?)
    configuration: [ object containing the launch/attach arguments passed to the adapter  ]

We can of course include the child property for the reasons @connor4312 includes, though I'm not sure when this would ever be false in practice. I don't think the 'hierarchy' aspect is unique to either launch approach. I think the only advantages of the multiplexing over the existing multi-session are the security, portability and routing stuff above.

TL;DR - agree mostly with @connor4312's proposal, but (probably) for different reasons.

gfszr commented 2 years ago

@puremourning Although de facto in current implementation of debugpy, using a different transport (Especially a concrete typed one, such as IP+Port) than the current session isn't something we should encourage. Standardizing multiplexing for supporting child sessions, although more complex but not by much, will eventually benefit the user and the client developer, not only from the very good reasons you specified but:

If a DAP server doesn't want to implement multiplexing support - it's OK, it can disable support for childSessionLaunch handling in the capabilities which will backtrack to the current experience. However, multiplexing will be - by far, the best option.

puremourning commented 2 years ago

The existing (spec'd) multi-session approach isn't going to disappear, not least because it's very convenient for remote debugging, etc. So clients have to implement that anyway; I don't see any significant extra complexity on the client side.

Speaking as the author of a client, in my client multiplexing is more challenging than completely separate comms.

weinand commented 2 years ago

I don't like the fact that in this feature request "starting new DAP sessions from within a debug adapter" and "multiplexing DAP sessions on a single transport" are mixed together. The latter is just an optimisation which does not result in any architectural benefits, whereas the former is an important architectural feature because it allows debug adapters to handle multi-session debugging without need for non-standard code in VS Code debug extensions.

If we combine both sub-features into one, I see the problem that a client like VS Code will have a hard time to adopt the full feature because it will require significant changes in lots of places. This means that we end up with a DAP feature that DAs can already use, but which will not be supported by VS Code for quite some time (and therefore still requires the existing approach as a fallback).

If we could agree on separating the "multiplexing" then we could focus on the "starting new DAP sessions from within a debug adapter" which I believe will result in a fairly small change in clients (including VS Code).

But the tricky part here is to avoid talking about concrete communication channels like "ports" and "hosts". Remember the debug adapter factories available in VS Code's extension APIs allow for sockets, named pipes, and even inline implementations that do not use a communication channel at all because the DA is fully implemented in the extension...

mickaelistria commented 2 years ago

But the tricky part here is to avoid talking about concrete communication channels like "ports"

The specifically already tells about port in https://microsoft.github.io/debug-adapter-protocol/overview#debug-session-start . Are you willing to roll back that part for something more abstract? I personally wouldn't mind if we stick to ports, it's standard enough to be usable for all clients and adapters and pragmatically efficient.

weinand commented 2 years ago

@mickaelistria port does not occur anywhere in the spec because the spec does not cover how a DA is launched and how a communication channel to the DA is established.

For security reasons some debug extension authors refuse to use ports and rely on namedPipes... And I don't think that we can add concrete communication mechanisms to the spec.

puremourning commented 2 years ago

@mickaelistria port does not occur anywhere in the spec because the spec does not cover how a DA is launched and how a communication channel to the DA is established.

somewhat of an aside, but I've always considered https://microsoft.github.io/debug-adapter-protocol/overview as normative (in particular the base protocol, initialisation etc. sections). Is that not the case?

weinand commented 2 years ago

The DAP overview does not cover all possible ways how a client can run a DA and communicate with it.

gfszr commented 2 years ago

I agree with @weinand, DAP must not specify or rely inherently on a specific transport. We might be able to split both suggestions to two PR's as you said which would look like this:

Supporting the first PR would still keep, for the time being, the status quo of each adapter having it's own way of initiating a new session (thus, non-standard code...), but at least a standard way to report that such was created.

I do believe however, although not necessarily prioritized in the VSCode team, that we should standardize multiplexing, perhaps as a new capability, even if VSCode won't support it at the beginning. The current situation is that VSCode has custom code and there are adapters which aren't DAP compatible, and standardizing multiplexing will eventually, in the near future, will improve the situation.

connor4312 commented 2 years ago

I agree we can separate the multiplexing: I mention both because that is what the issue is about, and multiplexing requires knowledge of parent/child sessions, since multiplexed children cannot outlive the parent...

But the tricky part here is to avoid talking about concrete communication channels like "ports" and "hosts". Remember the debug adapter factories available in VS Code's extension APIs allow for sockets, named pipes, and even inline implementations that do not use a communication channel at all because the DA is fully implemented in the extension...

I think this is something that DAP should not care about at all. Debug servers can be launched in different ways by different clients. In the simplest client model without multiplexing, this would result in a new DA executable being spawned. However, there are numerous ways that DA could be set up to delegate to its parent. For example, if a child DA gets spawned, its parent could pass down a pipe/socket name on which the child can communicate inside its launch configuration. This is what I would do in js-debug if I support this before multiplexing exists.

In more complex clients like VS Code where the extension has control over how the DA communication is handled, this could be more efficient. And of course if/when multiplexing is added, it become more efficient yet.

though I'm not sure when this would ever be false in practice

True, maybe the child: true behavior is implicit.

weinand commented 2 years ago

@connor4312 you said

I think this is something that DAP should not care about at all. ...

I totally agree, that was the argument I tried to bring across with the "But the tricky part here is ..."

mfussenegger commented 2 years ago

Looking at the two existing off-spec extensions I don't think multiplexing is necessary.

Debugpy uses a custom event to spawn sessions:

interface DebugpyAttachEvent extends Event {
  event: "debugpyAttach";

  body: {
    /**
     * Fully formed attach configuration that can be used with VS Code 'debug.startDebugging' API.
     */
  };
}

A sample payload:

  body = {
    connect = {
      host = "127.0.0.1",
      port = 36261
    },
    console = "integratedTerminal",
    isOutputRedirected = false,
    name = "Subprocess 351514",
    program = "/path/to/myapp.py",
    python = { "/usr/bin/python" },
    request = "attach",
    subProcessId = 351514,
    type = "python"
  },
  event = "debugpyAttach",
  seq = 11,
  type = "event"
}

vscode-js-debug uses a attachedChildSession reverse request. The body also contains a full configuration and an extra __jsDebugChildServer child-port property.

This already raises one question: Event or reverse request. If reverse request, what would be the response?

Both use TCP as transport, and a client implementing support needs to know about the structure. I don't see a way to make this transport agnostic. The client needs to know under which transport the communication should happen. I think a pragmatic option could be to define something like

transport = {
  type = 'tcp' -- Could add other transports later
  -- remaining properties depend on type. E.g:
  host?: string
  port?: number
}

Or something like:

transport = {
  type = 'tcp' -- Could add other transports later
  -- payload depends on type. E.g:
  payload = {
    host: string
    port: number
  }
}

As far as I can tell there's no connectionId or sessionId requirement.

There's also no child property and I don't think it's necessary. The event or reverse request happens within a session. This session is implicitly the parent. This does allow to form hierarchies with multiple levels. Is there a use case where a debug adapter would want to prevent this?

there's an interesting question about breakpoints (should the client re-send all breakpoints to the new child), but I think that's up to client implementations to decide what works best

I think it would be important to have this in the specification. From what I can tell from playing with debugpy the setBreakpoints requests needs to go to a child session. Sending it only to the root leaves the breakpoints non-functional. It's unclear to me whether it would be save to broadcast requests. Having this in the specification would eliminate the uncertainty and more importantly it prevents debug adapters having requirements that are incompatible with others. (E.g. one allowing broadcasts, another not supporting it)

weinand commented 2 years ago

Based on discussions in this issue (and elsewhere: https://github.com/microsoft/vscode/issues/116730, https://github.com/microsoft/vscode-js-debug/issues/902), I've tried to come up with a minimal proposal:


Reverse Request startDebugging

This request is sent from the debug adapter to the client to start a new debug session. This request should only be called if the corresponding capability supportsStartDebugging is true.

A client implementation of startDebugging should start a new debug session in the same way as a debug session is started by the user.

If the client (UI) supports hierarchical debug sessions, the newly created session can be treated as a child of the session that has called startDebugging.

interface StartDebuggingRequest extends Request {
  command: 'startDebugging';

  arguments: StartDebuggingRequestArguments;
}

Arguments for startDebugging request.

interface StartDebuggingRequestArguments {
  /**
   * Arguments passed to the new debug session.
   */
  configuration: {
    type: string;
    request: 'launch' | 'attach';
    [key: string]: unknown;
  }
}

Response to startDebugging request.

interface StartDebuggingResponse extends Response {
  body: {
    // ... TBD (if any)
  };
}

Notes:


I'd appreciate any feedback

puremourning commented 2 years ago

I did not add a "transport" argument because I do not see how a client like VS Code would make use of it. The VS Code implementation of the startDebugging request would just call the internal version of the extension API "startDebugging" and that method does not have a "transport" argument either. However, if resolving the debug type of the passed configuration property results in the same debug extension, and if that extension runs the DA as a server process that supports multiple sessions, then the new session will end up in the same DA process as the parent session that had called startDebugging. And by definition that DA process would understand any additional private arguments that the caller (the same process) had passed to the receiver. So there is no need to specify these (private) arguments in DAP.

I would like to see how you propose this would be used in the actual concrete cases we know about. For example, it's not obvious to me how the debugpy use case (where it instructs the client to connect to a specific host/port and initialise using the supplied arguments) would be implementable in this case.

I think it's important to distinguish that the contents of the configuration block must strictly map to the contents of the launch or attach request's arguments property and not the unspecified and client-specific contents of something like vscode's launch.json entries (these things being completely separate, albeit overlapping in practice).

If the assumption is that the client can just "push the configuration through whatever mechanism it uses to launch adapters normally", then we're requiring that each adapter understands the configuration methods and specifications of each and every client. I don't think that's likely what you meant, so I must have misunderstood.

Also, what would be the specification for this: type: string; ?

connor4312 commented 2 years ago

For the debugpy case, is the 'parent' DA that triggers attachment is running locally?

For js-debug, attachment will always get served by the same DA, so I have no need of transport information there.

puremourning commented 2 years ago

For js-debug, attachment will always get served by the same DA, so I have no need of transport information there.

will this continue to use the session ID customisation?

weinand commented 2 years ago

@puremourning thanks for your comments/questions.

I answer the simple questions first:

I think it's important to distinguish that the contents of the configuration block must strictly map to the contents of the launch or attach request's arguments property and not the unspecified and client-specific contents of something like vscode's launch.json entries (these things being completely separate, albeit overlapping in practice).

Yep, that's correct and in the comment section of my proposal I already mentioned "variable substitution" as one feature that should not be used in the "configuration block". I'll add your point to the proposal.

Also, what would be the specification for this: type: string; ?

That's the "debug type", which raises the question whether we want to support the creation of debug sessions of other types than the caller's type. I think this is an open question that we need to discuss.

If the assumption is that the client can just "push the configuration through whatever mechanism it uses to launch adapters normally", then we're requiring that each adapter understands the configuration methods and specifications of each and every client. I don't think that's likely what you meant, so I must have misunderstood.

Today launch configs do not include anything for controlling how the underlying DA is launched in the client. Launching DAs is the sole responsibility of the client (or of "extensions" if the client decides to delegate the problem). If we stick with that model for the new startDebugging request, then your statement "... we're requiring that each adapter understands the configuration methods and specifications of each and every client" is not true. The DA (as the caller of startDebugging) just wants to create a new debug session and does not care how that session "comes to life". If the called DA runs as a server, then the new session might end up in the same process as a the caller DA. If the called DA only supports single session mode, the new session will end up in another process.

I would like to see how you propose this would be used in the actual concrete cases we know about. For example, it's not obvious to me how the debugpy use case (where it instructs the client to connect to a specific host/port and initialise using the supplied arguments) would be implementable in this case.

Based on my previous answer I would think that passing a "host/port" is not needed. See Connor's comment above.

mfussenegger commented 2 years ago

I am still not sure I understand how this is supposed to work. Sticking to the debugpy case:

According to your proposal:

A client implementation of startDebugging should start a new debug session in the same way as a debug session is started by the user.

If I understand this correctly, it would remove the option for the debug adapter to handle sub-sessions via a different transport. E.g. debugpy currently uses TCP for that. I'm not familiar with its implementation, but I can imagine that having two separate processes would make things challenging for them? It's probably easier to handle everything within the same process - which kinda requires a transport other than stdio, unless one would open up the multiplexing topic.

From a client perspective it sounds feasible to implement.


One other thing that just came to mind is that some debug adapters can be started with a fixed port argument. E.g. codelldb --port 1234. There would be no way the client could start another session the same way as the port would conflict.

weinand commented 2 years ago

@mfussenegger what you describe is exactly what happens if the client (or the extension) only supports DAs in single-session mode. And yes, for the caller of startDebugging two separate processes "would make things challenging". But this scenario we will need to support anyways for backward compatibility.

Now consider a client (or extension) that runs the DA in multisession mode, e.g. as a TCP server. In this case startDebugging will result in a session that runs in the same process as the first session (the caller). Now the first and the second session could detect that they are running in the same process, which might result in better performance, more feature, or whatever etc.

Now consider a client (or extension) that runs the DA in multisession mode but based on multiplexing the sessions over a single comm channel, e.g. stdin/stdout. Same story as in the TCP implementation.

puremourning commented 2 years ago

In the multi-session mode, the client has to know how to communicate with the DA. In practice, the only way that happens is via a host/port. If the multi-session DA wants to start a new session it needs to either:

From the "overview" (non normative) which is the only place multi-session is mentioned:

multi session mode: in this mode, the development tool does not start the debug adapter but assumes that it is already running and that it listens on a specific port for connections attempts. For every debug session, the development tool initiates a new communication session on a specific port and disconnects at the end of the session.

Am I missing something, but how will the client discover the new port (in the first case), and in the later, we still have the same multiplexing issues as stdio.

Edit: Actually, I suppose it could have multiple sessions on the existing port - that's actually simple enough. not sure what I was thinking. there's no need for separate ports with a single adapter process.

mfussenegger commented 2 years ago

But this scenario we will need to support anyways for backward compatibility.

Why is that relevant for backward compatibility?

Now consider a client (or extension) that runs the DA in multisession mode but based on multiplexing the sessions over a single comm channel, e.g. stdin/stdout. Same story as in the TCP implementation.

Is there any client or debug adapter which currently supports multiplexing over stdio? How does that work without some sort of session id?

For TCP it's possible given that you've separate sockets.

Now consider a client (or extension) that runs the DA in multisession mode, e.g. as a TCP server. In this case startDebugging will result in a session that runs in the same process as the first session (the caller). Now the first and the second session could detect that they are running in the same process, which might result in better performance, more feature, or whatever etc.

Okay, so if the debug adapter uses TCP, it can send startDebugging and the have the client connect to the same host/port.

But then the part in the proposal:

A client implementation of startDebugging should start a new debug session in the same way as a debug session is started by the user.

Is a bit misleading - or at least incomplete. Because the client can do both: Spawn a process and then connect via TCP. This is at least what nvim-dap currently supports for debug adapters like codelldb or delve. If I understood the replies right, in the startDebugging case it must not spawn another process but instead only connect to the same host/IP.

This somehow also makes this at least somewhat transport specific, without making it explicit. I don't really understand the advantage to this versus adding the transport information to the reverse request. It's not like this is a moving target, we've stdio, tcp, and named sockets. Clients/debug adapters also wouldn't have to support all of them.

I think it would be good to have some feedback from the debug adapters authors on this. I can imagine that any debug adapter using primarily stdio won't be able to use this without significant effort to make it work.

weinand commented 2 years ago

@puremourning you said:

Am I missing something, but how will the client discover the new port

Let's take VS Code as an example for a client:

VS Code knows nothing about ports, but it knows that specific extensions provide factories for debug adapter sessions. When a new debug session for a specific debug type is needed, VS Code delegates that request to the extension that implements that type, and lets the factory create the debug adapter session (which is then wrapped by VS Code as a debug session).

The factories determine how the DA is launched and used. E.g. single process per session, or server process supporting multiple sessions, or multiple sessions running inside the extension (without need for TCP sockets and no need for multiplexing the protocol). All this are implementation details and I think we should keep them out of the DAP spec.

The only thing we are now trying to add to the spec is basically the "abstract debug adapter session factory" in form of the startDebugging request. But since we cannot make any assumptions about the implementation details (e.g. "ports"), we cannot make guarantees that debug adapter sessions end up in the same process and are multiplexed over a single communication channel. But if the client and the DA support this, then multiple sessions will end up in the same process.

weinand commented 2 years ago

@mfussenegger

Why is that relevant for backward compatibility?

If a DA calls startDebugging for a debug type that does not yet support multi-sessions in a single process, then multi-sessions will be implemented as separate processes.

Okay, so if the debug adapter uses TCP, it can send startDebugging and the have the client connect to the same host/port.

There is no requirement that TCP must be used in order to have multiple sessions served from a single DA. The transport doesn't matter at all.

If I understood the replies right, in the startDebugging case it must not spawn another process but instead only connect to the same host/IP.

If the client decides how DAs are launched (e.g. single session or as servers), it should know best how to create new debug sessions. But it cannot be a "must" since it depends on the specifics of the DA implementation. See my point about backward compatibility above.

This somehow also makes this at least somewhat transport specific, without making it explicit. I don't really understand the advantage to this versus adding the transport information to the reverse request. It's not like this is a moving target, we've stdio, tcp, and named sockets. Clients/debug adapters also wouldn't have to support all of them.

you can easily run a stdio based DA (implemented in JS/TS) inside a VS Code extension and serve multiple sessions without sockets, named pipes and multiplexing. The transport is an implementation detail that does not belong into the DAP spec.

I can imagine that any debug adapter using primarily stdio won't be able to use this without significant effort to make it work.

I don't think that this is true. It is trivial to wrap a stdio-based DA in some TCP socket listener code that serves multiple sessions. Here is the code that I use in the npm adapter module: https://github.com/microsoft/vscode-debugadapter-node/blob/4531e36e56729468c17a50d5cc00d89c8c64f981/adapter/src/runDebugAdapter.ts#L24-L33

puremourning commented 2 years ago

I think it would be good to have some feedback from the debug adapters authors on this.

I wrote a much longer post, which amounted to this. With most DAP proposals, it's typical to ask for a concrete client and server implementation. I think it's worth engaging with the DA authors like @int19h as to what exactly their adapters would send, then client authors like me an @mfussenegger et al can determine if we would be able to support them generically.

One fear I have is that if we put something in the protocol that requires DA vendors to re-architect their existing solutions, then they will be unlikely to do so, and persist with out-of-band stuff...

weinand commented 2 years ago

@mfussenegger @puremourning now I have a question for you as "client authors":

How exactly are you consuming DAs? I assume that you do not use VS Code extension code and that you are just running a DA in single debug session mode via stdio. That means you can easily implement the reverse request startDebugging but there would be no chance that a child debug session would end up in the same DA process.

I think the missing piece here is the DebugAdapterDescriptorFactory since it determines how the DA is run.

So if DA authors want to support multi sessions then they could use the DebugAdapterServer or DebugAdapterNamedPipeServer which specifies the port/named pipe to listen to.

Or do you make use of that factory already?

puremourning commented 2 years ago

do not use VS Code extension code

correct.

you are just running a DA in single debug session mode via stdio

sometimes, but not always. Users configure Debug Adapters by specifying properties of how to launch them, e.g.:

Then they tell us how to communicate with it, e.g.:

And finally, they tell us the launch configuration :

So basically, all the things that aren't in the protocol, we require users to configure. I think that emacs-dap and nvim-dap and vimspector are fairly similar in this respect. (I can only speak for vimspector of course).

I think the missing piece here is the DebugAdapterDescriptorFactory

The equivalent is the above configuration. Example, here's fleshed and simplified out spec from vimspector for an example debugpy session:

    "run current script": {
      "adapter": {
      "command": [
        "/usr/local/opt/python@3.10/bin/python3.10",
        "${gadgetDir}/debugpy/build/lib/debugpy/adapter"
      ],
      },
      "configuration": {
        "request": "launch",
        "type": "python",
        "cwd": "${CWD:${workspaceRoot\\}}",
        "program": "${file}",
        "stopOnEntry": true,
        "console": "integratedTerminal",
        "args": [ "*${args}" ],
        "python": "${python}"
      },
    },

The equivalent for an attach looks like this:

      "adapter": {
        "host": "${hostName}",
        "port": "${port}"
      },
      "configuration": {
        "request": "attach"
      },

An indteresting case is Delve which requires launching, and using TCP (not stdio), and uses the TTY of delve itself for the debugee:

    "adapter": {
      "command": [
        "${gadgetDir}/delve/bin/dlv",
        "dap",
        "--listen",
        "${listenOn}:${port}",
        "*${dlvFlags}"
      ],
      "port": "${port}",
      "tty": true,
    }

Anyway point is that it's all driven by this metadata - there's no "VimspectorDelveConfiguratorFactorySingleton" and "VimspectorDebugpyConfiguratorAbstractFactory" , or whatever. There's just generic code which does some actions to launch the thing then talks DAP at it, the user providing the necessary instructions.

In practice, Vimspector comes with some configurations of a handful of adapters to help users as this is not simple configuration, but users can set up their own using any DA. I believe that similar is done in nvim-dap , perhaps with a plugin system of some kind.

It's certainly possible that for multi-session debugging to work, moar configuration and/or instructions are required but I'm just not clear yet on exactly what that would be.

weinand commented 2 years ago

@puremourning thanks a lot for the info. That is very helpful.

So if a DA is configured with communication option Please make a tcp connection to port ${port} and host $host", then all DA sessions would end up in the same process. And for startDebugging a child session would have access to its parent session, correct?

mfussenegger commented 2 years ago

I assume that you do not use VS Code extension code

Correct

How exactly are you consuming DAs?

Similar to vimspector, nvim-dap lets users declare them in a somewhat declarative way.

For example, if a user wanted to use debugpy, they declare an adapter like this:

dap.adapters.debugpy = {
  type = 'executable',
  command = '/usr/bin/python',
  args = {'-m', 'debugpy.adapter'}
}

When the user starts a session, it will run the executable and communicate via stdio.

If they want to use a debug adapter that supports TCP, users have two options:

1) Run the debug adapter manually in a terminal, and define an adapter like this:

dap.adapters.codelldb = {
  type = 'server',
  port = 1234,
  host = '127.0.0.1'
}

In this case if the users starts a debug session it tries to connect to the given port/host via TCP.

2) Define an adapter like this, to have nvim-dap spawn it whenever the user starts a debug session:

dap.adapters.codelldb = {
  type = 'server',
  port = '${port}',
  host = '127.0.0.1',
  executable = {
    command ='/usr/bin/codelldb',
    args = {'--port', '${port}'}
  }
}

In this case nvim-dap generates a free port number, spawns the process and connects to it via TCP.

There are some more advanced options, but this is pretty much the gist of it. Named sockets are currently not supported. (Mostly because so far I only saw the ruby debug adapter supporting it, and it works with TCP as well)

Users also need to create configurations for launch/attach. That works in a similar way or by loading vscode launch.json files (except only JSON, not JSON5, and only a subset of variables is supported; no tasks/inputs)

Opposed to vimspector, nvim-dap doesn't ship with any definitions, but only provides documentation on how to setup the most common ones. There are additional nvim-dap extensions which provide definitions (& some configurations) - usually per language, but users don't need to use them.

puremourning commented 2 years ago

So if a DA is configured with communication option "Please make a tcp connection to port ${port} and host $host", then all DA sessions would end up in the same process.

I guess theoretically, assuming that vimspector is taught not to launch a new instance in the "Pleaser launch this process, then connect to it on this port" case (codelldb, delve). Not sure how it would know to do that, but we can teach it if that's what's required, but it would be an heuristic - either the user or the DA needs to tell it to disregard part of the initial instruction.

mickaelistria commented 2 years ago

What Eclipse IDE DAP client does is similar to what is described by other clients: it doesn't use any VSCode lib and the configuration of the client is a mix of automated adapter-specific integration (for things like how to start the adapter, deciding of a port in case debug adapater uses TCP instead of -preferred- stdio, and other "low-level" settings users don't need to care about) plus some user configuration (for "applicative" parts like launch vs attach, arguments and so on) which are similar to the content of launch.json in VSCode. FWIW, my concern as DAP client maintainer is not really about sessions or processes or even transport. users don't really care if it's one or several debug adapter processes running, it it's stdio or TCP, nor even of sessions... What they want is just for things to work and at the moment, the issue is that some adapters (eg vscode-js-debug) cannot work because they require some operations or integration particularities that are not spec'd at all and that it's unclear whether and how to support them in the client. My hope is only that the necessary operations for such things to work can become standardized so they can be reliably implemented. So from a client perspective, I believe that 1 or more debug adapter processes really do not matter, and that client could work with either as easily once everything is well specific. The decision of whether to encourage multiple adapter sessions or not is probably more important for adapters themselves than for clients. About adding the launchType as extra argument; I do not see it as an immediate requirement in order to get vscode-js-debug working, so I think we can ignore it for the moment. But I may have misunderstood how vscode-js-debug works so if you think it's clearly better to keep it, it's fine to keep it, it wouldn't be a major annoyance for client.

weinand commented 2 years ago

@mickaelistria you raised an important point (that got lost in the discussion from above:

... the issue is that some adapters (eg vscode-js-debug) cannot work because they require some operations or integration particularities that are not spec'd at all and that it's unclear whether and how to support them in the client. My hope is only that the necessary operations for such things to work can become standardized so they can be reliably implemented.

@connor4312 and @int19h do you think, that the debug adapter of js-debug and debugpy could make use of the new startDebugging request, so that some of the "operations or integration particularities" could be removed from the two extensions in favour of the new implementation of startDebugging in the client?

connor4312 commented 2 years ago

Yes for js-debug; that should allow it to be usable by a pure DAP client without additional trickery.

roblourens commented 2 years ago

variable substitution is an implementation detail of a client and not part of the DAP spec. So a DA cannot/should not rely on this functionality.

But if the user configures variables in their launch.json, and this trickles down to a child debug session, then I suppose that's ok right?