Supporting the Debug Adapter Protocol

At QuantStack, we are working on a prototype debugger for the Jupyter ecosystem. This issue tracks the changes we need to make to get it done; some of them might be integrated independently of the debugger if they are considered a global improvement.

The idea is to reuse the Debug Adapter Protocol from Microsoft. Its messages are wrapped in new Jupyter messages that are sent on the Control channel.

1] Changes in the protocol

[x] The Shell and Control channels should accept different messages. AFAIK this is more or less the current implementation: most messages are sent on the Shell channel, while shutdown_request and interrupt_request are sent on the Control channel. However, it would be clearer to have this formally stated in the protocol itself. @jasongrout opened #388 to update the documentation along these lines.
[x] Three new messages are added to the protocol: debug_request, debug_reply and debug_event. debug_request wraps a request described in the Debug Adapter Protocol and is sent on the Control channel exclusively. It expects a debug_reply wrapping a response described in the Debug Adapter Protocol. debug_event is sent by the kernel on the IOPub channel. Wrapping the messages from the Debug Adapter Protocol in Jupyter messages avoids copying the full specification from Microsoft and "polluting" the Jupyter specification with a lot of new messages whose names might be ambiguous (terminate, configurationDone, ...). Done in #464 and #502.
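
To make the wrapping concrete, here is a minimal sketch of a debug_request carrying a Debug Adapter Protocol request. The helper name make_debug_request is hypothetical, and placing the DAP request directly in the content field is an assumption of this sketch; the header follows the general Jupyter message layout, with the version field omitted.

```python
import uuid
from datetime import datetime, timezone


def make_debug_request(dap_request, session_id):
    """Wrap a Debug Adapter Protocol request in a Jupyter-style message dict.

    Hypothetical helper: the DAP request is carried as-is in ``content``
    (an assumption of this sketch, not a statement of the final spec).
    """
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session_id,
            "username": "",
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "debug_request",
        },
        "parent_header": {},
        "metadata": {},
        "content": dap_request,
        # debug_request is sent on the Control channel exclusively.
        "channel": "control",
    }


# A DAP request has a sequence number, a type and a command.
dap_request = {"seq": 1, "type": "request", "command": "configurationDone"}
message = make_debug_request(dap_request, session_id="example-session")
```

The Jupyter layer only adds routing information; the DAP payload is left untouched, so the frontend and the debug adapter can evolve with the Microsoft specification without further changes to the Jupyter protocol.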

2] Changes in the frontends

Frontends should be able to send messages directly on the Control channel, which is currently not exposed at all:

[x] JupyterLab should expose a sendControlMessage method, as well as interfaces for the messages sent on the Control channel, in the @jupyterlab/services package.
[x] JupyterLab should expose specific methods to send debug_request messages, handle debug_reply and debug_event messages in the @jupyterlab/services package.
[x] The Notebook server should publicly expose the control channel
[x] The Jupyter server should publicly expose the control channel
[x] The Control channel should be exposed by jupyter_client classes that are used in the implementation of the Notebook Services and the Jupyter Server Services. In the current implementation, the frontend sends all the messages on the Shell channel, and some messages such as shutdown_request get a special handling to be sent on the Control channel.
[x] Design a UI ;) (for the prototype we will quickly make a UI to test) EDIT: UI is available at https://github.com/jupyterlab/debugger/

Optional:

[ ] The Notebook server should expose the same methods as JupyterLab in notebook/services/kernels/kernelmanager.py. Messages whose channel attribute is control should be rerouted to these methods.
[ ] The Jupyter server should be updated the same way.

3] Changes in the backends

The required changes are specific to the implementation of each kernel. However, the same approach can be used in all of them:

Handle the Control and Shell channels separately.
Choose a debugger and implement the Debug Adapter Protocol to communicate with it.
Unwrap the requests from the received debug_request messages and reroute them to the debug adapter. Wrap the responses into debug_reply messages, and the events into debug_event messages.
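
The unwrap/wrap step can be sketched as follows. This is only an illustration under stated assumptions: FakeAdapter, handle_debug_request and wrap_debug_event are hypothetical names, the adapter acknowledges every request instead of talking to a real debugger, and messages are plain dicts with the DAP payload in content.

```python
class FakeAdapter:
    """Hypothetical adapter that acknowledges every request without a real debugger."""

    def __init__(self):
        self._seq = 0

    def handle(self, request):
        # Build a DAP response that refers back to the request it answers.
        self._seq += 1
        return {
            "seq": self._seq,
            "type": "response",
            "request_seq": request["seq"],
            "success": True,
            "command": request["command"],
        }


def handle_debug_request(message, adapter):
    """Unwrap a debug_request, forward it to the adapter, wrap the response."""
    dap_request = message["content"]
    dap_response = adapter.handle(dap_request)
    return {
        "header": {"msg_type": "debug_reply"},
        "parent_header": message["header"],
        "content": dap_response,
        "channel": "control",
    }


def wrap_debug_event(dap_event):
    """Wrap a DAP event coming from the adapter into a debug_event for IOPub."""
    return {
        "header": {"msg_type": "debug_event"},
        "parent_header": {},
        "content": dap_event,
        "channel": "iopub",
    }
```

A real kernel would replace FakeAdapter with the connection to its debugger (ptvsd in the case of xeus-python) and fill in complete Jupyter headers.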

As an example, we are using xeus and xeus-python to experiment with ptvsd.

cc @jasongrout @SylvainCorlay @afshin @ivanov @martinRenou @wolfv

EDIT: Actually exposing methods to handle debug messages in the Notebook server is not mandatory if we don't plan to add the debugger to the notebook. Exposing the control channel is enough as long as JupyterLab depends on the Notebook server. Exposing the control channel in the Jupyter server is useful if JupyterLab switches from the Notebook server to the Jupyter server at some point.

"Phase 1" of the development in xeus-python is finished. It can now start and stop an instance of ptvsd several times in the same session, and it can forward messages from ptvsd to the frontend and vice versa.

1] Mapping Notebook cells to files

Now comes the problem of mapping Notebook cells to files. When setting a breakpoint, a path to a file containing the code and a line number must be specified. The file must exist, otherwise ptvsd won't break. This means that the kernel must create a file or several files containing the code of the Notebook cells. Besides, setting a breakpoint in a file actually removes all the breakpoints previously set, so it is mandatory to specify a list of breakpoints when sending a breakpoint request.
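
Because each breakpoint request for a file replaces the previous set, the frontend has to resend the complete list every time. A minimal sketch of building such a request, assuming the setBreakpoints request layout from the Debug Adapter Protocol; the helper name and the example path are hypothetical:

```python
def set_breakpoints_request(seq, path, lines):
    """Build a DAP setBreakpoints request carrying the complete list for one file."""
    return {
        "seq": seq,
        "type": "request",
        "command": "setBreakpoints",
        "arguments": {
            "source": {"path": path},
            # Duplicates are dropped and lines are sorted for readability.
            "breakpoints": [{"line": line} for line in sorted(set(lines))],
        },
    }


# Adding a breakpoint at line 7 when one already exists at line 3:
# both lines must be sent, otherwise the breakpoint at line 3 is removed.
request = set_breakpoints_request(seq=2, path="/tmp/cell_1.py", lines=[3, 7])
```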

Typical usage of the debugger is to set breakpoints before starting the debugging session itself. Another use case is when a cell that already contains a breakpoint is modified before being executed again. Considering the constraints on breakpoints mentioned previously, the following sequence should handle all the scenarios:

send the code of the cell in a debug_request message. This is an additional message to the Debug Adapter Protocol, specific to Jupyter. The kernel creates a file containing this code or updates it. The reply contains the path to the file (created or updated).
send the list of breakpoints for this cell; this message makes use of the file name returned in the previous reply.
send the execute_request
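
The three steps above can be sketched from the frontend's point of view as follows. The command name updateCell, the code argument and the path field of the reply body are placeholders invented for this sketch, since the additional message is not named here; send_debug and send_execute are hypothetical callables standing in for the real transport.

```python
import itertools


def run_cell_with_breakpoints(code, lines, send_debug, send_execute):
    """Send the cell code, then its breakpoints, then execute it.

    ``send_debug`` takes a DAP-style request and returns the unwrapped reply;
    ``send_execute`` sends an execute_request. Both are hypothetical callables.
    """
    seq = itertools.count(1)

    # Step 1: send the code; the kernel creates or updates the mapping file.
    reply = send_debug({
        "seq": next(seq),
        "type": "request",
        "command": "updateCell",  # placeholder name for the Jupyter-specific message
        "arguments": {"code": code},
    })
    path = reply["body"]["path"]  # placeholder field holding the file path

    # Step 2: send the complete breakpoint list for the returned file.
    send_debug({
        "seq": next(seq),
        "type": "request",
        "command": "setBreakpoints",
        "arguments": {
            "source": {"path": path},
            "breakpoints": [{"line": line} for line in lines],
        },
    })

    # Step 3: execute the cell as usual.
    send_execute(code)
    return path
```

Keeping the order fixed matters: the breakpoint request needs the path from the first reply, and the file must exist before execution starts, otherwise ptvsd won't break.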

Since this requires sending additional messages to the kernel, it should be done only once a debug session has started. This means that either the user cannot set breakpoints before starting the debugger, or the frontend sends all the cells and the existing breakpoint list when the debugger starts.

2] Implementation of the mapping in the kernel

Although this is kernel-specific and should not impact the design of the protocol, I think it is worth discussing here. Two approaches are possible:

either the kernel maintains a single file that is updated as the cells are modified. This makes the mapping more complicated, since we have to compute offsets for the line numbers.
or the kernel maintains one file per cell. The file name can be based on the hash of the code or on the cell's number. This is simpler than the previous solution, but it can significantly increase the number of files. We can limit their number if the debug_request message for a cell update contains the current cell's number, so that the kernel can remove the previous mapping file before creating a new one.
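
A minimal sketch of the second approach, one file per cell. The class name CellFileMap and the file naming scheme are assumptions made for illustration; combining the cell number with a hash of the code in the file name is a choice of this sketch so that two cells with identical code never share a file.

```python
import hashlib
import os
import tempfile


class CellFileMap:
    """Hypothetical kernel-side mapping from cell numbers to files on disk."""

    def __init__(self, directory):
        self.directory = directory
        self._paths = {}  # cell number -> path of its current mapping file

    def update(self, cell_number, code):
        """Write the cell's code to a file and return its path.

        The file previously associated with this cell, if any, is removed so
        that the number of files does not grow with every edit.
        """
        digest = hashlib.sha1(code.encode("utf-8")).hexdigest()[:12]
        path = os.path.join(self.directory, f"cell_{cell_number}_{digest}.py")

        old_path = self._paths.get(cell_number)
        if old_path is not None and old_path != path and os.path.exists(old_path):
            os.remove(old_path)

        with open(path, "w", encoding="utf-8") as handle:
            handle.write(code)
        self._paths[cell_number] = path
        return path


# Usage: the second update replaces the file created by the first one.
mapping = CellFileMap(tempfile.mkdtemp())
first = mapping.update(1, "a = 1\n")
second = mapping.update(1, "a = 2\n")
```

Since each file contains exactly one cell, the line numbers of breakpoints map directly to lines in the cell and no offset computation is needed.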

I am not entirely clear on all the potential limitations we might still encounter with ptvsd, so I will play with it a bit and update this accordingly.

3] Stepping into imported files

This is specific to the frontend. When stepping into a function that is defined in an imported file, the frontend should open this file in text mode and allow the user to set breakpoints in it.
