Open JohanMabille opened 5 years ago
Very nice! Long time needed. Thank you all for working on this and contributing to the project!
The "phase1" of the development in xeus-python is finished. It can now start and stop an instance of ptvsd, many times in the same session, and it can forward messages from ptvsd to the frontend and vice versa.
Now comes the problem of mapping Notebook cells to files. When setting a breakpoint, a path to a file containing the code and a line number must be specified. The file must exist, otherwise ptvsd won't break. This means that the kernel must create a file or several files containing the code of the Notebook cells. Besides, setting a breakpoint in a file actually removes all the breakpoints previously set, so it is mandatory to specify a list of breakpoints when sending a breakpoint request.
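Because each breakpoint request replaces the breakpoints previously set in a file, whoever sends the request has to remember the full list per file. Below is a minimal sketch of that bookkeeping. The `BreakpointTable` class and its method names are hypothetical; the request body follows the `setBreakpoints` request of the Debug Adapter Protocol, and how ptvsd numbers `seq` in practice is not something this sketch tries to reproduce.

```python
from collections import defaultdict


class BreakpointTable:
    """Hypothetical helper: remembers all breakpoints per file so that
    every request can carry the complete list for that file."""

    def __init__(self):
        self._lines = defaultdict(set)  # file path -> set of line numbers
        self._seq = 0                   # sequence counter for outgoing requests

    def add(self, path, line):
        self._lines[path].add(line)

    def remove(self, path, line):
        self._lines[path].discard(line)

    def set_breakpoints_request(self, path):
        """Build a DAP setBreakpoints request listing every breakpoint in `path`."""
        self._seq += 1
        return {
            "seq": self._seq,
            "type": "request",
            "command": "setBreakpoints",
            "arguments": {
                "source": {"path": path},
                "breakpoints": [{"line": n} for n in sorted(self._lines[path])],
            },
        }
```

Removing a breakpoint is then just another full request with one entry fewer, and an empty list clears the file.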
Typical usage of the debugger is to set breakpoints before starting the debugging session itself. Another use case is when a cell that already contains a breakpoint is modified before being executed again. Considering the constraints on breakpoints previously mentioned, the following sequence should handle all the scenarios:
debug_request message. This is an additional message to the Debug Adapter Protocol, specific to Jupyter. The kernel creates a file containing this code or updates it. The reply contains the path to the file (created or updated).
execute_request
Since this requires sending additional messages to the kernel, this should be done only when a debug session has started. This means that either the user cannot set breakpoints before starting the debugger, or the frontend sends all the cells and the existing breakpoint list upon debugger start.
Although this is kernel specific and should not impact the design of the protocol, I think it might be interesting to discuss it here. Two approaches are possible:
either the kernel maintains a single file that is updated while the cells are modified. This makes the mapping more complicated since we have to compute offsets for the line numbers.
or the kernel maintains one file per cell. The file name can be based on the hash of the code or the cell's number. This is simpler than the previous solution; however, it can significantly increase the number of open files. We can limit them if the debug_request message for a cell update contains the current cell's number so that the kernel can remove the mapping file before creating a new one.
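To make the second approach concrete, here is a rough sketch of a one-file-per-cell mapping that names files after a hash of the code and removes the stale file when a cell is updated. Everything in it (`CellFileMapper`, the `cell_<hash>.py` naming, the use of a temporary directory) is an assumption for illustration, not what xeus-python does.

```python
import hashlib
import os
import tempfile


class CellFileMapper:
    """Hypothetical kernel-side helper: one file per cell, named by code hash."""

    def __init__(self, directory=None):
        # Assumption: mapping files live in a dedicated temporary directory.
        self._dir = directory or tempfile.mkdtemp(prefix="cells_")
        self._path_by_cell = {}  # cell number -> path of its current file

    def update(self, cell_number, code):
        """Write `code` to a file and return its path, dropping the stale file."""
        digest = hashlib.sha1(code.encode("utf-8")).hexdigest()[:12]
        path = os.path.join(self._dir, f"cell_{digest}.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(code)

        old = self._path_by_cell.get(cell_number)
        self._path_by_cell[cell_number] = path
        # Remove the previous file unless another cell with identical
        # code still maps to it.
        if old and old != path and old not in self._path_by_cell.values():
            if os.path.exists(old):
                os.remove(old)
        return path
```

Knowing the cell's number is what lets the kernel find and delete the old file, which keeps the number of files bounded by the number of cells.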
I am not yet clear on all the potential limitations we might still encounter with ptvsd, so I will play a bit with it and update this accordingly.
This is specific to the frontend. When stepping into a function that is defined in an imported file, the frontend should open this file in text mode and allow setting breakpoints in it.
At QuantStack, we are working on a prototype debugger for the Jupyter ecosystem. This issue tracks the changes we need to make to get it done; some of them might be integrated independently from the debugger if they are considered a global improvement.
The idea is to reuse the Debug Adapter Protocol from Microsoft. These messages are wrapped in new Jupyter messages that are sent on the Control channel.
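As a rough illustration of the wrapping, the sketch below builds a Jupyter message of type debug_request around a Debug Adapter Protocol request. The envelope fields (header, parent_header, metadata, content) are the usual Jupyter message layout; placing the DAP request directly in `content`, the `make_debug_request` helper, the `channel` key, and the placeholder username and version values are assumptions made for this example.

```python
import uuid
from datetime import datetime, timezone


def make_debug_request(dap_request, session):
    """Wrap a DAP request dict in a Jupyter-style debug_request message."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,
            "session": session,
            "username": "user",   # placeholder value
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "debug_request",
            "version": "5.5",     # placeholder protocol version
        },
        "parent_header": {},
        "metadata": {},
        # Assumption: the DAP request is carried unchanged as the content.
        "content": dap_request,
        # Debug requests travel on the Control channel, not on Shell.
        "channel": "control",
    }


dap_request = {
    "seq": 1,
    "type": "request",
    "command": "setBreakpoints",
    "arguments": {
        "source": {"path": "/tmp/cell_1.py"},  # hypothetical path
        "breakpoints": [{"line": 3}],
    },
}
message = make_debug_request(dap_request, session=uuid.uuid4().hex)
```

The Jupyter side only needs to know the three wrapper message types; the meaning of `command` and `arguments` stays defined by the Debug Adapter Protocol.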
1] Changes in the protocol
[x] The Shell and Control channels should accept different messages; AFAIK this is somehow the current implementation, where most of the messages are sent on the Shell, while shutdown_request and interrupt_request are sent on the Control. However, it would be clearer to have it formally stated in the protocol itself. @jasongrout opened #388 to update the documentation along these lines.
[x] Three new messages are added to the protocol: debug_request, debug_reply and debug_event. debug_request wraps a request described in the Debug Adapter Protocol and is sent on the Control channel exclusively. It expects a debug_reply wrapping a response described in the Debug Adapter Protocol. debug_event is sent by the kernel on the IOPub channel. Having the messages from the Debug Adapter Protocol wrapped in Jupyter Messages avoids copying the full specification from Microsoft and "polluting" the Jupyter specification with a lot of new messages that might be ambiguous (terminate, configurationDone, ...). Done in #464, and #502.
2] Changes in the frontends
Frontends should be able to send messages directly on the Control channel, which is currently not exposed at all:
sendControlMessage as well as Interfaces for messages sent on the Control channel in the @jupyterlab/services package.
debug_request messages, handle debug_reply and debug_event messages in the @jupyterlab/services package.
jupyter_client classes that are used in the implementation of the Notebook Services and the Jupyter Server Services. In the current implementation, the frontend sends all the messages on the Shell channel, and some messages such as shutdown_request get special handling to be sent on the Control channel.
Optional:
notebook/services/kernels/kernelmanager.py. Messages whose channel attribute is control should be rerouted to these methods.
3] Changes in the backends
The required changes are specific to the implementations of the different kernels. However, the same approach can be used:
debug_request messages and reroute them to the debug adapter. Wrap responses into debug_reply messages, and events into debug_event messages.
As an example, we are using xeus and xeus-python to experiment with PTVSD.
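Independently of any particular kernel, the return path described above could look roughly like the following. The sketch relies only on the `type` field that every Debug Adapter Protocol message carries ("request", "response" or "event"); the `wrap_adapter_message` function and the tuple it returns are hypothetical and do not reflect how xeus implements it.

```python
def wrap_adapter_message(dap_message):
    """Decide how a message coming back from the debug adapter is wrapped.

    Returns a (channel, msg_type, content) tuple for the kernel to send.
    """
    kind = dap_message.get("type")
    if kind == "response":
        # Responses answer a debug_request, so they go back on Control.
        return ("control", "debug_reply", dap_message)
    if kind == "event":
        # Events are unsolicited and are published on IOPub.
        return ("iopub", "debug_event", dap_message)
    raise ValueError(f"unexpected message type from debug adapter: {kind!r}")


stopped = {"seq": 7, "type": "event", "event": "stopped",
           "body": {"reason": "breakpoint", "threadId": 1}}
channel, msg_type, content = wrap_adapter_message(stopped)
```

Keeping the kernel unaware of individual commands and events means new Debug Adapter Protocol features need no change on the kernel's forwarding path.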
cc @jasongrout @SylvainCorlay @afshin @ivanov @martinRenou @wolfv
EDIT: Actually exposing methods to handle debug messages in the Notebook server is not mandatory if we don't plan to add the debugger to the notebook. Exposing the control channel is enough as long as JupyterLab depends on the Notebook server. Exposing the control channel in the Jupyter server is useful if JupyterLab switches from the Notebook server to the Jupyter server at some point.