Breaking in StdLib while not stepping into

echarles commented 2 years ago

JupyterLab 3.3.0a3 has a kernel sources panel which displays the loaded py files loaded (so the core python libraries, and userland libraries). This aims to offer the user the ability to add breakpoints and debug to see what happens in those userland libraries.

https://github.com/ipython/ipykernel/pull/812 has added the needed option to be able break into those stdlib files, while https://github.com/ipython/ipykernel/pull/839 has removed it. The removal is due to https://github.com/ipython/ipykernel/issues/832 where the step was going to the line in the call stack. Interesting to note, today the side effect describe in https://github.com/jupyterlab/jupyterlab/pull/11566#issuecomment-983668657 (breaking in selector.py) has disappeared.

Thinking a bit more, the behavior is expected. When we say to debugpy to break into all files, we also say to step into all files, meaning that the taken step goes to the next in the call stack, including the StdLib files.

@JohanMabille You say on https://github.com/ipython/ipykernel/issues/832#issuecomment-1010414146 that you would think to a solution ...changing the step over behavior, or filter stop events. Is there anything we can do in the short therm?

echarles commented 2 years ago

Going back to the root of the usecase, the need to break in the userland/stdlib libraries is for users willing to inspect what is going in those areas (variables...). I may say that it would be normal for such cases to just step into those code, so the reported issue in that case is not an issue, but a feature.

If we agree on that, A first question would be to give the user the option to start a session with or without stdlib:

A setting
A second button
A drop-down button (better, but not available today in the lumino toolkit)

A second question is how can the kernel be informed of the user choice. As we manage the protocol between the frontend and the server, this is doable at first sight.

JohanMabille commented 2 years ago

We want the user to be able to step into the stdlib, not to force him to step into the internals of ipykernel. When you debug, if you click next when stopped on a function call, you don't jump into it, you just go to the next line of code. I think this shortcut would lead to a terrible user experience (also because at some point you get stucked into the ioloop of tornado and has no choice but set a breakpoint and click on continue to reach the next statement).

This is a technical problem at the kernel level, and should not overflow on the protocol and the UI.

The only solution I see for now is to have a shadow breakpoint (meaning the frontend does not see them). If the option for debugging external code is set to True, when receiving a next request, we first set a breakpoint to the next line, send a continue request, and turned the continue response from debugpy into a next response for the frontend. This also requires to capture the stopped / continued event and modify them accordingly before forwarding them to the frontend.

This is quite easy to setup for a cell, a bit more complex for external files, and even more complex for function call, since you may step into the function (or simply break into it), send next until you reach the end of the function, and then click next again, and you should jumped back to the line after the function call.

This last usecase would require additional complexity to retrieve the call site of the function (thanks to the stacktrace) and set the shadow breakpoint accordingly.

echarles commented 2 years ago

We want the user to be able to step into the stdlib, not to force him to step into the internals of ipykernel.

I am not sure of that. Python stdlib is defined in https://docs.python.org/3/library and does not cover user libraries like numpy or pandas. I will continue using stdlib and user terminology. For ipykernel, tornado... related code, they are not stdlib nor user, so maby system terminology can be used.

It looks like the 'StdLib' option shipped by debugpy enable or disable breakpoints for stdlib, system and user code. We can name those libraries external.

A user willing to set a breakpoint outside of the notebook, so eg in a numpy or foo custom package he is using, will also want to step into that code portion.

I agree that stdlib and system should ideally be excluded.

When you debug, if you click next when stopped on a function call, you don't jump into it, you just go to the next line of code. I think this shortcut would lead to a terrible user experience (also because at some point you get stucked into the ioloop of tornado and has no choice but set a breakpoint and click on continue to reach the next statement).

The terrible user experience is the intended user experience if the user wants to debug some libraries he depends upon. That is the whole point of the work done to show the kernel source tab in jupyterlab.

This is a technical problem at the kernel level, and should not overflow on the protocol and the UI.

If this is possible, this will be ideal.

The only solution I see for now is to have a shadow breakpoint (meaning the frontend does not see them). If the option for debugging external code is set to True, when receiving a next request, we first set a breakpoint to the next line, send a continue request, and turned the continue response from debugpy into a next response for the frontend. This also requires to capture the stopped / continued event and modify them accordingly before forwarding them to the frontend.

This is quite easy to setup for a cell, a bit more complex for external files, and even more complex for function call, since you may step into the function (or simply break into it), send next until you reach the end of the function, and then click next again, and you should jumped back to the line after the function call.

This last usecase would require additional complexity to retrieve the call site of the function (thanks to the stacktrace) and set the shadow breakpoint accordingly.

I roughly see what you mean and sounds terrible to implement. You say If the option for debugging external code is set to True, so we agree there are two ways to initiate the debug session: with and without external code.

I am wondering what the debuggers of the external code will say. Are they ok to step into all external code and just set an additional breakpoint in case they want to progress faster?

JohanMabille commented 2 years ago

The issue is that with the current implementation, as soon as you allow to debug external libraries (whatever they are), you cannot execute a next request when stopped in a cell without getting into ipykernel internals. And this is definitely not what the user wants, see https://github.com/ipython/ipykernel/issues/832 that was opened after the release breaking the original behavior.

I agree that stdlib and system should ideally be excluded.

I think we should have the possibility to jump into them, not totally exclude them from debugging, nor forcing the user to jump into the kernel code as soon as you hit the next button.

I roughly see what you mean and sounds terrible to implement. You say If the option for debugging external code is set to True, so we agree there are two ways to initiate the debug session: with and without external code.

Yes, but ideally the user should not have to bother with that, he should always be able to jump into external code. Giving an option to the user here is admitting that this behavior (being forced to break into ipykernel internal) is annoying but we won't fix it when the user wants to break into external code.

I agree that the solution I proposed is complicated to implement, but this looks like the clean way to do it for me.

echarles commented 2 years ago

Going back to the user experience, I have give here a 4 debugging session configured in a way that the breakpoints are activated in the external sources:

py file with VS-Code
ipynb with VS-Code
py file with JupyterLab
ipynb with VS-Code (read more on the current state of justMyCode in VS Code notebook on https://github.com/microsoft/vscode-jupyter/issues/8146)

The main difference is that the call stack for notebook in VS Code is not populated with the ipkernel sources. Looking at the last session, even if the next step action should be avoided in some case, I can see some value inspecting variable and looping in the external code blocks.

Untitled

Untitled2

Untitled3

Untitled4

echarles commented 2 years ago

I will open a PR to enable the experimental StdLib option based on a flag sent with the initialisation request. This flag could be set with a jupyterlab experimental setting.

With this approach, users will not be impacted. Only early adopters could activate that option and use in a wise way the navigation in the callstack.

We can discuss and plan more robust solution.

ipython / ipykernel

Breaking in StdLib while not stepping into #854