microsoft / vscode-jupyter

VS Code Jupyter extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
MIT License
1.31k stars 293 forks source link

Display some error message about the fact that some remote kernels are no longer usable to avoid long spinnings of kernels in kernel picker #15911

Closed p-i- closed 3 months ago

p-i- commented 4 months ago

Type: Bug

Kernel-picker icon is spinning

Output -> Python:

:
2024-08-01 22:35:26.028 [warning] could not find a pixi interpreter for the interpreter at /Users/pi/code/2024/spins/.venv/bin/python
:
2024-08-01 22:35:26.856 [info] Starting Pylance language server. <-- last message

^ Maybe this is the issue -- it never starts? I've tried reinstalling Jupyter and Pylance extensions, no luck.

Output -> Jupyter:

:
22:35:26.717 [info] Kernel successfully started
:
22:35:26.862 [debug] Interpreter for Pylance for Notebook URI "~/code/2024/s~ns/x.ipynb" is ~/code/2024/s~ns/.venv/bin/python
22:36:40.949 [debug] Error in detecting whether url is isJupyterHub: FetchError: request to http://10.3.0.1:6666/hub/a~ failed, reason: connect ETIMEDOUT 10.3.0.1:6666
22:37:55.958 [error] Failed to validate Password info D [FetchError]: request to http://10.3.0.1:6666/ failed, reason: connect ETIMEDOUT 10.3.0.1:6666
    at ClientRequest.<anonymous> (/Users/~/.vscode/extensions/ms-toolsai.jupyter-2024.7.0-darwin-arm64/dist/node_modules/node-fetch.js:1:306479)
    at ClientRequest.emit (node:events:531:35)
    at Socket.socketErrorListener (node:_http_client:500:9)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at processTicksAndRejections (node:internal/process/task_queues:82:21) {
  type: 'system',
  errno: 'ETIMEDOUT',
  code: 'ETIMEDOUT'
}
22:38:25.983 [error] Jupyter Lab Helper cannot enumerate kernelspecs. Returning a default kernel. Specs {}.
22:39:10.969 [error] Error fetching kernels from http://10.3.0.1:6666/ (10.3.0.1): FetchError: request to http://10.3.0.1:6666/a~/kernels?1722548275965 failed, reason: connect ETIMEDOUT 10.3.0.1:6666
    at ClientRequest.<anonymous> (/Users/~/.vscode/extensions/ms-toolsai.jupyter-2024.7.0-darwin-arm64/dist/node_modules/node-fetch.js:1:306479)
    at ClientRequest.emit (node:events:531:35)
    at Socket.socketErrorListener (node:_http_client:500:9)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
22:39:10.969 [trace] Disposing Jupyter Lab Helper
22:39:10.969 [trace] SessionManager - dispose contents manager
22:39:10.970 [trace] ShutdownSessionAndConnection - dispose session manager
22:39:10.972 [trace] Finished disposing Jupyter Lab Helper
22:40:25.975 [debug] Error in detecting whether url is isJupyterHub: FetchError: request to http://10.3.0.1:6666/hub/a~ failed, reason: connect ETIMEDOUT 10.3.0.1:6666

This all started because I was trying to set a breakpoint on a line of an .ipynb cell in another folder.

fwiw I'm seeing in same above log:

22:35:26.450 [trace] Python A~ env change detected, update => '/Users/~/code/2024/arc2/.venv/bin/python'
22:35:26.019 [trace] Hiding default KernelSpec /python for interpreter ~/code/2024/arc2/.venv/bin/python (KernelSpec file ~/code/2024/arc2/.venv/share/jupyter/kernels/python3/kernel.json)
22:35:26.024 [trace] Hiding default KernelSpec /python for interpreter ~/code/2024/s~ns/.venv/bin/python (KernelSpec file ~/code/2024/s~ns/.venv/share/jupyter/kernels/python3/kernel.json)
22:35:26.028 [trace] Conda file is conda

(hmm I don't have conda)

I haven't been in this arc2/ folder for several vscode restarts now. Could be related?

AH -- at this point, after writing all of the above, it has finally stopped spinning & gives the correct ".venv (Python 3.12.3)".

Here's the tail-end of the Output -> Jupyter:

22:42:11.000 [error] Jupyter Lab Helper cannot enumerate kernelspecs. Returning a default kernel. Specs {}.
22:42:55.987 [error] Error fetching kernels from http://10.3.0.1:6666/ (10.3.0.1): FetchError: request to http://10.3.0.1:6666/a~/kernels?1722548500982 failed, reason: connect ETIMEDOUT 10.3.0.1:6666
    at ClientRequest.<anonymous> (/Users/~/.vscode/extensions/ms-toolsai.jupyter-2024.7.0-darwin-arm64/dist/node_modules/node-fetch.js:1:306479)
    at ClientRequest.emit (node:events:531:35)
    at Socket.socketErrorListener (node:_http_client:500:9)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
22:42:55.988 [trace] Disposing Jupyter Lab Helper
22:42:55.988 [trace] SessionManager - dispose contents manager
22:42:55.989 [trace] ShutdownSessionAndConnection - dispose session manager
22:42:55.990 [trace] Finished disposing Jupyter Lab Helper
22:42:55.993 [error] UniversalRemoteKernelFinder: Failed to get kernels without cache FetchError: request to http://10.3.0.1:6666/a~/kernels?1722548500982 failed, reason: connect ETIMEDOUT 10.3.0.1:6666
    at ClientRequest.<anonymous> (/Users/~/.vscode/extensions/ms-toolsai.jupyter-2024.7.0-darwin-arm64/dist/node_modules/node-fetch.js:1:306479)
    at ClientRequest.emit (node:events:531:35)
    at Socket.socketErrorListener (node:_http_client:500:9)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at processTicksAndRejections (node:internal/process/task_queues:82:21)
22:42:55.993 [trace] UniversalRemoteKernelFinder: Writing 0 remote kernel connection metadata to cache
22:42:56.012 [debug] End refreshing Kernel ~cker (1722548125554)

And here's the tail-end of the Output -> Python:

2024-08-01 22:35:26.856 [info] Starting Pylance language server.

(i.e. THAT hasn't changed, so looks like the Pylance was a red-herring)

One thing that still bugs me is:

2024-08-01 22:35:26.028 [warning] could not find a pixi interpreter for the interpreter at /Users/pi/code/2024/spins/.venv/bin/python

(... in the Output -> Python) I don't know what this pixi thing is, and the message doesn't easily google. First hit goes to an issue that's an AB problem (user can't get Hello World working)

TLDR:

22:35:25.554 [debug] Start refreshing Kernel ~cker (1722548125554)
22:42:56.012 [debug] End refreshing Kernel ~cker (1722548125554)

7 mins. :/

Please fix this. I've lost 2 hours restarting, reinstalling, digging thru logs, composing this.

fwiw I was able to execute a cell & get output before the picker stopped spinning.

Extension version: 2024.7.0 VS Code version: Code 1.92.0 (b1c0a14de1414fcdaa400695b4db1c0799bc3124, 2024-07-31T23:26:45.634Z) OS version: Darwin arm64 23.5.0 Modes:

System Info |Item|Value| |---|---| |CPUs|Apple M2 (8 x 2400)| |GPU Status|2d_canvas: enabled
canvas_oop_rasterization: enabled_on
direct_rendering_display_compositor: disabled_off_ok
gpu_compositing: enabled
multiple_raster_threads: enabled_on
opengl: enabled_on
rasterization: enabled
raw_draw: disabled_off_ok
skia_graphite: disabled_off
video_decode: enabled
video_encode: enabled
webgl: enabled
webgl2: enabled
webgpu: enabled
webnn: disabled_off| |Load (avg)|2, 2, 2| |Memory (System)|24.00GB (0.64GB free)| |Process Argv|--crash-reporter-id f10d97cd-2115-4dba-a34a-07be9312995a| |Screen Reader|no| |VM|0%|
A/B Experiments ``` vsliv368cf:30146710 vspor879:30202332 vspor708:30202333 vspor363:30204092 vstes627:30244334 vscorecescf:30445987 vscod805cf:30301675 binariesv615:30325510 vsaa593cf:30376535 py29gd2263:31024239 vscaac:30438847 c4g48928:30535728 azure-dev_surveyone:30548225 2i9eh265:30646982 962ge761:30959799 pythongtdpath:30769146 welcomedialog:30910333 pythonnoceb:30805159 asynctok:30898717 pythonregdiag2:30936856 pythonmypyd1:30879173 2e7ec940:31000449 pythontbext0:30879054 accentitlementst:30995554 dsvsc016:30899300 dsvsc017:30899301 dsvsc018:30899302 cppperfnew:31000557 dsvsc020:30976470 pythonait:31006305 dsvsc021:30996838 jg8ic977:31013176 pythoncenvpt:31062603 a69g1124:31058053 dvdeprecation:31068756 dwnewjupytercf:31046870 impr_priority:31102340 refactort:31108082 ccplc:31103425 pythonrstrctxt:31103193 wkspc-onlycs-c:31106320 wkspc-ranged-c:31107834 ```
DonJayamanne commented 4 months ago

Sorry you are running into this and thanks for filing this issue I can see a few issues here

Lets assume the kernel spinner is a non issue. i.e. lets assume this is always spinning, and you can ignore this. But the problem you are still running into is the fact that the correct .venv takes a while to appear, is that correct. I'd like to separate the two, ie. the spinning icon could still appear whilst other kernels are being discovered, but in your case you want to see .venv immediately.

Please can you confirm this.

Finally, please can you share the full logs, I'd like to see exactly whats going on and see whats slow. Thanks for this, however it doesn't show where the time was spent.

22:35:25.554 [debug] Start refreshing Kernel ~cker (1722548125554)
22:42:56.012 [debug] End refreshing Kernel ~cker (1722548125554)

Finally, please enable trace logging before sharing the logs again (again, please share the full logs) Please could you enable logging as follows:

DonJayamanne commented 4 months ago

I haven't been in this arc2/ folder for several vscode restarts now. Could be related?

Most likely due to the fact that you have some remote jupyter servers that are no longer available. However there are two issues here

  1. Spinner is visible for a long time
  2. .venv takes a while to appear (this could be related to some delay in Python extension, lets first have a look at the Jupyter logs)
p-i- commented 4 months ago

@DonJayamanne Thankyou for your attention.

Just hover over the item and then click the x icon (see images)

I followed this instruction; there was indeed one item, which I deleted. I forgot to take a screenshot first, but I'm pretty sure it started with 10.3.... and was the http://10.3.0.1:6666/ visible in my prior log dump.

Restarting VSCode the issue is gone.

Unfortunately this does mean I can't collect logs that would indicate what is chewing up at the time. The previous log except that I posted WERE already at trace level, however. Looks to me like VSCode is repeatedly timing out trying to connect to this rogue kernel.

Why do I even have such a rogue kernel? Last year I did some contract work in involving connecting to a client's remote machine, but I've been using VSCode ONLY for my own local Python projects this year.

Also, why did this problem just suddenly spring up today?

Yesterday I played with https://github.com/ConGustoAI/friendlly -- I've sent this thread to the author in case he can shine some light on this.

So, I'm now back to the original problem I was trying to fix before this issue: that if I create a fresh notebook and set a breakpoint and debug-run the cell, it appears to hang; it should reach the breakpoint and highlight that line red, but doesn't. I will file that as a separate issue.

One takeaway I can see from this is that it would be very nice if VSCode was more transparent about what's going on. I had to file an issue and follow an instruction to remove this timing-out kernel. it would have been nice to be able to see this information upon clicking the spinning kernel-picker icon. Maybe see a dropdown of kernels with a tick next to each item that has completed, and a spinner next to each item that hasn't.

Then I would naturally have observed that the currently selected kernel was FINE, but a rogue one was timing out, and would have investigated/removed the rogue one.

What actually happened is that I was experiencing a DIFFERENT problem (debugger-fail) together with THIS one (spinning), and looking at the logs saw TWO more potentially related symptoms:

1.
2024-08-01 22:35:26.028 [warning] could not find a pixi interpreter for the interpreter at /Users/pi/code/2024/spins/.venv/bin/python

2.
2024-08-01 22:35:26.856 [info] Starting Pylance language server. <-- last message

(for 2., I think it should either say "Pylance language server STARTED" or there should be a subsequent message to that effect, otherwise it looks like the thing is hanging: I remember there were previous messages for another process that DID have separate messages for "Starting..." and "started!".

Just as a thought, wouldn't it help to have log messages hyperlink to knowledge-base articles? Some of your extensions have 5M+ downloads, which reflects a huge user base. I would imagine such a policy would significantly reduce the load on staffing issue tickets.

p-i- commented 4 months ago

fwiw This is the problem I was actually trying to solve: https://github.com/microsoft/vscode-jupyter/issues/15914

DonJayamanne commented 3 months ago

Why do I even have such a rogue kernel? Last year I did some contract work in involving connecting to a client's remote machine, but I've been using VSCode ONLY for my own local Python projects this year.

Remote Jupyter servers used at any point are always linked to your user profile. Possible you never enabled symcing of profiles and you logged into your githu account or the like in VS Code just recently.

the original problem I was trying to fix before this issue: that if I create a fresh notebook and set a breakpoint and

Lets close in favour of #15914

] could not find a pixi interpreter for the interpreter at /Users/pi/code/2024/

This is caused by Python extension, see here https://github.com/microsoft/vscode-python/issues/23901

p-i- commented 3 months ago

Wait, this is a completely separate issue AFAICS.

I fixed the spinning kernel-picker--icon by removing a rogue kernel.

I think this ticket is still actionable -- VSCode should indicate which kernel(s) are causing the spinning kernel-picker icon when you click on it, no?

i.e. I had to be told where to look to find this rogue kernel, which means many other users may experience the same frustration point.

The issue you've closed this in favour of is nothing to do with this issue. I know this because AFTER deleting the rogue kernel I restarted VSCode in a fresh folder, and STILL got the debugger-fail. That one's related to the non-default setting of JustMyCode to off (since restoring to default value fixes it).

PS Maybe remove the info-needed tag?

DonJayamanne commented 3 months ago

i.e. I had to be told where to look to find this rogue kernel, which means many other users may experience the same frustration point.

The assumption is that if you have any remote kernels that you are no longer using then they'd be removed. I can leave this issue open as a feature request.