microsoft / vscode-jupyter

VS Code Jupyter extension
https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter
MIT License
1.3k stars 293 forks

Jupyter for vscode continues to be slow (for large notebooks with markdown cells & large outputs) #14459

Closed loftusa closed 7 months ago

loftusa commented 1 year ago

Every few months I try to use vscode for jupyter because I would really love to just use vscode for everything. Every few months, I am disappointed and switch back to the web version.

There are two reasons for this:

1) Jupyter for vscode continues, stubbornly, to essentially always be slower than traditional jupyter lab on localhost. Look at the run times in this screenshot. It took me a minute to run imports; when I ran the exact same code on the localhost version, it took 7.7 seconds (pictures attached). This is an extremely consistent theme in vscode jupyter. Cells will sometimes randomly take minutes to run, and will sometimes not even run at all until you press 'shift-enter' on them twice. This has been true for me across multiple computers, in many different dev environments.

Screenshot 2023-10-06 at 6 04 32 PM Screenshot 2023-10-06 at 6 13 02 PM

Cells also just randomly take forever to run, for god knows what reason. Here is a screenshot of assigning a string to a variable taking 27.4 seconds:

Screenshot 2023-10-06 at 6 44 34 PM 1

Note that I am not trying to blame the team here, I am just frustrated because this is so close to being a great product, but this one thing holds it back, and it keeps not being fixed for years on end. The very first thing I would do as a product manager if I were in charge of vscode-jupyter is to pause all current tasks and plan, with the team, a multiple-month effort to speed things up, and get cells to run effectively instantly (or as close to the amount of time the python processing of the code takes as possible), every time.

2) Jupyter for vscode sucks at inline documentation, the equivalent of shift+tab in vscode jupyter. I am aware of the existence of the trigger parameter hints and show hover settings in the keyboard shortcuts. These are extremely unreliable, and actually show documentation when I press the button maybe 1/5 of the time. When they do show documentation, there is a 'loading' tag for awhile. Browser jupyter, on the other hand, is immediate with this. Basically every time. Below is an example.

image

The other issue with inline documentation is that, as far as I can tell, hover documentation for methods on instantiated variables simply doesn't work. When I am using pandas, for instance, typing df.unique( and then pressing the show hover hotkey while my typing caret is to the right of the parenthesis pops up a documentation window saying exactly nothing. In contrast, in the web version, typing the same thing produces full documentation, as expected.

I don't understand how these two issues aren't your guys's top priority. Everyone I've spoken to who uses jupyter has had exactly the same experience as I have, and everyone I've spoken to who uses jupyter uses the web version exclusively for exactly these issues. Even Kaggle notebooks are better. I love copilot and it'd be great to bring it into my jupyter notebook experience, but it has just never been viable to switch if I don't want a workflow where I have to wait for 30 seconds every time I press command-enter, or I am frustratingly making a new cell above the current one and typing function? just to see documentation.

These issues have been ongoing since vscode jupyter started. They are the only things holding me and everyone else I've spoken to back from using it. Without fixing these issues, the whole thing is unusable, and no other features you guys put in matter. Why are you guys working on anything besides this when they are the only things anyone I know cares about?

I should note that this is all running in a docker container with access to 7 of my 8 cpus and 10gb of RAM. I am on a 2022 macbook air. I realize that this is a rant, so thank you for reading it. Nothing personal, I just think this product has a bunch of potential and I hate to see it unusable for so long.

joelostblom commented 8 months ago

I'm running into this as well and it is preventing me from continuing to work in VS Code. Thank you for trying to get to the bottom of the issue. In the meantime, is there any setting to toggle as a workaround to turn off the backup? I can only find "Autosave" which is already turned off. In JupyterLab I don't notice any slowdown at all for the same notebook.

joelostblom commented 7 months ago

Something else suggesting that it is indeed the backup taking time is that if I try to exit VS Code while a large slow notebook is open, I see this:

image

DonJayamanne commented 7 months ago

@loftusa @dschaub95 @joelostblom @LaBlazer @martinprad0 @hylkedonker @gdebrun2 @ale-dg @FlorinAndrei @JasonGross @jhancibo @yuuuxt @suiluj @sgaseretto @Animadversio

We have made some perf improvements and I can see significant gains; however, the scenario only applies to execution. Please can you

Please test this out and let me know how it goes. Note:

If you still run into perf issues with the latest pre-release and VS Code insiders, please share information on where you are experiencing the delays. Thank you for your patience and help in trying to get this addressed.

joelostblom commented 7 months ago

Exciting! Thank you for working on this @DonJayamanne !

Unfortunately, after installing insiders 1.89 (the update of today) and switching to the pre-release of the two extensions I seem to be running into an error of not being able to open or create new notebooks:

image

Version: 1.89.0-insider
Commit: 903ce35c77c7b5714a2d9063b9f1d9bc2956d07c
Date: 2024-04-11T05:50:57.416Z
Electron: 28.2.8
ElectronBuildId: 27744544
Chromium: 120.0.6099.291
Node.js: 18.18.2
V8: 12.0.267.19-electron.0
OS: Linux x64 6.8.0-76060800daily20240311-generic

Here are all the installed extensions and their version numbers:

image

ale-dg commented 7 months ago

@DonJayamanne just so I don't mess it up... regarding Jupyter, is it necessary to have ALL the Jupyter extensions or only the Jupyter and Renderers extensions? Can we test with any other extensions for theming? And regarding the settings, do we leave them "as is" (with the exception of the auto save) or do we disable anything else (i.e. auto completions from Jupyter)?

Thanks

Best

amunger commented 7 months ago

@joelostblom - does that error persist if you close all tabs and reload? Can you share the output from the console in Developer: toggle developer tools

rabyj commented 7 months ago

Just want to note that the slowness problem seems to have gotten worse since August 2023, as stated in a duplicate. Just to consolidate accumulated information.

There is a temporary fix also mentioned in that issue, namely downgrading the version.

The only solution for not having lag is to downgrade Jupyter to the August 2023 release (somehow it doesn't hang and the kernel rarely dies), and the most stable solution is to definitely downgrade everything to August, which makes everything go smoother for working.

The duplicate also has the notebook-intellisense and notebook-kernel tags, it might be pertinent to add them here too.

ale-dg commented 7 months ago

Just want to note that the slowness problem seems to have gotten worse since August 2023, as stated in a duplicate. Just to consolidate accumulated information.

There is a temporary fix also mentioned in that issue, namely downgrading the version.

The only solution for not having lag is to downgrade Jupyter to the August 2023 release (somehow it doesn't hang and the kernel rarely dies), and the most stable solution is to definitely downgrade everything to August, which makes everything go smoother for working.

The duplicate also has the notebook-intellisense and notebook-kernel tags, it might be pertinent to add them here too.

Yeah... that was a very lengthy discussion among the contributors of Pylance, Jupyter and myself, and indeed it gets better. But it's not like a fix though. It's a workaround should Jupyter get stubborn on not working. Although you could only downgrade both Pylance and Jupyter (Jupyter itself, Renderers and Pylance) to the August releases and it works as well. It also helps if you remove the other packages of Jupyter.

DonJayamanne commented 7 months ago

@rabyj @ale-dg I've re-opened the issue, let's continue the discussion there as you seem to be running into issues with completions

ale-dg commented 7 months ago

@rabyj @ale-dg I've re-opened the issue, let's continue the discussion there as you seem to be running into issues with completions

I haven't got a chance to try the new solution... I was just giving a bit of feedback on the other one.

ale-dg commented 7 months ago

Hi @DonJayamanne , @amunger

I have just tested with the largest notebook I have, which includes a lot of markdown cells, and it indeed runs faster. Some functions still take time to execute, but I guess that's just inherent to the libraries (it would be necessary for someone else to confirm - sns.regplot and clustering functions). Also I was monitoring the use of the CPU in the MacOS' Activity Monitor and I noticed it now barely goes above 500 MB.

Regarding the issue with the completions, I guess these were solved, as there was no lag nor any problems with them after running all the cells (around 180 code cells alone), whereas before it would begin to get stuck after 70 or so.

The only caveat I would add is it was done only with these packages active and no changes to the settings.json:

| Extension | Author (truncated) | Version |
| --- | --- | --- |
| python | ms- | 2024.5.11021008 |
| vscode-pylance | ms- | 2024.4.101 |
| jupyter | ms- | 2024.4.2024041101 |
| jupyter-renderers | ms- | 1.0.17 |

So, my suggestion would be to begin adding our normal extensions to see if any of those would choke the improvement, since most likely we all work with different ones. In my normal VsCode I run with these and several changes to the settings.json (for font, font size, ligatures, colours, conda path, semantic highlight, tree views, etc):

| Extension | Author (truncated) | Version |
| --- | --- | --- |
| catppuccin-vsc | Cat | 3.13.0 |
| catppuccin-vsc-icons | Cat | 1.11.0 |
| catppuccin-vsc-pack | cat | 1.0.2 |
| vscode-pull-request-github | Git | 0.86.1 |
| rainbow-csv | mec | 3.11.0 |
| black-formatter | ms- | 2024.2.0 |
| debugpy | ms- | 2024.4.0 |
| python | ms- | 2024.4.1 |
| vscode-pylance | ms- | 2024.4.101 |
| jupyter | ms- | 2024.3.1 |
| jupyter-renderers | ms- | 1.0.17 |
| sqltools | mtx | 0.28.1 |
| sqltools-driver-pg | mtx | 0.5.2 |
| material-icon-theme | PKi | 4.34.0 |

(2 theme extensions excluded)

Best

ale-dg commented 7 months ago

Hi,

Just for checking (and to actually simulate a normal working scenario), I have loaded all the extensions (except for the sql tools) and the lag in typing began to happen after 156 code cells. Check the video below. The execution time was the same, but after that this lag happens.

@DonJayamanne I have decided to post it here instead of the other issue to see if someone else runs into the same issue, since originally you had closed it to merge it with this.

Best,

https://github.com/microsoft/vscode-jupyter/assets/106413328/9a881459-56ad-47c8-aea3-0cc5b070bb63

amunger commented 7 months ago

How large is the notebook file? One thing I've added is the ability to use the optimized save operation that was introduced for remote. You can try it out with the settings:

"files.autoSave": "afterDelay",
"notebook.experimental.remoteSave": true

With those, saves should not happen on the renderer process, and backups shouldn't occur because auto-save is enabled (for saved files).

ale-dg commented 7 months ago

How large is the notebook file? One thing I've added is the ability to use the optimized save operation that was introduced for remote. You can try it out with the settings:

"files.autoSave": "afterDelay",
"notebook.experimental.remoteSave": true

With those, saves should not happen on the renderer process, and backups shouldn't occur because auto-save is enabled (for saved files).

It's 33.9 MB. I am testing now without the theming... I am guessing the issue is there since I have noticed it takes its time to load the colours

Best

ale-dg commented 7 months ago

Well... apparently my hypothesis was true... it goes faster, not THAT much, but now as you can see in the video below it reacts much faster even though I made a typo

Best

https://github.com/microsoft/vscode-jupyter/assets/106413328/34636747-176e-40a8-8ff7-573c32d61890

ale-dg commented 7 months ago

@amunger I have tried with the settings you mentioned above and it happens with the same lag as the video above. So, there is SOME lag, although not as the first video I shared.

Best

DonJayamanne commented 7 months ago

@ale-dg please can you share the 2 CPU profiles? Instructions are on the other issue. Thanks

ale-dg commented 7 months ago

@ale-dg please can you share the 2 CPU profiles? Instructions are on the other issue. Thanks

I am getting this issue when trying to save it:

Failed to save timeline: The request is not allowed by the user agent or the platform in the current context. (NotAllowedError)

EDIT: As always... I was doing something wrong... give me a moment and I'll get back to you

joelostblom commented 7 months ago

@joelostblom - does that error persist if you close all tabs and reload? Can you share the output from the console in Developer: toggle developer tools

Thanks @amunger ! I thought I had restarted VS Code but it turns out there was a window open on another desktop and closing that fixed it. So far I'm noticing much better performance on notebooks with large interactive charts (Altair/Vega charts), thanks for all your work on this issue! I will report back when I test it more with larger and longer-running notebooks if I run into issues.

ale-dg commented 7 months ago

@ale-dg please can you share the 2 CPU profiles? Instructions are on the other issue. Thanks

@DonJayamanne attached the profiles. Thank you and all the VsCode team for all the support and hard work!

Best

P.S. It was done WITH ALL the extensions installed (as in the first video I shared), except with the sql tools.

Archive.zip

ale-dg commented 7 months ago

@amunger @DonJayamanne @rebornix regarding the Plotly graphs, when working with a notebook and executing several times the cells (i.e. when changing colours or adjusting elements or so), the graphs begin to render REALLY slow and, sometimes, they do not render at all. Also the notebooks keep being really large. See attached, both for seeing them not rendering and the size of the file once decompressed (around 94MB).

Yet again, this is done in the Insiders' version with the latest updates as of yesterday.

Best

spotify.ipynb.zip

ale-dg commented 7 months ago

@amunger, @DonJayamanne, @rebornix

Keeping up with the Plotly graphs within VSCode Insiders and all the pre-releases, I did a Kaggle notebook with a lot of them (you can find it here - an upvote is really appreciated, if possible). When downloading the code, its size is only 52MB, and the same code in VSCode (with more or less the same imports) has a size of 94.1 MB. At the end of the post I attach both notebooks.
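A rough way to see where a size gap like the one above comes from is to total the stored outputs per cell and per MIME type in each file. The following is a minimal sketch, assuming both notebooks use the nbformat 4 layout (a top-level `cells` list); the helper names `output_sizes` and `mime_sizes` are made up for illustration, and the byte counts are approximations obtained by re-serializing the outputs, not exact on-disk sizes.

```python
import json
from pathlib import Path


def _load_cells(notebook_path):
    """Read an .ipynb file and return its list of cells (nbformat 4 layout assumed)."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    return notebook.get("cells", [])


def output_sizes(notebook_path):
    """Return (cell_index, cell_type, approx_output_bytes) tuples, largest first."""
    sizes = []
    for index, cell in enumerate(_load_cells(notebook_path)):
        outputs = cell.get("outputs", [])
        # Re-serialize the outputs to approximate how much of the file they occupy.
        size = len(json.dumps(outputs).encode("utf-8")) if outputs else 0
        sizes.append((index, cell.get("cell_type", "?"), size))
    return sorted(sizes, key=lambda item: item[2], reverse=True)


def mime_sizes(notebook_path):
    """Return a dict mapping each output MIME type to its approximate total bytes."""
    totals = {}
    for cell in _load_cells(notebook_path):
        for output in cell.get("outputs", []):
            # Only display_data / execute_result outputs carry a "data" bundle;
            # stream and error outputs are skipped here.
            for mime, payload in output.get("data", {}).items():
                size = len(json.dumps(payload).encode("utf-8"))
                totals[mime] = totals.get(mime, 0) + size
    return totals
```

Calling `mime_sizes(...)` on each of the two notebooks and comparing the resulting dictionaries should show whether one file stores an extra representation of each chart (for example both an HTML and a JSON form) or simply larger payloads of the same type.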

Additionally to the above, when running the entire notebook (meaning clicking "Run All") with the Plotly graphs in interactive mode (where you can hover over the data and so), the graphs do not render, although the code runs amazingly fast. It is necessary to execute cell by cell to have the graphs rendered properly (although some of them throw an error). This didn't happen on Kaggle, where I executed all the cells and the graphs rendered normally.

Also something very strange is that if I try to open the Kaggle's notebook in VSCode, I get this error:

Screenshot 2024-04-14 at 17 04 26

My apologies in advance for flooding the comments over the weekend 😅

Best

kaggle_code.ipynb.zip

vscode_code.ipynb.zip

amunger commented 7 months ago

Thanks for the extra info and repro notebooks @ale-dg, that sounds like something different than what I'm trying to solve here, so I'll split it out into another issue.

ale-dg commented 7 months ago

Thanks for the extra info and repro notebooks @ale-dg, that sounds like something different than what I'm trying to solve here, so I'll split it out into another issue.

@amunger no problem. Thanks for checking it.

Regarding the large notebooks (and the auto-completions), I've found these issues raised in the Pylance repo which I think might be related to the problems, should you want to give them a look: https://github.com/microsoft/vscode/issues/210528

yuuuxt commented 7 months ago

No one has mentioned the rg.exe issue yet, but for me it seems that this time it's related to a background rg.exe task (in Windows, open Task Manager, select vscode and expand to find it). Details in #15572

UPDATE - seems unrelated.

bhvieira commented 7 months ago

This issue is making VSCode with Jupyter basically unworkable for me. It used to not be like this however, wonder when it changed.

ale-dg commented 7 months ago

Try the insiders' version. It has been working smoothly for me the last days

bhvieira commented 7 months ago

Thanks for the tip @ale-dg . It helps a little, but does not solve the issue. I still routinely have to wait 5+ seconds in some cells before execution. This gets quite annoying when doing data analysis, which requires frequent cell execution

During these hiccups I also can't save the file, don't get Copilot suggestions or code completion and can't auto format code.
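One way to tell whether waits like the ones described above are spent inside the kernel or before execution even starts is to time the cell body directly and compare that with the duration the UI reports. This is only a sketch using the standard library; `kernel_timer` is a hypothetical helper name, not part of Jupyter or the extension.

```python
import time
from contextlib import contextmanager


@contextmanager
def kernel_timer(label="cell"):
    """Measure the wall-clock time spent executing the enclosed code."""
    result = {}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Record and report the elapsed time even if the body raised.
        result["seconds"] = time.perf_counter() - start
        print(f"{label}: {result['seconds']:.3f}s inside the kernel")


# Example: wrap the body of a cell that feels slow.
with kernel_timer("assign string") as timing:
    greeting = "hello"
```

If the printed time is a few milliseconds while the cell appears to take several seconds, the delay is most likely happening outside the Python code itself (for example in the editor or another extension), which is the kind of case the CPU profiles requested in this thread are meant to capture.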

DonJayamanne commented 7 months ago

@bhvieira

Please can you replicate the issue and share the 3 CPU profiles. My suggestion would be to capture the 3 CPU profiles individually (reloading VS Code between each capture while replicating the issue).

It was with these logs, which were provided by ale-dg, that we were able to identify and resolve some of the perf issues.

ale-dg commented 7 months ago

Thanks for the tip @ale-dg . It helps a little, but does not solve the issue. I still routinely have to wait 5+ seconds in some cells before execution. This gets quite annoying when doing data analysis, which requires frequent cell execution

During these hiccups I also can't save the file, don't get Copilot suggestions or code completion and can't auto format code.

I can't help you with Copilot because I don't use it. But for the rest try these settings in your settings.json, as they have helped me a bit (at least in both VSCode Insiders and Release):

"files.autoSave": "off",
"notebook.experimental.remoteSave": true,
"notebook.formatOnSave.enabled": true,
"notebook.formatOnCellExecution": false,
"[python]": {
    "editor.formatOnSave": true
},
"python.languageServer": "Pylance",

It won't format automatically when you execute the cell, but it will indeed do it when you save it. Also, it won't be saving the notebook automatically and you will have to do so manually, but when you do, it will format it. I have noticed it does give a bit of an edge in performance doing it this way.

I am not sure which hiccups you are referring to. Is it a delay in the inputs, as in this video https://github.com/microsoft/vscode-jupyter/issues/14459#issuecomment-2050850128? Because if it is that, I can tell you it has been improving in Insiders. Also I just loaded a 2.5 GB file into a notebook and it took 40s to load, which I don't think is that bad... I haven't done any graphs or so, but before it took like 10 minutes just for loading.

Hope this helps, otherwise you'll need to do the profiles so @DonJayamanne can track what is happening with the issue you are presenting.

Best

DonJayamanne commented 7 months ago

Also I just loaded a 2.5 GB file into a notebook and it took 40s to load, which I don't think it's that bad... I haven't done any graphs or so, but before it took like 10 minutes just for loading.

@ale-dg what did this notebook have that made it so large? Were there some image outputs? I would like to ensure we test with such large notebooks, but would like to get the content right. I.e. ensure we have the same types of outputs as you have (to have a more realistic dataset).

As always, thank you.

ale-dg commented 7 months ago

@ale-dg what did this notebook have that made it so large? Were there some image outputs? I would like to ensure we test with such large notebooks, but would like to get the content right. I.e. ensure we have the same types of outputs as you have (to have a more realistic dataset).

As always, thank you.

@DonJayamanne what I meant is that I loaded a csv of 2.5 GB into a notebook (or a Pandas data frame If you'd like). This file is so large because it has over 25 million lines and 10 columns, so a little over 250 million data points. I haven't made anything yet with the data, but so far the notebook is 80 kb (not sure why....)
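The small file size is expected: an .ipynb file records each cell's source and whatever the cell displayed, while variables such as a loaded data frame live only in the kernel's memory. The sketch below illustrates this with a stand-in list instead of the actual 2.5 GB CSV; the notebook dictionary is a hand-written, minimal nbformat-4-style structure used purely for illustration.

```python
import json
import sys

# Stand-in for a large dataset loaded by a cell; it exists only in kernel memory.
rows = [float(i) for i in range(1_000_000)]

# Minimal nbformat-4-style notebook: it stores the cell's source and outputs,
# never the variables the kernel is holding.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["rows = [float(i) for i in range(1_000_000)]"],
            "outputs": [],
        }
    ],
}

# getsizeof counts only the list container, so the real footprint is larger.
in_memory_bytes = sys.getsizeof(rows)
on_disk_bytes = len(json.dumps(notebook).encode("utf-8"))

print(f"in memory: at least {in_memory_bytes:,} bytes")
print(f"notebook JSON: {on_disk_bytes:,} bytes")
```

The notebook only grows once cells display something (tables, plots, long text), because those outputs are what gets written to the file.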

Best

bhvieira commented 7 months ago

@bhvieira

* Please profile the  extension host when you run into the perf issue, see here https://github.com/microsoft/vscode/wiki/Performance-Issues#profile-the-running-extensions

* Similarly please profile the vscode renderer process as described here https://github.com/microsoft/vscode/wiki/Performance-Issues#profiling-the-renderer-process

* Finally please profile this as well https://github.com/microsoft/vscode/wiki/Performance-Issues#visual-studio-code-is-sluggish

Please can you replicate the issue and share the 3 CPU profiles. My suggestion would be to capture the 3 CPU profiles individually (reloading VS Code between each capture while replicating the issue).

It was with these logs, which were provided by ale-dg, that we were able to identify and resolve some of the perf issues.

I can try it when I have some time. In the meantime, I rolled back to v2024.2.0 and the issues disappeared.

DonJayamanne commented 7 months ago

Thank you, please do share the cpu logs, that will help us identify the issues

thinkmachine2023 commented 7 months ago

I cleaned up my VSCode extensions, reducing them from 70s to around 30s, and the VSCode Jupyter Notebook has regained its previous speed, with no more delays.

tlkaufmann commented 7 months ago

Hey all,

I have the same issues (i.e. Jupyter becomes super sluggish after some time). I am currently using the most recent Insider version (1.89.0-insider) with the following extensions: Jupyter (v2024.4.2024042202), Pylance (v2024.4.1) and Python (v2024.4.1)

Attached are the CPU logs, all I am doing is executing a single cell but VSCode takes ~2 seconds between me pressing the button and the cell starting to execute. Trace-20240423T182106.json CPU-20240423T161247.457Z.cpuprofile.txt

As for the renderer process I couldn't figure out how to create the log as my dropdown menu did not have the point "JavaScript Profiler". See screenshot below Screenshot 2024-04-23 at 18 29 24

Thanks for working on this issue @DonJayamanne, I appreciate any help :)

amunger commented 7 months ago

@tlkaufmann - I don't think the extensions profile worked, it should save as a .cpuprofile.txt like the other one did.

That CPU profile does have some good hints though with all those PromiseRejectHandlers, and I actually see a similar stack trace with a large notebook that I'm testing (though not nearly as many as you), so hopefully I can repro the same thing while debugging to find where these are coming from.

[Extension Host] Error: timeout
    at Timeout.p (c:\Users\aamunger\.vscode-insiders\extensions\ms-toolsai.jupyter-2024.3.1\dist\extension.node.js:55:1754)
    at listOnTimeout (node:internal/timers:569:17)
    at processTimers (node:internal/timers:512:7) 

image

DonJayamanne commented 7 months ago

@tlkaufmann the Performance monitor is the right option, please use that menu option. Eager to get my hands on the logs.

tlkaufmann commented 7 months ago

@DonJayamanne and @amunger , thanks a ton for your help! I repeated the processes. Hope this time it worked.

Here's the output from Show Running Extensions: CPU-20240424T100815.117Z.cpuprofile.txt

And here's the output from the Performance monitor: Trace-20240424T121235.json @amunger, I wouldn't know how to export this as a .cpuprofile.txt tbh, the only option I see is the save profile button which creates the attached json file (see screenshot below).

Screenshot 2024-04-24 at 12 14 55

DonJayamanne commented 7 months ago

@tlkaufmann The second json is empty. Also what extensions do you have installed? There are two methods getNeighborFiles & detectCellLanguage that get invoked and I'm not sure what extensions these are coming from. Please can you share the list of the extensions you have installed.

If saving is an issue, then please go to the bottom of the profile view and select the Bottom Up tab and sort the list by Self Time and Total Time as below and send the screenshots.

Thank you for your patience and help,

I'd like to see the top items in the sorted list along with the names and file paths. Screenshot 2024-04-24 at 21 24 40

maximilianmordig commented 7 months ago

I am also experiencing very bad performance with Jupyter notebooks with the latest release. Whenever I execute a cell such as 1+1, it lags. If I execute the cell again without changing the content, it is blazing fast as it should be.

I have attached the two logs. Trace-20240424T134810.json CPU-20240424T113715.203Z.cpuprofile

Actually, the performance file is empty because the javascript fails with Failed to save timeline: The request is not allowed by the user agent or the platform in the current context. (NotAllowedError). In the meantime, I attach a screenshot, so it seems to be related to the Animation. Screenshot 2024-04-24 at 13 53 26 image

tlkaufmann commented 7 months ago

@DonJayamanne

Somehow I can't save the performance profiles. The error I get is Failed to save timeline: The request is not allowed by the user agent or the platform in the current context. (NotAllowedError), similar to what @maximilianmordig and @ale-dg reported. So here are the screenshots:

I zoomed in right at the start, when I execute the cell. Then there's a delay for like 2 seconds and it gets executed.

Screenshot 2024-04-24 at 14 00 31 Screenshot 2024-04-24 at 14 00 54

Also here are all the active extensions (note that the problem persists also with Copilot disabled): Screenshot 2024-04-24 at 14 03 42

tlkaufmann commented 7 months ago

I have also noticed an interesting pattern. As @maximilianmordig states, a cell that has previously been run, does not experience the delay but newly created cells do.

Now interestingly, if I copy and paste any code into a new cell and run it, it also does not experience the delay but if I type out the same code and execute right away, the delay comes back. Now finally, if I type out the code, then wait 2 seconds and run the cell, there is also no delay.

I don't know what to make of this but I hope it's somehow helpful.

ale-dg commented 7 months ago

@tlkaufmann if I remember correctly, you need to activate the option to save the JavaScript profiles from the Options tab (the cog you see there in the screenshot).

Also, I am curious about what is happening because yesterday I was working with a heavy file and doing some Plotly graphs and I didn't have any issue. Maybe if you could share what you were doing it could give an idea of what was happening.

@DonJayamanne @amunger the most common thing I'm noticing is Copilot. Would it be a thing to consider?

Best

amunger commented 7 months ago

@DonJayamanne those getNeighborFiles & detectCellLanguage calls are from copilot. This sounds like pretty similar behavior to what I was seeing with copilot trying to gather all that context from a large notebook https://github.com/microsoft/vscode/issues/211154

@rebornix was looking into reducing those calls at a certain point

DonJayamanne commented 7 months ago

@tlkaufmann Please can you disable all extensions except Jupyter, Python and Notebook Renderers and test the performance issues once again. Also, please test against VS Code insiders and the latest pre-release versions of Jupyter and Python extensions. Let me know if that makes a difference; looking at the perf logs, it should be better in VS Code insiders without other extensions. I.e. the delays you are seeing are caused by other extensions (we will look into copilot separately).

maximilianmordig commented 7 months ago

Since I disabled completions for copilot, it is fast (at least for now).

rebornix commented 7 months ago

@maximilianmordig to double check, you disabled the copilot completions via github.copilot.editor.enableAutoCompletions, right?

ydastro commented 7 months ago

I disabled the copilot and Pylance by just uninstalling the add-in. And I also find that it is fast now.

DonJayamanne commented 7 months ago

@ydastro please can you reinstall pylance and see if that still makes things slow