microsoft / vscode

Visual Studio Code
https://code.visualstudio.com
MIT License
164.86k stars 29.51k forks source link

Zombie extension hosts running at 100% #194477

Closed jrieken closed 1 year ago

jrieken commented 1 year ago
Screenshot 2023-09-29 at 13 31 34
deepak1556 commented 1 year ago

Can you take a process sample from the OS activity monitor, I hope the native stack can give a clue to what is happening.

jrieken commented 1 year ago

Just happening. This is the command line of such an EH

99701   ??  R    299:41.47 /Applications/Visual Studio Code - Insiders.app/Contents/Frameworks/Code - Insiders Helper (Plugin).app/Contents/MacOS/Code - Insiders Helper (Plugin) --type=utility --utility-sub-type=node.mojom.NodeService --lang=en-GB --service-sandbox-type=none --dns-result-order=ipv4first --inspect-port=0 --user-data-dir=/Users/jrieken/Library/Application Support/Code - Insiders --standard-schemes=vscode-webview,vscode-file --enable-sandbox --secure-schemes=vscode-webview,vscode-file --bypasscsp-schemes --cors-schemes=vscode-webview,vscode-file --fetch-schemes=vscode-webview,vscode-file --service-worker-schemes=vscode-webview --streaming-schemes --shared-files --field-trial-handle=1718379636,r,8003835085444679389,18150943907601750289,262144 --disable-features=CalculateNativeWinOcclusion,SpareRendererForSitePerProcess
99704   ??  S      0:04.19 /Applications/Visual Studio Code - Insiders.app/Contents/Frameworks/Code - Insiders Helper (Plugin).app/Contents/MacOS/Code - Insiders Helper (Plugin) --ms-enable-electron-run-as-node /Users/jrieken/.vscode-insiders/extensions/kisstkondoros.vscode-codemetrics-1.26.1/dist/server.js --node-ipc --clientProcessId=99701
99706   ??  S      0:03.39 /Applications/Visual Studio Code - Insiders.app/Contents/Frameworks/Code - Insiders Helper (Plugin).app/Contents/MacOS/Code - Insiders Helper (Plugin) --ms-enable-electron-run-as-node /Users/jrieken/.vscode-insiders/extensions/kisstkondoros.vscode-gutter-preview-0.30.0/dist/server.js --node-ipc --clientProcessId=99701
99750   ??  S      0:01.17 /Applications/Visual Studio Code - Insiders.app/Contents/Frameworks/Code - Insiders Helper (Plugin).app/Contents/MacOS/Code - Insiders Helper (Plugin) --ms-enable-electron-run-as-node /Users/jrieken/.vscode-insiders/extensions/ms-vscode.anycode-0.0.72/dist/anycode.server.node.js --node-ipc --clientProcessId=99701
16791 s034  S+     0:00.00 grep 99701
jrieken commented 1 year ago

@deepak1556 This the process sample EH-spin-sample.txt

jrieken commented 1 year ago

@connor4312 Not sure about the --inspect-port=0 part. Does this mean we run the EH under development or not?

connor4312 commented 1 year ago

It looks like we set that for all local EH process launches. Just having inspect-port doesn't put the process in debug mode.

What you can do is use the "Debug: Attach to Node.js Process" command and select that process. This will fail, since the debugger won't then know what port it's listening on (if --inspect-port isn't given in the args, it always listens on 9229, and js-debug assumes that is what happens). But, then you can find it. AFter doing so you could grab a performance profile using the edge/chrome devtools on edge://inspect, or by attaching to that using an attach-type launch config VS Code and using the "take performance profile" command -- if the spin sample you recorded above isn't enough for Deepak to figure out what's going on.

jrieken commented 1 year ago

What you can do is use the "Debug: Attach to Node.js Process" command and select that process.

Does that use SIGUSR1? Asking cos I have tried that first and sending the signal kills the process. From the crash dumps we do know that the process already was shutting down

deepak1556 commented 1 year ago

Attaching the symbolicated sample EH-spin-sample-symbolicated.txt

Going from the top, electron::NodeService::~NodeService() indicates the EH process is already in the shutdown phase and has started to destroy the Node.js environment node::FreeEnvironment(node::Environment*) . At this point, connecting via a debugger is not possible since the agent host will stop receiving connections due to the following check, so we cannot get perf profile from these processes.

Moving further down most of the sample time is spent in node::NodePlatform::DrainTasks which happens before the environment gets deleted, this flushes the pending tasks in the libuv event loop but one thing to note is that entering JS via these tasks is not allowed due to the scope. But there are GC tasks being executed v8::internal::FinalizationRegistryCleanupTask::RunInternal and v8::internal::Heap::CollectGarbage which can enter JS. I think the error calls node::errors::TriggerUncaughtException are from these callsites eventually leading to node::OnFatalError. At this point, I expect the process to abort and the main process should be able to reap the utility process but that is not happening.

There is another data point looking back at the screenshot from https://github.com/microsoft/vscode/issues/194477#issue-1919108667, the zombie EH processes are titled utility-process and not extensionHost in the process explorer. We resolve the pids to respective titles based on a map maintained in the main process, the process explorer gets it periodically via the following ipc and the utility process related values are populated from here. We default to utility-process when the pid map no longer has an entry for the process which can only happen when onDidExitOrCrashOrKill got invoked which inturn suggests that the main process did receive an exit or crash signal from these EH hosts. @jrieken can you check the main process logs next time if it has any of the following entries for the corresponding zombie EH pid to validate this data

deepak1556 commented 1 year ago

I attempted to setup a similar flow based on the stack with base Electron app but the utility process are cleaned up as expected. I can make a change in the destruction phase of the utility process to signal termination of isolate so that tasks which require to enter JS will not do so and I hope that resolves the long running draintask.

I won't be able to make a new insiders build with this change until I am back from vacation but I can provide a custom electron build for local testing with OSS. @jrieken would that help to validate incase you face similar issue with OSS as well ?

connor4312 commented 1 year ago

From the crash dumps we do know that the process already was shutting down

Ah, my bad, skimmed over that part. Disregard.

(Yes, it sends SIGUSR1 on unix)

jrieken commented 1 year ago

2023-10-06 10:26:53.084 [info] Extension host with pid 19510 exited with code: 0, signal: unknown.

I am seeing this for PID-19510 which is currently spinning forever. Samples for this one EH-19510.txt

nzhl commented 1 year ago

i think i have the same issue on my mac. It happens from time to time and my workaround is simply killing vscode and then reopen it. But sometimes i don't realize it until my Mac has run out of battery, which can be quite annoying.

CleanShot 2023-10-09 at 19 46 49@2x

CleanShot 2023-10-09 at 19 59 29@2x

ClockRide commented 1 year ago

Having the exact same issue. utility-process is running at 100% all the time on M1 Max, very annoying.

taoky commented 1 year ago

This bug might also exist in Linux build of vscode:

image

(6% * 16 cores = 96%)

Command line of "utility-process" is:

/opt/visual-studio-code/code --type=utility --utility-sub-type=node.mojom.NodeService --lang=en-US --service-sandbox-type=none --dns-result-order=ipv4first --inspect-port=0 --enable-crash-reporter=761fa1b3-23e9-422b-8845-dd1487654753,no_channel --user-data-dir=/home/taoky/.config/Code --standard-schemes=vscode-webview,vscode-file --enable-sandbox --secure-schemes=vscode-webview,vscode-file --bypasscsp-schemes --cors-schemes=vscode-webview,vscode-file --fetch-schemes=vscode-webview,vscode-file --service-worker-schemes=vscode-webview --streaming-schemes --shared-files=v8_context_snapshot_data:100 --field-trial-handle=0,i,15025945271492190020,16861348579807269700,262144 --disable-features=CalculateNativeWinOcclusion,SpareRendererForSitePerProcess

eu-stack of main thread of this process:

> sudo eu-stack --pid=850692 -m -1
TID 850692:
#0  0x00007fa28842de01 __pthread_mutex_lock@GLIBC_2.2.5 - /usr/lib/libc.so.6
#1  0x0000561c3ca4e189 uv_mutex_lock - /opt/visual-studio-code/code
#2  0x0000561c43774136 - /opt/visual-studio-code/code
#3  0x0000561c43775816 - /opt/visual-studio-code/code
#4  0x0000561c436486e9 node::FreeEnvironment(node::Environment*) - /opt/visual-studio-code/code
#5  0x0000561c3cb3e2a5 - /opt/visual-studio-code/code
#6  0x0000561c3cbdceac - /opt/visual-studio-code/code
#7  0x0000561c3cbdcf1e - /opt/visual-studio-code/code
#8  0x0000561c3cbded0a - /opt/visual-studio-code/code
#9  0x0000561c3ee41dd9 - /opt/visual-studio-code/code
#10 0x0000561c3fd177aa - /opt/visual-studio-code/code
#11 0x0000561c3eaf2bef - /opt/visual-studio-code/code
#12 0x0000561c3fd179a5 - /opt/visual-studio-code/code
#13 0x0000561c3fd23fc9 - /opt/visual-studio-code/code
#14 0x0000561c3f9c68ed - /opt/visual-studio-code/code
#15 0x0000561c3f9e4a95 - /opt/visual-studio-code/code
#16 0x0000561c3f9e44a9 - /opt/visual-studio-code/code
#17 0x0000561c3f9e51f5 - /opt/visual-studio-code/code
#18 0x0000561c3f983a6e - /opt/visual-studio-code/code
#19 0x0000561c3f9e555c - /opt/visual-studio-code/code
#20 0x0000561c3f9a7a61 - /opt/visual-studio-code/code
#21 0x0000561c3f6d93f4 - /opt/visual-studio-code/code
#22 0x0000561c3cd06ffd - /opt/visual-studio-code/code
#23 0x0000561c3cd08218 - /opt/visual-studio-code/code
#24 0x0000561c3cd05c65 - /opt/visual-studio-code/code
#25 0x0000561c3cd05d55 - /opt/visual-studio-code/code
#26 0x0000561c3ca53c0d - /opt/visual-studio-code/code
#27 0x00007fa2883c5cd0 __libc_start_call_main - /usr/lib/libc.so.6
#28 0x00007fa2883c5d8a __libc_start_main@@GLIBC_2.34 - /usr/lib/libc.so.6
#29 0x0000561c3c6b202a _start - /opt/visual-studio-code/code

Noticed that node::FreeEnvironment(node::Environment*) is also inside @jrieken's sampling.

taoky commented 1 year ago

output.txt

Manually sampling with eu-stack running every second, for one minute.

(And might related to https://github.com/nodejs/node/issues/47748 and https://github.com/nodejs/node/pull/47452)

sebdanielsson commented 1 year ago

Sorry if I'm hijacking the thread but would uninstalling Node and installing Bun solve this problem? Doesn't sound like it's a VSCode bug, right?

deepak1556 commented 1 year ago

Thanks for the logs @taoky , the stack does look related. I wouldn't be surprised if the issue is also present on Linux since the shutdown path is same for both.

Maktel commented 1 year ago

Similar issue on my M1 Max, sometimes I observe even 3/4 zombie processes running in the background and have to kill them manually from process manager to stop them from draining battery.

2 more samples for such process: sample-1.txt sample-2.txt

taoky commented 1 year ago

Checked log and it looks weird.

/home/username/.config/Code/logs/20231022T205712/main.log:

2023-10-24 03:44:26.258 [info] Extension host with pid 1331811 exited with code: 0, signal: unknown.

However, PID 1331811 still exists and is running at 100%.

taoky commented 1 year ago

Checked log and it looks weird.

Another hang process today with log:

> rg 1706525
20231022T205712/window9/exthost/exthost.log
1:2023-10-24 15:40:45.945 [info] Extension host with pid 1706525 started
326:2023-10-24 16:00:07.408 [info] Extension host with pid 1706525 exiting with code 0

20231022T205712/main.log
17:2023-10-24 16:00:07.409 [info] Extension host with pid 1706525 exited with code: 0, signal: unknown.

"Extension host with pid 1706525 started" => https://github.com/microsoft/vscode/blob/a8b7239d0311d4915b57c837972baf4b01394491/src/vs/workbench/api/common/extensionHostMain.ts#L171

"Extension host with pid 1706525 exiting with code 0" => https://github.com/microsoft/vscode/blob/a8b7239d0311d4915b57c837972baf4b01394491/src/vs/workbench/api/common/extHostExtensionService.ts#L271

"Extension host with pid 1706525 exited with code: 0, signal: unknown." => https://github.com/microsoft/vscode/blob/a8b7239d0311d4915b57c837972baf4b01394491/src/vs/platform/extensions/electron-main/extensionHostStarter.ts#L76

fenvalle commented 1 year ago

Is there any temp workaround? any script or bash command that we can run to identify and kill those pid? I am interested to do it manually, because i have a M2 and i am rebooting my macbook after some hours because get soooo slow

NullEnt1ty commented 1 year ago

Is there any temp workaround? any script or bash command that we can run to identify and kill those pid? I am interested to do it manually, because i have a M2 and i am rebooting my macbook after some hours because get soooo slow

Restarting VS Code should already "fix" the problem because the extension host is a child process of the application. At least that does the job for me. Are you observing a different behavior?

Note that "Reload Window" does not kill the zombie process (according to my observation).

jrieken commented 1 year ago

@NullEnt1ty @fenvalle @taoky @Maktel @ClockRide @nzhl

Please help by posting the list of installed extensions. I am looking for similarities between our setups as this is possibly caused by an extension.

This is my list

amodio.tsl-problem-matcher
chrisdias.vscode-opennewinstance
connor4312.css-theme-completions
connor4312.esbuild-problem-matchers
dbaeumer.vscode-eslint
dtsvet.vscode-wasm
eamodio.gitlens
esbenp.prettier-vscode
GitHub.codespaces
GitHub.copilot
GitHub.copilot-chat
GitHub.github-vscode-theme
GitHub.remotehub
GitHub.vscode-pull-request-github
ipatalas.vscode-postfix-ts
jrieken.browser-links
jrieken.esm-helper
jrieken.references-plusplus
jrieken.regex-notebook
jrieken.vscode-pr-pinger
jrieken.vscode-tree-sitter-query
kisstkondoros.vscode-codemetrics
kisstkondoros.vscode-gutter-preview
meganrogge.template-string-converter
ms-python.python
ms-python.vscode-pylance
ms-vscode-remote.remote-containers
ms-vscode-remote.remote-ssh
ms-vscode-remote.remote-ssh-edit
ms-vscode.anycode
ms-vscode.anycode-c-sharp
ms-vscode.anycode-cpp
ms-vscode.anycode-go
ms-vscode.anycode-java
ms-vscode.anycode-kotlin
ms-vscode.anycode-php
ms-vscode.anycode-python
ms-vscode.anycode-rust
ms-vscode.anycode-typescript
ms-vscode.hexeditor
ms-vscode.remote-explorer
ms-vscode.remote-repositories
ms-vscode.remote-server
ms-vscode.vscode-github-issue-notebooks
ms-vscode.vscode-markdown-notebook
ms-vscode.vscode-selfhost-test-provider
ms-vscode.vscode-typescript-next
rust-lang.rust-analyzer
wayou.vscode-todo-highlight

To get this list run code --list-extensions or code-insiders --list-extensions. Please do the same and post your results here. Thanks

taoky commented 1 year ago
> code --list-extensions
aaronduino.chisel
bbenoist.Nix
coolbear.systemd-unit-file
cschlosser.doxdocgen
eamodio.gitlens
esbenp.prettier-vscode
GitHub.copilot
GitHub.copilot-chat
GitHub.copilot-labs
golang.go
haskell.haskell
James-Yu.latex-workshop
jeff-hykin.better-cpp-syntax
justusadam.language-haskell
kokororin.vscode-phpfmt
llvm-vs-code-extensions.vscode-clangd
llvm-vs-code-extensions.vscode-mlir
luniclynx.bison
luniclynx.lex
ms-azuretools.vscode-docker
ms-python.isort
ms-python.python
ms-python.vscode-pylance
ms-toolsai.jupyter
ms-toolsai.jupyter-keymap
ms-toolsai.jupyter-renderers
ms-toolsai.vscode-jupyter-cell-tags
ms-toolsai.vscode-jupyter-slideshow
ms-vscode-remote.remote-ssh
ms-vscode-remote.remote-ssh-edit
ms-vscode.cmake-tools
ms-vscode.cpptools
ms-vscode.cpptools-extension-pack
ms-vscode.cpptools-themes
ms-vscode.hexeditor
ms-vscode.live-server
ms-vscode.remote-explorer
ms-vsliveshare.vsliveshare
ms-vsliveshare.vsliveshare-pack
nvarner.typst-lsp
prince781.vala
rust-lang.rust-analyzer
sswg.swift-lang
twxs.cmake
vadimcn.vscode-lldb
Vue.volar
wholroyd.jinja
markusylisiurunen commented 1 year ago

Thanks for looking into this! I am also facing this issue quite frequently. Here are my installed extensions:

alefragnani.project-manager
Dart-Code.dart-code
Dart-Code.flutter
dbaeumer.vscode-eslint
denoland.vscode-deno
EditorConfig.EditorConfig
esbenp.prettier-vscode
GitHub.copilot
GitHub.copilot-chat
golang.go-nightly
henriiik.vscode-sort
mikestead.dotenv
stripe.markdoc-language-support
tamasfe.even-better-toml
Version: 1.84.0-insider (Universal)
Commit: 7dff07893ac160244db210e574eccc05f1cba2cb
Date: 2023-10-24T05:35:10.309Z
Electron: 25.9.2
ElectronBuildId: 24603566
Chromium: 114.0.5735.289
Node.js: 18.15.0
V8: 11.4.183.29-electron.0
OS: Darwin arm64 23.0.0
sebdanielsson commented 1 year ago
1Password.op-vscode
bierner.github-markdown-preview
bierner.markdown-checkbox
bierner.markdown-emoji
bierner.markdown-footnotes
bierner.markdown-mermaid
bierner.markdown-yaml-preamble
bradlc.vscode-tailwindcss
Brigit.devcontainer-image-convert
charliermarsh.ruff
christian-kohler.npm-intellisense
DavidAnson.vscode-markdownlint
dbaeumer.vscode-eslint
denoland.vscode-deno
esbenp.prettier-vscode
GitHub.codespaces
GitHub.copilot
GitHub.copilot-chat
GitHub.remotehub
github.vscode-github-actions
GitHub.vscode-pull-request-github
GrapeCity.gc-excelviewer
hashicorp.terraform
hediet.vscode-drawio
jakeboone02.cypher-query-language
mikestead.dotenv
ms-azuretools.vscode-docker
ms-dotnettools.vscode-dotnet-runtime
ms-kubernetes-tools.vscode-kubernetes-tools
ms-python.autopep8
ms-python.isort
ms-python.python
ms-python.vscode-pylance
ms-toolsai.jupyter
ms-toolsai.jupyter-keymap
ms-toolsai.jupyter-renderers
ms-toolsai.vscode-jupyter-cell-tags
ms-toolsai.vscode-jupyter-slideshow
ms-vscode-remote.remote-containers
ms-vscode-remote.remote-ssh
ms-vscode-remote.remote-ssh-edit
ms-vscode-remote.vscode-remote-extensionpack
ms-vscode.azure-repos
ms-vscode.cmake-tools
ms-vscode.cpptools
ms-vscode.cpptools-extension-pack
ms-vscode.makefile-tools
ms-vscode.remote-explorer
ms-vscode.remote-repositories
ms-vscode.remote-server
ms-vscode.test-adapter-converter
ms-vscode.vscode-markdown-notebook
ms-vscode.wordcount
ms-vsliveshare.vsliveshare
mtxr.sqltools
mtxr.sqltools-driver-pg
Oracle.oracledevtools
ParallelsDesktop.parallels-desktop
pomdtr.excalidraw-editor
redhat.ansible
redhat.vscode-yaml
ritwickdey.LiveServer
snyk-security.snyk-vulnerability-scanner
Tailscale.vscode-tailscale
timonwong.shellcheck
tomoki1207.pdf
twxs.cmake
Tyriar.sort-lines
VisualStudioExptTeam.intellicode-api-usage-examples
VisualStudioExptTeam.vscodeintellicode
VisualStudioExptTeam.vscodeintellicode-completions
vscode-icons-team.vscode-icons
waderyan.gitblame
jrieken commented 1 year ago

Thanks everyone. Looks like the intersection is

esbenp.prettier-vscode
GitHub.copilot
GitHub.copilot-chat
fenvalle commented 1 year ago

code --list-extensions

bibhasdn.unique-lines
earshinov.sort-lines-by-selection
j-brooke.fracturedjsonvsc
LoranKloeze.ruby-rubocop-revived
MateuszDrewniak.ruby-test-runner
mgmcdermott.vscode-language-babel
ms-vscode.vscode-typescript-next
sburg.vscode-javascript-booster
selfrefactor.order-props
Shopify.ruby-lsp
NullEnt1ty commented 1 year ago

@jrieken

Thanks everyone. Looks like the intersection is

esbenp.prettier-vscode
GitHub.copilot
GitHub.copilot-chat

Yep, I'm using those extensions too. Here's the full list:

Armitxes.sqf
astro-build.astro-vscode
bungcip.better-toml
CodeQue.codeque
darkriszty.markdown-table-prettify
dbaeumer.vscode-eslint
DotJoshJohnson.xml
eamodio.gitlens
EditorConfig.EditorConfig
emmanuelbeziat.vscode-great-icons
esbenp.prettier-vscode
GitHub.copilot
GitHub.remotehub
goz3rr.vscode-glualint
KatjanaKosic.vscode-json5
MS-CEINTL.vscode-language-pack-de
ms-python.isort
ms-python.python
ms-python.vscode-pylance
ms-vscode-remote.remote-ssh
ms-vscode-remote.remote-ssh-edit
ms-vscode-remote.remote-wsl
ms-vscode.azure-repos
ms-vscode.hexeditor
ms-vscode.remote-explorer
ms-vscode.remote-repositories
ms-vsliveshare.vsliveshare
nhoizey.gremlins
nidu.copy-json-path
platformio.platformio-ide
redhat.vscode-commons
redhat.vscode-yaml
ryu1kn.partial-diff
stkb.rewrap
timonwong.shellcheck
venner.vscode-glua-enhanced
VisualStudioExptTeam.intellicode-api-usage-examples
VisualStudioExptTeam.vscodeintellicode
wmaurer.change-case
zhuangtongfa.material-theme
taoky commented 1 year ago

https://github.com/microsoft/vscode/pull/196809 is not a very good workaround: what if the process exits, but another innocent process just takes this PID?

paulrutter commented 1 year ago

196809 is not a very good workaround: what if the process exits, but another innocent process just takes this PID?

It probably cannot do that as the pid is still taken? I agree it's not a great workaround, but until the root cause is clear it does solve the issue where a cpu is occupied.