microsoft / vscode-python

Python extension for Visual Studio Code
https://aka.ms/pvsc-marketplace
MIT License
4.29k stars 1.18k forks source link

Python Rerun Failed Tests makes system unresponsive if you have lots of failed tests #21844

Closed tibbe closed 8 months ago

tibbe commented 1 year ago

Type: Performance Issue

  1. Have a larger number of failed Python tests from a previous full test run (e.g. 10-30 failed tests). The tests might have to be somewhat "chunky" (e.g. take a couple of seconds each, perhaps launch a thread or two. I used unit tests for FastAPI routes).

  2. Run "Rerun Failed Tests".

  3. CPU spikes to 100% (on M1 Mac) and VSCode needs to be killed for system to become responsive again.

The issue seems to be that many test processes will be launched in parallel (unlike when running all tests by pressing the top-level play button, which only launches one), DOS-ing the system. Is suspect that each pytest process probably launches a few threads for some "internal" parallelism as well.

To summarize: just running all tests launches 1 test runner (e.g. process) and runs lots of tests fine (presumably by limiting parallelism internally). Running "Run Failed Tests" launches N test runners, DOS-ing the system if you have enough failed tests.

Typical use case: run all your tests, find a bug, fix the bug, and try to rerun just the failed tests.

VS Code version: Code 1.81.1 (Universal) (6c3e3dba23e8fadc360aed75ce363ba185c49794, 2023-08-09T22:20:33.924Z) OS version: Darwin arm64 22.5.0 Modes:

System Info |Item|Value| |---|---| |CPUs|Apple M1 Pro (8 x 24)| |GPU Status|2d_canvas: enabled
canvas_oop_rasterization: disabled_off
direct_rendering_display_compositor: disabled_off_ok
gpu_compositing: enabled
metal: disabled_off
multiple_raster_threads: enabled_on
opengl: enabled_on
rasterization: enabled
raw_draw: disabled_off_ok
video_decode: enabled
video_encode: enabled
vulkan: disabled_off
webgl: enabled
webgl2: enabled
webgpu: enabled| |Load (avg)|5, 3, 2| |Memory (System)|16.00GB (0.06GB free)| |Process Argv|--crash-reporter-id ed2662a8-8445-4e6c-a43a-bb8dbcde152c| |Screen Reader|no| |VM|0%|
Process Info ``` CPU % Mem MB PID Process 22 115 52442 code main 9 66 52445 gpu-process 0 16 52448 utility-network-service 19 295 52450 window [1] (task_handler.py — benetics (Workspace)) 19 49 52782 shared-process 0 0 90115 /bin/ps -ax -o pid=,ppid=,pcpu=,pmem=,command= 0 16 52888 ptyHost 0 0 52890 /bin/zsh -il 0 33 89567 fileWatcher [1] 0 82 89568 extensionHost [1] 0 16 89591 /Users/tibbe/.pyenv/versions/3.10.12/bin/python /Users/tibbe/.vscode/extensions/charliermarsh.ruff-2023.32.0-darwin-arm64/bundled/tool/server.py 0 0 89592 /Users/tibbe/.pyenv/versions/3.10.12/bin/python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_server.py 0 0 89594 /Users/tibbe/work/benetics/backend/.venv/bin/python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_runner.py 0 0 89613 /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_runner.py 0 0 89631 /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_runner.py 0 0 89635 /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_runner.py 0 0 89656 /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python /Users/tibbe/.vscode/extensions/ms-python.black-formatter-2023.4.1/bundled/tool/lsp_runner.py 0 0 89593 /Users/tibbe/.pyenv/versions/3.10.12/bin/python /Users/tibbe/.vscode/extensions/ms-python.mypy-type-checker-2023.2.0/bundled/tool/lsp_server.py 0 49 89728 electron-nodejs (/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper (Plugin).app/Contents/MacOS/Code Helper (Plugin) --ms-enable-electron-run-as-node /Users/tibbe/.vscode/extensions/ms-python.vscode-pylance-2023.8.20/dist/server.bundle.js --cancellationReceive=file:81d676d46acf19e914fd1251a11e6446caf77b58bd --node-ipc --clientProcessId=89568) 0 16 90045 /Users/tibbe/work/benetics/backend/.venv/bin/python -m pytest --rootdir /Users/tibbe/work/benetics/backend --override-ini junit_family=xunit1 --junit-xml=/var/folders/pg/g5sspw6j71927rpj68ktyhvm0000gn/T/tmp-89568VvF7JTSk2eTF.xml 99 1769 90046 /Users/tibbe/work/benetics/backend/.venv/bin/python -u -c import sys;exec(eval(sys.stdin.readline())) 0 754 90047 /Users/tibbe/work/benetics/backend/.venv/bin/python -u -c import sys;exec(eval(sys.stdin.readline())) 0 0 90106 /Users/tibbe/work/benetics/backend/.venv/bin/python -c from multiprocessing.resource_tracker import main;main(17) 0 16 90048 /Users/tibbe/work/benetics/backend/.venv/bin/python -u -c import sys;exec(eval(sys.stdin.readline())) 0 16 90049 /Users/tibbe/work/benetics/backend/.venv/bin/python -u -c import sys;exec(eval(sys.stdin.readline())) 17 98 90113 window [3] (Issue Reporter) ```
Workspace Info ``` | Window (task_handler.py — benetics (Workspace)) | Folder (backend): more than 21179 files | File types: py(6554) json(5093) pyc(1648) pyi(1113) gz(524) so(310) | txt(147) typed(82) test(67) h(63) | Conf files: makefile(2) dockerfile(1) package.json(1) | Folder (app): more than 20409 files | File types: hpp(14349) h(1309) so(554) cpp(279) ipp(253) m(145) | flat(138) xml(133) json(128) S(78) | Conf files: package.json(1) tsconfig.json(1) | Folder (web): 254 files | File types: tsx(86) ts(57) json(20) yml(6) md(4) cjs(3) js(3) png(3) | env(1) development(1) | Conf files: package.json(1) tsconfig.json(1) | Folder (packages): 64 files | File types: ts(48) md(2) json(2) gitignore(1) js(1) cjs(1) | Conf files: package.json(1) | Folder (.github): 16 files | File types: yml(15) | Conf files: | Folder (scripts): 17 files | File types: yml(4) py(4) md(3) sh(1) json(1) | Conf files: | Folder (translations): 2024 files | File types: py(1095) pyc(306) json(134) txt(39) exe(14) typed(11) 11(6) | gitignore(5) rst(5) TAG(3) | Conf files: settings.json(1); ```
Extensions (15) Extension|Author (truncated)|Version ---|---|--- markdown-preview-github-styles|bie|2.0.2 boto3-ide|Bot|0.5.4 ruff|cha|2023.32.0 vscode-eslint|dba|2.4.2 es7-react-js-snippets|dsz|4.4.3 copilot|Git|1.103.338 vscode-github-actions|git|0.26.1 vscode-pull-request-github|Git|0.70.0 black-formatter|ms-|2023.4.1 mypy-type-checker|ms-|2023.2.0 python|ms-|2023.14.0 vscode-pylance|ms-|2023.8.20 vscode-jest|Ort|5.2.3 rewrap|stk|1.16.3 even-better-toml|tam|0.19.2
A/B Experiments ``` vsliv368cf:30146710 vsreu685:30147344 python383:30185418 vspor879:30202332 vspor708:30202333 vspor363:30204092 vslsvsres303:30308271 vserr242:30382549 pythontb:30283811 vsjup518:30340749 pythonptprofiler:30281270 vshan820:30294714 vstes263:30335439 vscoreces:30445986 vscod805cf:30301675 binariesv615:30325510 bridge0708:30335490 bridge0723:30353136 vsaa593cf:30376535 pythonvs932:30410667 vscaac:30438847 vsclangdc:30486549 c4g48928:30535728 dsvsc012:30540252 pynewext54:30695312 azure-dev_surveyone:30548225 vscccc:30803845 282f8724:30602487 f6dab269:30613381 2i9eh265:30646982 showlangstatbar:30737416 03d35959:30757346 pythonfmttext:30731395 pythoncmv:30756943 fixshowwlkth:30771522 showindicator:30805244 pythongtdpath:30769146 i26e3531:30792625 gsofb:30804716 pythonnosmt12:30797651 pythonidxptcf:30805731 pythonnoceb:30805159 dsvsc013:30795093 dsvsc014:30804076 diffeditorv2:30812749 ```
connor4312 commented 1 year ago

@eleanorjboyd this sounds like a problem in the way Python is running tests. VS Code will include each individual test that fails, sounds like Python tries to run everything with maximum concurrency. Let me know if you think this is a core issue instead.

tibbe commented 1 year ago

@eleanorjboyd this sounds like a problem in the way Python is running tests. VS Code will include each individual test that fails, sounds like Python tries to run everything with maximum concurrency. Let me know if you think this is a core issue instead.

Well if VSCode launches the pytest executable N times the total parallelism isn't something pytest can easily control (it would have to use some filesystem-based semaphore, which is probably doesn't). It can only control how much it runs in parallel per invocation.

From the VSCode logging that indeed seems to be what happens, as it seems to call the pytest executable N times. If you look in Activity Monitor you can also see that a lot of Python processes are launched.

eleanorjboyd commented 1 year ago

Hello! Is this issue occurring on the new python testing rewrite? You can check by looking for Experiment 'pythonTestAdapter' at the top of your logs in your python output. You likely will need to set your log level to trace in order for this message to get printed.

Let me know either way as this will help me narrow down where you are seeing this issue. Thanks

tibbe commented 1 year ago

Hello! Is this issue occurring on the new python testing rewrite? You can check by looking for Experiment 'pythonTestAdapter' at the top of your logs in your python output. You likely will need to set your log level to trace in order for this message to get printed.

Let me know either way as this will help me narrow down where you are seeing this issue. Thanks

It doesn't seem like I'm in that experiment:

2023-08-16 16:57:32.122 [info] Experiment 'pythonPromptNewFormatterExt' is active
2023-08-16 16:57:32.122 [info] Experiment 'pythonPromptNewToolsExt' is active
2023-08-16 16:57:32.122 [info] Test server listening.
eleanorjboyd commented 1 year ago

could you opt-in and see if that helps? you can do so by putting this in your settings"python.experiments.optInto": ["pythonTestAdapter"].

tibbe commented 1 year ago

Some more debugging. When rerunning 53 failed tests a ton of Python processes are spawned:

$  ps aux | grep python | wc -l
287

Before running the test:

ps aux | grep python | wc -l
26

Looking at one of these it seems it uses 5 threads:

ps M 44981
USER    PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
tibbe 44981   ??    0.3 S    31T   0:00.12   0:00.20 /Users/tibbe/work/benetics/backend/.venv/bin/python -m pytest --rootdir /Users/tibbe/work/benetics/backend --override-ini junit_family=xunit1 --junit-xml=/var/folders/pg/g5sspw6j71927rpj68ktyhvm0000gn/T/tmp-43648HWEMnkS7Ck73.xml 
      44981         0.0 S    31T   0:00.00   0:00.00 
      44981         0.0 S    31T   0:00.00   0:00.00 
      44981         0.0 S    31T   0:00.00   0:00.00 
      44981         0.0 S    31T   0:00.00   0:00.00 

If that carries over to every process then ~1,300 threads get launched at the same time.

Switching on "python.experiments.optInto": ["pythonTestAdapter"] does seem to fix the problem and ps confirms we have way fewer python processes. The only drawback I noticed is that running the whole batch (not just rerunning failed tests) seems a bit slower with some stragglers (i.e. a few tests take longer to finish than the rest, although the actual tests aren't bigger in any way).

eleanorjboyd commented 1 year ago

Hi! Glad the experiment fixed the first issue and will investigate the bug where "rerun failed tests" runs the whole batch of tests. Thanks

tibbe commented 1 year ago

Sorry for being unclear, rerun failed doesn't run all the tests. What I meant was that using the experimental runner, running all the tests seemed somewhat slower.

eleanorjboyd commented 1 year ago

Ah yes understood! Thanks for clarifying. The rewrite allows for dynamic results, meaning now we return tests as they individually finish instead of once all the tests run are finished. Therefore it is a bit harder to compare since we didn't have that originally. How big is the time difference in the new vs old and what type of tests are the straglers (unit, parameterized, etc). Also are these the first or last tests that are taking longer? Wondering if you are seeing the extra setup or cleanup time taken to start the process / kill it which would not apply to the tests run in the middle.

tibbe commented 1 year ago

Is there an easy way to measure the end-to-end time except for staring at the UI?

eleanorjboyd commented 1 year ago

@karthiknadig thoughts on the above question?

eleanorjboyd commented 9 months ago

bump @karthiknadig

eleanorjboyd commented 9 months ago

Hi @tibbe, Karthik is on vacation but in the meantime I think this will help you determine the timing: https://github.com/microsoft/vscode/wiki/Performance-Issues#profiling-the-renderer-process

karthiknadig commented 9 months ago

@eleanorjboyd That is the correct way to get profiling for this scenario.

github-actions[bot] commented 8 months ago

Because we have not heard back with the information we requested, we are closing this issue for now. If you are able to provide the info later on, then we will be happy to re-open this issue to pick up where we left off.

Happy Coding!