Open ahal opened 2 years ago
Hi @ahal,
I ran a profiler (pytest-profiling) on the tests and figured out that the subprocess.check_output()
function is the main contributer to the runtime.
It is called by the Repository.run()
method to run different commands like add
, commit
, reset
, etc. Here is the output of the profiler:
ncalls tottime percall cumtime percall filename:lineno(function)
52 0.001 0.000 22.405 0.431 runner.py:109(pytest_runtest_protocol)
52 0.002 0.000 22.392 0.431 runner.py:117(runtestprotocol)
156 0.001 0.000 22.389 0.144 runner.py:217(call_and_report)
2123/572 0.005 0.000 22.381 0.039 _hooks.py:244(__call__)
2123/572 0.002 0.000 22.379 0.039 _manager.py:77(_hookexec)
2123/572 0.015 0.000 22.378 0.039 _callers.py:9(_multicall)
156 0.001 0.000 22.357 0.143 runner.py:245(call_runtest_hook)
156 0.002 0.000 22.355 0.143 runner.py:316(from_call)
156 0.001 0.000 22.352 0.143 runner.py:260(<lambda>)
321 0.023 0.000 21.337 0.066 subprocess.py:417(check_output)
321 0.030 0.000 21.313 0.066 subprocess.py:506(run)
313 0.004 0.000 20.940 0.067 vcs.py:40(run)
321 0.021 0.000 20.673 0.064 subprocess.py:1163(communicate)
623 19.903 0.032 19.909 0.032 {method 'read' of '_io.TextIOWrapper' objects}
52 0.000 0.000 19.403 0.373 runner.py:158(pytest_runtest_call)
52 0.000 0.000 19.401 0.373 python.py:1787(runtest)
52 0.001 0.000 19.398 0.373 python.py:188(pytest_pyfunc_call)
32 0.001 0.000 4.690 0.147 vcs.py:254(get_changed_files)
2 0.000 0.000 3.414 1.707 test_util_vcs.py:465(test_workdir_outgoing)
52 0.000 0.000 2.870 0.055 runner.py:153(pytest_runtest_setup)
I don't know how to properly optimize it though.
Hi @miladHakimi, sorry I totally missed this mention in my notifications.
I'm not too surprised by this finding, the tests themselves make a lot of calls out to the underlying version control system (hg or git) to funciton. So any calls to the VCS will be captured in the check_output
call. For performance optimizations, we'll want to look into things like parallelization, reducing setup steps or looking into how to speed up the actual vcs (e.g, maybe use chg
or a command server for Mercurial).
Let me know if you're still interested in working on this and we can dive into some more details!
I'm interested in solving this issue. I can start exploring the code and look for areas of improvement including the things you mentioned. Then, write a small proposal of what to do so that we can iterate over it to find something that can be useful.
This test takes awhile, and since we run it across multiple Python versions, is the main reason why the
unit
task takes so long to run. It would be nice to figure out why it's so slow and see if we can speed it up at all.