Open simon-weber opened 6 years ago
Most of this ends up being explained by pyc+pytest cache write times.
@simon-weber shouldn't we still take a look at resolving this? There are ways to make this more efficient, I believe (I mean, why can't we do the assertion rewriting on the coordinating process, for example?)
i mean why cant we do the assertion rewriting on the coordinating process for example
You mean on the master node? This implies collecting on the master first, I believe.
Ah, if you think there's potential for improvement I'm happy to help. I closed this since the more I looked into it the more I realized I just didn't understand what was going on (especially around assertion rewriting).
In case it's helpful, here are the biggest things I noticed from some profiling:
find_module
here was also a ~2x speedup
Reopened, as structural issues on our side have clearly been demonstrated.
This has an interesting interaction with coverage collection. Since the coverage settrace needs to happen before collection, the trace overhead applies there as well -- even if coverage isn't being calculated for collected modules. In my little benchmarks, using --cov for a dummy file with one line slows collection down by up to 50%.
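To illustrate the trace overhead in isolation, here is a minimal sketch using only the standard library. It is not how coverage or pytest-cov install their tracer; `busy_work`, `noop_tracer`, and `timed` are hypothetical names made up for this example. It just times the same function with and without a do-nothing `sys.settrace` hook installed.

```python
import sys
import time


def busy_work(n=20000):
    # Stand-in for module-level code executed during collection (hypothetical).
    total = 0
    for i in range(n):
        total += i
    return total


def noop_tracer(frame, event, arg):
    # Returning the tracer keeps line-level tracing enabled for this frame,
    # so every executed line pays the callback cost even though nothing is recorded.
    return noop_tracer


def timed(fn, tracer=None):
    # Install the tracer (or none), run fn, and restore whatever was set before.
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        start = time.perf_counter()
        result = fn()
        elapsed = time.perf_counter() - start
    finally:
        sys.settrace(previous)
    return result, elapsed


if __name__ == "__main__":
    _, plain = timed(busy_work)
    _, traced = timed(busy_work, noop_tracer)
    print(f"untraced: {plain:.4f}s, traced: {traced:.4f}s")
```

The exact ratio will depend on the interpreter version and the workload, so treat the numbers as indicative only.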
We've been running the patch from https://github.com/pytest-dev/pytest-xdist/issues/299 successfully for a while. One thing I've noticed is that with large test suites, it can take a while before the first worker begins running tests.
From what I can tell, this is due to collection overhead: with execnet debugging on, that's all that's going on during that time. Preventing workers from sending pytest_collectreport keeps the master idle during that time but doesn't speed things up, so presumably the bottleneck is the workers themselves. The collection time also does not seem affected by the number of workers.
Interestingly, collection under xdist seems much slower than just pytest: in my example of ~16k tests, it's ~1 minute vs ~4 minutes. My guess is there's something like https://github.com/pytest-dev/pytest-xdist/issues/279 going on.
Next I'm hoping to try profiling the workers during collection, but since that looks like it'll be painful I figured I'd ask for advice first. Is there anything notable that xdist collection does differently from pytest?
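As a starting point for the profiling, here is a rough sketch of a helper built on the standard library's `cProfile` and `pstats`. `profile_call` is a hypothetical name, not part of pytest or xdist. It only profiles a callable in the current process, so it would cover plain pytest collection but not the xdist worker processes, which would need the profiler started inside each worker.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=15, **kwargs):
    # Run fn under cProfile and return its result plus a text report
    # listing the `top` entries sorted by cumulative time.
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def sample_workload():
    # Placeholder workload so the sketch runs without pytest installed.
    return sum(range(1000))


if __name__ == "__main__":
    _, report = profile_call(sample_workload)
    print(report)
    # Assumed usage for collection in a single process (requires pytest):
    #   import pytest
    #   _, report = profile_call(pytest.main, ["--collect-only", "-q"])
```

Comparing the top cumulative entries from a plain run against whatever can be captured inside a worker should show whether the extra time is in import machinery such as find_module or somewhere else.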