gristlabs / asttokens

Annotate Python AST trees with source text and token information
Apache License 2.0
172 stars 34 forks source link

CI is absurdly slow #129

Closed PeterJCLaw closed 1 year ago

PeterJCLaw commented 1 year ago

CI seems to take ~30 minutes to run pytest, which locally takes about 10 seconds. Do we have any idea what's causing this?

alexmojaki commented 1 year ago

Because CI has ASTTOKENS_SLOW_TESTS=1.

alexmojaki commented 1 year ago

It looks like https://github.com/gristlabs/asttokens/pull/128 does speed it up, which makes sense because the main slow test runs twice, once for ast and once for astroid. So getting that working would help. If it's possible to effectively use more than 2 workers then parametrising that test could help further.

It might also help to run the slow tests without coverage, and if that causes coverage to drop then we should be writing normal deterministic tests to fill the gap.

PeterJCLaw commented 1 year ago

It looks like the initial speedup from #128 was in part due to it running without coverage. That was actually what was failing. Adding in pytest-cov fixes that, but the tests are slow again. I like the idea of the slow tests running without coverage.

Separately it looks like the slow tests are in fact all in a single test case, which means that parallelisation can only help so far with that. Adding in parameterized kinda helps with that, however the ordering of the sys.modules isn't stable and the tests can flake out. Applying sorting helps, but I'm not sure if we'd see divergences in the members of sys.modules between workers too. Needs more investigation.

alexmojaki commented 1 year ago

It's a single test method, but it lives in TestMarkTokens which is subclassed by TestAstroid so there are two versions of the test that run. But I don't know how the division of tests works, so maybe a single worker was allocated both of them.

PeterJCLaw commented 1 year ago

pytest-xdist has a number of strategies for worker allocation. My understanding of the default is that workers for tests are allocated in groups on the fly rather than upfront, so I think in general the two long tests should end up on separate workers. I think we could also explicitly configure those tests to end up on separate workers with pytest marks. Docs for the strategies are at https://pytest-xdist.readthedocs.io/en/stable/distribution.html

PeterJCLaw commented 1 year ago

Ah, we might want to be using --dist worksteal if we see highly variable times.

PeterJCLaw commented 1 year ago

Is the --junitxml=./rspec.xml used for anything? I can't see a use for it, but I don't know if it's somehow implicitly needed by coveralls or something.

PeterJCLaw commented 1 year ago

Another idea here might be to break the slow tests out into their own CI job. That would mean that things like coverage, linting & the main suite of tests could give their feedback sooner, while still running the slower tests.