Closed AlexWaygood closed 1 year ago
It also looks like some changes are now being reported twice in the diff, for some reason: https://github.com/python/typeshed/pull/9308#issuecomment-1336169558
I think 592a1f7972989e967da17bbbfb851e27aea9202a may have made things worse. Shard 0 is the slowest by far and includes a lot of projects (https://github.com/python/typeshed/actions/runs/3608748067/jobs/6081526467). Shard 0 gets the slowest project (pandas), with cost 120, as well as a lot of small projects with the default cost of 3. Apparently that works out to an unbalanced distribution.
I do think the approach in the commit is correct if the input data is correct and precise, but unfortunately it doesn't seem to be. Perhaps we could set something up where mypy-primer runs record their performance somewhere, and later runs use that data to figure out the sharding strategy. That seems quite tricky though. Failing that, it may be better to return to random sharding.
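For reference, here is a minimal sketch of what cost-based sharding could look like, assuming a greedy "largest cost first, into the currently lightest shard" strategy. This is not taken from mypy_primer's code; `shard_by_cost` and the `small_N` project names are hypothetical, and only the costs (120 for pandas, 3 as the default) come from the discussion above.

```python
import heapq


def shard_by_cost(costs: dict[str, int], num_shards: int) -> list[list[str]]:
    """Greedily assign projects to shards, most expensive project first.

    Hypothetical helper for illustration; not mypy_primer's implementation.
    """
    # Min-heap of (total cost so far, shard index): the lightest shard pops first.
    heap = [(0, i) for i in range(num_shards)]
    shards: list[list[str]] = [[] for _ in range(num_shards)]
    # Sort by descending cost, breaking ties by name so the result is deterministic.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (total + cost, idx))
    return shards


# One expensive project plus forty projects at the default cost of 3.
costs = {"pandas": 120, **{f"small_{i}": 3 for i in range(40)}}
shards = shard_by_cost(costs, num_shards=2)
totals = [sum(costs[name] for name in shard) for shard in shards]
print(totals)
```

With correct inputs, a strategy like this would put pandas alone in one shard and the small projects in the other, so a shard holding pandas plus many small projects suggests either the cost data or the shard assignment itself is off.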
And I'm not sure what's up with the comment being posted twice.
Oh it's much simpler: shard 0 includes all projects. That's why comtypes is in both shard 0 and shard 1: https://github.com/python/typeshed/actions/runs/3608748067/jobs/6081526467, https://github.com/python/typeshed/actions/runs/3608748067/jobs/6081526518.
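A cheap sanity check could catch this kind of overlap. The sketch below is only an illustration: `find_duplicated_projects` is a hypothetical helper, and the shard contents are made up apart from comtypes and pandas, which are mentioned in this thread.

```python
def find_duplicated_projects(shards: list[list[str]]) -> dict[str, list[int]]:
    """Map each project that appears in more than one shard to those shard indices.

    Hypothetical helper for illustration; not part of mypy_primer.
    """
    seen: dict[str, list[int]] = {}
    for idx, shard in enumerate(shards):
        # set() so a project repeated within one shard is only counted once.
        for name in set(shard):
            seen.setdefault(name, []).append(idx)
    return {name: idxs for name, idxs in seen.items() if len(idxs) > 1}


# Made-up shard contents: shard 0 wrongly holds everything, shard 1 holds comtypes.
shards = [["pandas", "comtypes"], ["comtypes"]]
print(find_duplicated_projects(shards))  # {'comtypes': [0, 1]}
```

An empty result would mean the shards are disjoint, which is what we want before kicking off the jobs.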
mypy_primer jobs seem to be taking 30+ minutes on typeshed PRs today. The norm previously was around 12-13 minutes.
https://github.com/python/typeshed/actions/workflows/mypy_primer.yml