pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.33k stars 637 forks source link

Should Python linters/checkers stream their partitioned results? #13380

Open Eric-Arellano opened 3 years ago

Eric-Arellano commented 3 years ago

For Python, we may return >1 result. For example, MyPy and Flake8 can partition by interpreter constraints. When that happens, we don't dump the results until all partitions have run. Instead, we could do something like https://github.com/pantsbuild/pants/pull/13379#pullrequestreview-791200100 to stream results.

However, I expect >90% of users don't use this partition feature. It would be confusing to render the per-partition result when there is only one partition, given that we will also still render CheckResults and LintResults due to the code in check.py and lint.py. So, we should probably only stream "partitions" when they are actually used.

This question also applies to the option --lint-per-file-caching.

--

Concretely, we might want to remove the CheckResults and LintResults feature in favor of having to return a single CheckResult/LintResult a la https://github.com/pantsbuild/pants/pull/13379#pullrequestreview-791200100. Let each plugin determine if it wants to stream. (Although, that would make support for --lint-per-file-caching harder to implement)

stuhood commented 3 years ago

Concretely, we might want to remove the CheckResults and LintResults feature in favor of having to return a single CheckResult/LintResult a la #13379 (review). Let each plugin determine if it wants to stream. (Although, that would make support for --lint-per-file-caching harder to implement)

Ideally the various linters could be oblivious to the number of partitions that are created, and instead just operate on whatever they're given... that would hopefully mean they wouldn't need to care about how many instances there were. But I think that that would require making the partitioning constraints more declarative, so that lint/check could execute the partitioning before calling the tools.

This relates to a conversation from the other day: it would be great to be able to optimize partition sizes independently of the constraints (one instance of yapf is too few on a 64 core machine, while one per file is too many: https://pantsbuild.slack.com/archives/C046T6T9U/p1635271709151400), and removing most of that logic from individual linters would help enable that. Then we could play with partition sizes independently of constraints.

stuhood commented 2 years ago

Relates to #13462.

But: yes: we probably should stream the results of individual partitions, in addition to cleaning them up / making them quieter by default, á la https://github.com/pantsbuild/pants/pull/14129.