pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
343 stars 95 forks source link

Add more pre-commit checks (mccabe, bandit, mypy, etc) #557

Closed djhoese closed 7 months ago

djhoese commented 8 months ago

Let's bring pyresample into the modern age of 5 years ago with some real complexity and type checking.

codecov[bot] commented 8 months ago

Codecov Report

Attention: 80 lines in your changes are missing coverage. Please review.

Comparison is base (d8f45cd) 94.13% compared to head (a7e1f25) 94.08%.

Files Patch % Lines
pyresample/utils/cf.py 68.18% 14 Missing :warning:
pyresample/kd_tree.py 91.72% 12 Missing :warning:
pyresample/plot.py 21.42% 11 Missing :warning:
pyresample/area_config.py 86.36% 9 Missing :warning:
pyresample/ewa/ewa.py 66.66% 9 Missing :warning:
pyresample/gradient/__init__.py 0.00% 6 Missing :warning:
pyresample/slicer.py 0.00% 5 Missing :warning:
pyresample/bilinear/__init__.py 0.00% 3 Missing :warning:
pyresample/bilinear/xarr.py 0.00% 2 Missing :warning:
pyresample/future/resamplers/nearest.py 60.00% 2 Missing :warning:
... and 5 more
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #557 +/- ## ========================================== - Coverage 94.13% 94.08% -0.06% ========================================== Files 84 84 Lines 13188 13228 +40 ========================================== + Hits 12415 12446 +31 - Misses 773 782 +9 ``` | [Flag](https://app.codecov.io/gh/pytroll/pyresample/pull/557/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pytroll) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/pytroll/pyresample/pull/557/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pytroll) | `94.08% <79.48%> (-0.06%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pytroll#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

coveralls commented 8 months ago

Coverage Status

coverage: 93.671% (-0.05%) from 93.72% when pulling a7e1f2508dcd49acc56f99b63e9e427cf6b53e9c on djhoese:precommit-mypy into d8f45cd72ae3b8dff96c33d693c4c704dd53d63f on pytroll:main.

djhoese commented 8 months ago

@mraspaud @pnuu I think this is ready. Let me know what you think. I took all of the pre-commit configuration from pre-ruff satpy. The biggest issue/difficulty was mccabe complaining about code complexity which isn't surprising given the age of this package. The biggest "offenders" were the new-ish area_config.py stuff which I blame partially on me and my supervision of Will Roberts on adding it. I think this could probably use a rewrite.

The biggest problem was the get_sample_from_neighbour_info method. This method is...a problem. I got to the point that I couldn't refactor it the way it needed without risking breaking backwards compatibility. Even though we have plenty of tests, I don't trust them to the extent that this method needs to be refactored. As a shortcut/cheat, I mostly just put chunks of code into a function and called it. This meant having to create functions that have a ton of arguments. I think in the long run this functionality will be rewritten in a resampler class in the future and to support DataArrays, dask arrays, and numpy arrays so I'm not tempted to try harder on this. That said, I think here are some avenues that someone could go down if they wanted the code to be better:

  1. A helper class: Many of the arguments passed around are kind of like a "state" that is used by multiple areas. A single class with those as instance attributes might make the code look cleaner.
  2. Remove all specialization of "multi-channel" arrays and instead always shape input into a 1 or more channel array. Then refactor things into shared functions that don't have to do with the "certainty" or "weighted" functionality so that this function could be split into 3 separate functions (maybe) with each of these 3 cases: nearest, weighted, weighted + certainty.
mraspaud commented 8 months ago

Here's the ruff config that would mimic what we have with flake8 today:

[tool.ruff]
# See https://docs.astral.sh/ruff/rules/
select = ["E", "W", "B", "D", "T10", "C90",]
line-length = 120

[tool.ruff.pydocstyle]
convention = "google"

[tool.ruff.mccabe]
# Unlike Flake8, default to a complexity level of 10.
max-complexity = 10

for bandit, that would be "S", but not I'm unsure the functionality is actually the same.

djhoese commented 8 months ago

I could not check that the logic was preserved in most places, but trust this is mostly refactoring, right?

Right. There are only two places that I can think of where things were more heavily changed:

  1. This is still a refactor, but the legacy area parsing was changed to take advantage of the generator-ness of the file handle. The code is much more readable and tests pass.
  2. I changed a warning to an exception in the area_config.py parsing of YAML formats where someone types a degrees unit, but likely misspells it. It was detecting "deg" being in units but the overall word was not "degrees" or "deg". I could/should maybe just take that check out completely.
djhoese commented 8 months ago

And how about using ruff in place of flake8, and maybe even bandit?

I had decided not to use it because I did not want to go down the path of all of the rules that I know you worked on for Satpy. The example config you have looks easy enough so I'll try updating to that today.

However, ruff still doesn't handle blank lines between functions/methods and it is pretty annoying to me (it has been annoying to me to review Satpy PRs where there are 3 or more lines between functions). What do you think about waiting until that is handled to switch to ruff?

for bandit, that would be "S", but not I'm unsure the functionality is actually the same.

How so?

mraspaud commented 8 months ago

However, ruff still doesn't handle blank lines between functions/methods and it is pretty annoying to me (it has been annoying to me to review Satpy PRs where there are 3 or more lines between functions). What do you think about waiting until that is handled to switch to ruff?

oh, ok, I had missed that. That's pretty annoying indeed. We could also have ruff run in parallel with (bare) flake8 then maybe. But otherwise we can wait, yes.

for bandit, that would be "S", but not I'm unsure the functionality is actually the same.

How so?

The "S" rule for ruff is an implementation of flake8-bandit, not bandit itself. And flake8-bandit is calling bandit, so I suppose we'd better call bandit directly.

djhoese commented 7 months ago

Ok so CodeFactor is complaining about 11 things which were all existing TODOs in these modules. Codebeat on the other hand says 0 introduced issues and 125 resolved. Not bad.

djhoese commented 7 months ago

Merging this now and then I'll hope to resolve conflicts in @ghiggi's PR tonight.