biglocalnews / warn-scraper

Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites
https://warn-scraper.readthedocs.io
Apache License 2.0
29 stars 10 forks source link

Check formatting with pre-commit in continuous integration GitHub Actions #625

Closed chriszs closed 6 months ago

chriszs commented 6 months ago

While testing #623 I noticed pipenv run pre-commit run --all modified a bunch of files formatted with Black, isort, pyupgrade and it flagged a potential mypy issue (which seems to go away when I upgrade mypy). Apparently some linting and formatting issues have crept into the codebase, which could be because we're not enforcing pre-commit with a GitHub Action. Instead, there are steps for linting, which calls Flake8, and type checking, which calls mypy, but CI doesn't check Black, isort or pyuprade formatting.

It's a little bit of a mystery to me why the mypy step completes successfully when pre-commit mypy doesn't, but when I upgraded both the mypy pre-commit step and the mypy installed by Pipenv to 1.9.0 from 0.991 the supposed error went away (it seemed erroneous to begin with).

stucka commented 6 months ago

I've also run into problems I think with warn-transformer where the CI/CT routine running the same programs catches stuff that the command line doesn't.

Also have run into conflicts between these things, I think specifically around whitespace around operators.

It's infuriating and seems to harm more often than helping.

I could probably adapt the warn-transformer CI/CT routines here pretty easily, but I'm quite wary of introducing a blocker to actual production until I can get the old code cleaned up.

stucka commented 6 months ago

I want to make sure nothing else is going to be a blocker with moving this to production, but I hope to implement this soon. Thanks for calling it out, @chriszs !

stucka commented 6 months ago

(And also, it's almost certainly entirely my fault. I'm sorry!)

chriszs commented 6 months ago

I framed the issue and PR this way because I see it as a process issue, not anyone's personal failing, which does mean the solution is a little more far-reaching and harder to immediately accept, but should solve it once and for all (knock wood).

stucka commented 6 months ago

mypy is throwing quite a few errors visible with make lint that would need to get cleaned up before implementing CI. Sorted list below:


scrapers\co.py:46: error: Unexpected keyword argument "class_" for "find" of "str"  [call-arg]
scrapers\co.py:46: error: Value of type "Tag | NavigableString | int | Any | None" is not indexable  [index]
scrapers\co.py:47: error: Invalid index type "str" for "Tag | NavigableString | int | Any | None"; expected type "SupportsIndex | slice"  [index]
scrapers\co.py:60: error: Item "NavigableString" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\co.py:60: error: Item "None" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\ga.py:144: error: Item "NavigableString" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\ga.py:144: error: Item "None" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\hi.py:85: error: Item "None" of "Tag | None" has no attribute "prettify"  [union-attr]
scrapers\la.py:170: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
scrapers\mt.py:41: error: Item "NavigableString" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\mt.py:41: error: Item "None" of "Tag | NavigableString | None" has no attribute "find_all"  [union-attr]
scrapers\oh.py:48: error: Item "NavigableString" of "Tag | NavigableString | None" has no attribute "decode_contents"  [union-attr]
scrapers\oh.py:48: error: Item "None" of "Tag | NavigableString | None" has no attribute "decode_contents"  [union-attr]
scrapers\or.py:128: error: Argument 1 to "len" has incompatible type "bool | float | Decimal | str | CellRichText | date | time | timedelta | DataTableFormula | ArrayFormula"; expected "Sized"  [arg-type]
scrapers\or.py:48: error: Invalid index type "str" for "Tag | NavigableString | None"; expected type "SupportsIndex | slice"  [index]
scrapers\or.py:48: error: Value of type "Tag | NavigableString | None" is not indexable  [index]
scrapers\or.py:76: error: Invalid index type "str" for "Tag | NavigableString | None"; expected type "SupportsIndex | slice"  [index]
scrapers\or.py:76: error: Unsupported operand types for + ("str" and "list[str]")  [operator]
scrapers\or.py:76: error: Value of type "Tag | NavigableString | None" is not indexable  [index]
scrapers\or.py:76: note: Right operand is of type "str | list[str] | Any"
scrapers\or.py:98: error: Argument 1 to "len" has incompatible type "bool | float | Decimal | str | CellRichText | date | time | timedelta | DataTableFormula | ArrayFormula"; expected "Sized"  [arg-type]
scrapers\va.py:43: error: Invalid index type "str" for "Tag | NavigableString | None"; expected type "SupportsIndex | slice"  [index]
scrapers\va.py:43: error: Value of type "Tag | NavigableString | None" is not indexable  [index]
scrapers\wa.py:72: error: Value of type "Tag | NavigableString | None" is not indexable  [index]
scrapers\wa.py:74: error: Invalid index type "str" for "Tag | NavigableString | None"; expected type "SupportsIndex | slice"  [index]
scrapers\wa.py:75: error: Value of type "Tag | NavigableString | None" is not indexable  [index]
scrapers\wa.py:77: error: Invalid index type "str" for "Tag | NavigableString | None"; expected type "SupportsIndex | slice"  [index]```
chriszs commented 6 months ago

I was only able to reproduce after I added additional type stubs for Beautiful Soup 4 and openpyxl. Did an attempt to fix those issues, but it seems like tests are now failing. Have converted my PR to draft.

chriszs commented 6 months ago

Have now fixed some versioning issues and the PR is ready for review.