Add mypy and deptry as pre-commit hooks

aaccioly commented 1 year ago

Currently mypy is not running automatically (neither locally nor as part of GitHub's pipeline).

A user has to manually run:

make check

If he wants to trigger type checking.

deptry runs as part of the pipeline, but, IMO, it should be also run as part a pre-commit hook as catching dependency problems after they have been introduced is not great .

Proposed solution

Add both tools to pre-commit

Additional context

I've just recently adopted cookiecutter-poetry (thanks a million for it) and butchered some type hints.

I was under the impression that mypy was running automatically but unfortunately this wasn't the case. Later I've realised that there's a dedicated make target for linting, but had to read the Makefile to find it out (see #79).

As both mypy and deptry are able to run very fast I would have no problem running both of them on every commit.

fpgmaas commented 1 year ago

Thanks for raising the issue! Sorry that this flaw in cookiecutter-poetry allowed you to butcher your typehints. I think the issue here is that mypy is triggered in the tox tests, since it should be verified for the different versions of Python; see here in the example repository. I assume that you have tox disabled, which causes the mypy checks to be skipped.

This is of course not the desired behavior. An approach to solve this is to run the mypy tests in tox when tox="y", and in the main.yml workflow otherwise.

Alternatively, we could simply runmake check in the main.yml workflow. This does mean that for one Python version, the mypy checks are ran twice, but it is easier to implement and perhaps more clear to users of the repository.

I am not a big fan of adding both tools to pre-commit, since that also includes it's own set of issues and workarounds, see e.g. here.

Regarding deptry; that should always be run within the main workflow; see here.

I will give the above a bit more thought, curious for your thoughts as well. I will try to pick this up over the weekend.

aaccioly commented 1 year ago

Hi fpgmas, thanks for the prompt feedback and the good will on random requests.

I assume that you have tox disabled, which causes the mypy checks to be skipped.

Bingo! I wrongly assumed that tox was only there to test against different versions of Python, but now that I think about all of the recent evolution in Python's typing I can see why you want to run it with tox.

Alternatively, we could simply runmake check in the main.yml workflow. This does mean that for one Python version, the mypy checks are ran twice, but it is easier to implement and perhaps more clear to users of the repository.

I agree, it's fast enough than running it a second time in the pipeline isn't a big deal.

I will give the above a bit more thought, curious for your thoughts as well.

Your point about mypy is fair. ruff suffers with similar problems when checking one file at a time without access to dependencies (see https://github.com/charliermarsh/ruff/issues/1220).

I took a shotgun approach and am running all tools against all files on every change. It's really faster than one may think (ruff is blazing fast, mypy has good caching policies, folks are pretty much used to running black on every change and you likely have a more educated opinion tabout deptry than my own- but running it on every commit is not bothering me at all).

My solution:

For my own project initialised with your template here's what I did (and what seems to be working very well):

Updated mypy and deptry + add ruff and black to poetry as dev dependencies

[tool.poetry.group.dev.dependencies]
pytest = "^7.2.0"
pytest-cov = "^4.0.0"
deptry = "^0.8.0"
mypy = "^1.0.0"
pre-commit = "^2.20.0"
tox = "^3.25.1"
black = "^23.1.0"
ruff = "^0.0.246"

Change check target to run every code quality tool against every file

.PHONY: check
check: ## Run code quality tools.
    @echo "🚀 Checking Poetry lock file consistency with 'pyproject.toml': Running poetry lock --check"
    @poetry lock --check
    @echo "🚀 Linting code: Running ruff"
    @poetry run ruff check .
    @echo "🚀 Formatting code: Running black"
    @poetry run black .
    @echo "🚀 Static type checking: Running mypy"
    @poetry run mypy
    @echo "🚀 Checking for obsolete dependencies: Running deptry"
    @poetry run deptry .

Pre-commit runs `make check` (only once per commit)

repos:
  - repo: local
    hooks:
      - id: check
        name: Run code quality tools
        language: system
        entry: make check
        pass_filenames: false
        always_run: true
        verbose: true

Removed extra steps from the pipeline as pre-commit will take care of everything

jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - name: Check out
        uses: actions/checkout@v3

      - uses: actions/cache@v3
        with:
          path: ~/.cache/pre-commit
          key: pre-commit-${{ hashFiles('.pre-commit-config.yaml') }}

      - name: Set up the environment
        uses: ./.github/actions/setup-poetry-env

      - name: Run pre-commit
        run: poetry run pre-commit run -a --show-diff-on-failure

Is working brilliant for me. Fast enough and overcomes the problems with mypy and ruff mentioned above. It also shifts concerns left and allows me to handle typing and linting issues before opening a PR.

PS: I have disabled fix = true for ruff, but this is a matter of preference. One can also let ruff fix stuff for them, although I would personally want to run tests before committing one of ruff's automatic fixes. A middle ground alternative would be to run ruff check --fix --exit-non-zero-on-fix).

fpgmaas commented 1 year ago

Thanks for your elaborate and clear answer! I think this also a very good project set-up that I could see myself using. However, I am a bit hesitant to apply this way of working to cookiecutter-poetry, since running all tools against all files on every change may not work very well for everyone. For example, I can see this becoming an issue for large Python projects. There might also just be a bit of hesitancy from my side since I have not battle-tested this approach myself :)

So if you think the following solution is also okay;

Alternatively, we could simply runmake check in the main.yml workflow. This does mean that for one Python version, the mypy checks are ran twice, but it is easier to implement and perhaps more clear to users of the repository.

I'd prefer to implement that.

However, if you have a fork of the repo adapted to your approach, I'd be happy to mention it in the README.md to make other users aware of the alternative!

aaccioly commented 1 year ago

I can see this becoming an issue for large Python projects. There might also just be a bit of hesitancy from my side since I have not battle-tested this approach myself

This if fair enough. I assumed that this would be case (and hence why I wrote a long comment instead of a PR :)).

I'd prefer to implement that.

I think that this would be the next best alternative yes. The tradeoff is that folks doing trunk base development would still need to run mypy manually (or find out about problems after commiting - which isn't ideal). However, given the choice between running fast but semi-broken mypy as a pre-commit plugin vs running it properly in the CI/CD pipeline I guess that your instinct is correct.

if you have a fork of the repo

I've implemented all of the above on a fully initialized repo instead of a fork of the Cookiecutter template. I also don't think that the changes are significant enough to deserve their own fork. Tell you what, if you are really interested in the approach described above (as in, not only trying to be nice to me🤣) what do you think about creating a branch? Say shift-left-linters. I'll submit a PR to it and we'll try to keep it up-to-date with main for a while. If it ends up working well, we can later think of merging it to main hidden behind a non-default flag. What do you think?

fpgmaas commented 1 year ago

@aaccioly I have thought about the proposal above; and even though I think it can definitely be useful for some users, I prefer to not add it as a non-default flag to this repo. The main reason for that is that I find it difficult to properly test all the combinations of configurations (which also caused this issue), which is why I keep this repository simple and a a bit opinionated and I try to limit the configuration possibilities. Adding this feature will double the number of combinations to be tested, and I don't think I can support that this keeps working with any new changes going forward.

I hope that is okay with you!

If this issue gains more traction in the future, or more people voice their demand for such a configuration, I'd be happy to reconsider.

fpgmaas / cookiecutter-poetry