nbQA-dev / nbQA

Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
https://nbqa.readthedocs.io/en/latest/index.html
MIT License

bug: Wrong flake imported from environment #665

nicain closed 3 years ago

nicain commented 3 years ago

This is a very strange bug: it appears as if the introspection that allows running flake8 from nbqa is pulling the flake8 module from the wrong location (another project) instead of the flake8 that is installed by pip.

This had me confused for a long time, but I think this is a pretty deep problem with how the CLI is driving flake8.

Steps to reproduce from a clean virtualenv:

virtualenv venv
source venv/bin/activate
pip install parlai==1.5.0     # See comment (1) below
git clone git@github.com:GoogleCloudPlatform/vertex-ai-samples.git    # See comment (2) below
cd vertex-ai-samples
git checkout 97ced04adb62ae63c1132942ff1bafece558926a  # See comment (3) below
python3 -m pip install -U -r .github/workflows/linter/requirements.txt
python3 -m nbqa flake8 '/usr/local/google/home/nicholascain/vertex-ai-samples/notebooks/official/automl/automl-text-classification.ipynb' --show-source --extend-ignore=W391,E501,F821,E402,F404,W503,E203,E722 --nbqa-mutate   # See comment (4) below

(1) parlai (at version 1.5.0) has a module called flake8.py; this will be mistaken for the intended flake8 later on, which is why this odd dependency is installed.

(2) This is the repo I was working on when I noticed the issue, so it is the shortest path to reproducing the problem consistently.

(3) This repo is under active development, so I am pinning the commit here so that these reproduction instructions don't change in the future.

(4) The result of this run demonstrates the problem. When I execute this line, I see:

# %%NBQA-CELL-SEP
^
/usr/local/google/home/nicholascain/vertex-ai-samples/notebooks/official/automl/automl-text-classification.ipynb:cell_1:0:1: PAI201 Missing copyright `Copyright (c) Facebook, Inc. and its affiliates.`
# %%NBQA-CELL-SEP
^
/usr/local/google/home/nicholascain/vertex-ai-samples/notebooks/official/automl/automl-text-classification.ipynb:cell_1:1:1: PAI202 Missing copyright `This source code is licensed under the MIT license found in the`
# Copyright 2021 Google LLC
^
/usr/local/google/home/nicholascain/vertex-ai-samples/notebooks/official/automl/automl-text-classification.ipynb:cell_1:2:1: PAI203 Missing copyright `LICENSE file in the root directory of this source tree.`
#
^

The part about "Facebook" had me totally confused, because I couldn't find any reference to this flake8 error code (PAI201) anywhere except parlai. The flake8 source code has no reference to Facebook either. But the flake8.py module inside of parlai does; this tells me that nbqa is grabbing the wrong flake8 when it introspects the environment to run the CLI command.

Weird. Good luck!

MarcoGorelli commented 3 years ago

First, nbqa will just call python -m flake8; whatever that points to is what it'll use.
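One way to see what python -m flake8 "points to" in a given environment is to ask the import system where it would load the module from. The snippet below is a minimal sketch, not part of nbQA; the helper name module_origin is made up for illustration, and it only reports the top-level module location, not any plugins.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file Python would load for top-level module `name`.

    Returns None if the module cannot be found (or has no file origin,
    as is the case for namespace packages).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Run this with the same interpreter that runs nbqa.
    print(module_origin("flake8"))
```

If the printed path sits inside the active virtualenv's site-packages/flake8 directory, the interpreter is resolving the pip-installed flake8 rather than a same-named module from another project.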

Second, you're using a version of nbQA that's pretty old, so I'd suggest upgrading.

Finally, seeing as you're using nbQA at Google, please consider sponsoring the project

MarcoGorelli commented 3 years ago

Just checked, and parlai actually installs a flake8 extension: https://github.com/facebookresearch/ParlAI/blob/main/setup.py

That's where the PAI code comes from, and this is working as expected
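To see which flake8 extensions are active in an environment, the installed entry points can be listed. This is a rough sketch that assumes plugins are registered under the flake8.extension entry-point group, which is the group flake8 uses for checker plugins; the function name plugins_in_group is hypothetical, and the fallback branch is there for Python 3.8 and 3.9, where entry_points() returns a plain dict.

```python
from importlib.metadata import entry_points
from typing import List, Tuple


def plugins_in_group(group: str = "flake8.extension") -> List[Tuple[str, str]]:
    """List (name, target) pairs for entry points registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a dict mapping group name to entry points.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    for name, target in plugins_in_group():
        print(f"{name} -> {target}")
```

In an environment like the one in the report, an entry whose target lives under the parlai package would account for the PAI codes. Running python -m flake8 --version also prints the installed plugins alongside the flake8 version.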

So, this isn't a bug, and the only deep problem here is that Google is using free tools without contributing a penny back, despite it being one of the richest companies in the world