allenai / mmda

multimodal document analysis
Apache License 2.0
158 stars 18 forks source link

Egork/flake8 #171

Open egork520 opened 1 year ago

egork520 commented 1 year ago

Refactored the code to follow PEP8 style guide

Adding flake8 command to CICD pipeline

egork520 commented 1 year ago

In general, I am a little bit scared of such a giant PR touching so many files. Many of the changes make the code a lot less readable than before, and it would be good to manually refactor where necessary.

I agree and should probably asked before opening. Mix of styles in different parts of the code does not please my eyes.

Some questions:

1. Is this a flavor of PEP8 we like?

Some non-standard things I noticed:

  1. 120 chars lines It is up for discussion, the resolution of the screens I personally prefer longer lines.

  2. ok with mix of single and double quotes Personally I am used to single quotes unless double is needed. It is up for discussion.

  3. non-multiple indentation allowed

  4. new lines on binary operators

As an example, I went with the following in smashed:

[tool.black]
line-length = 79
include = '\.pyi?$'
exclude = '''
(
      __pycache__
    | \.git
    | \.mypy_cache
    | \.pytest_cache
    | \.vscode
    | \.venv
    | \bdist\b
    | \bdoc\b
)
'''

[tool.isort]
profile = "black"
line_length = 79
multi_line_output = 3

[tool.autopep8]
max_line_length = 79
in-place = true
recursive = true
aggressive = 3

[tool.mypy]
python_version = 3.8
ignore_missing_imports = true
no_site_packages = true
allow_redefinition = false

[tool.mypy-tests]
strict_optional = false

[tool.flake8]
exclude = [
    ".venv/",
    "tmp/"
]
per-file-ignores = [
    '*.py:E203',
    '__init__.py:F401',
    '*.pyi:E302,E305'
]

(Note that the pyproject.toml above requires flake8-pyi and Flake8-pyproject to run properly)

We could also align with other AI2 projects are using, e.g. Tango's.

Agree, might be worth formalizing style requirements for s2? Can even be part of 3 year vision plan (sharpen the saw)

2. we should probably adopt some auto-formatting tools we run before release

Again, in smashed I have a combo of flake8, isort, and autopep8 I ask contributors to run black . && flake8 . &&o isort .

3. Should we adopt mypy too?

Probably not immediately? But again, other projects use it.

4. Some of the automatic refactor kills semantics of comments

I left a note in a couple of places where this happens.