Closed kaxil closed 4 months ago
I've taken:
Might be an idea @kaxil to add a checklist for all rules needing updating in the initial issue comment. This way potential contributors can know which ones are done and which are still open to do? Just a quality of life idea so we don't have to look at the PRs individually.
@pcandoalmeida That's a good idea, I updated the description and added a check-box.
Hi @kaxil I'm working on D202 now.
Morning @kaxil what do you think about adding the Hacktoberfest
label to this issue?
Morning @kaxil what do you think about adding the
Hacktoberfest
label to this issue?
Good idea again, just added :)
Hi I wanna try D205. Can I try this ?
I started to fix D205 (1 blank line required between summary line and description). It's difficult because a lot of docs has not summary line.
For example, https://github.com/apache/airflow/blob/d86cf37a35b69fea806be7610deb332468dfba4b/airflow/configuration.py#L481-L486
This docs has no summary line and there are a lot of docs like this. I cannot make summary line for all these docs. If I deal all docs as summary line, the doc obey D205. But it is not easy to read.
How can I do that? If someone has any idea, please help me.
@ktrueda You can do 2 things:
1) Add a summary line:
Remove an option if it exists in config from a file or
default config.
If both of config have the same option, this removes
the option in both configs unless remove_default=False.
2) Only update blank line for which docstrings exists. In this case, we won't be able to enable D205 for all of them but when in future we have summary lines, it would make the word easier.
I would prefer (1) though
@kaxil Thank you for your help. I would prefer (1) too. So I'll try that way.
Since I found thousands of docs to fix, I need much times. Leave it to me.
@kaxil Can I take D200?
Hiya 👋🏼 @potix2 I believe D200 is currently being worked on 😃
@pcandoalmeida Yeah, I'm working on it. I'll send a pull request soon.
I'm working on D400.
I would like to tackle this issue if it hasn't been assigned recently. Thanks! @uranusjr :)
Go ahead 👍 Note that the ignore list above is out of date; please find the correct list in the pre-commit config file on main
.
status update:
many thanks!
Maybe try docformatter? You’d probably want to do this in small batches, maybe even one file at a time, to see how well it works.
Maybe try docformatter? You’d probably want to do this in small batches, maybe even one file at a time, to see how well it works.
sounds good. I will give docformatter
a shot. thank you for the tips. @uranusjr. 🙏
I'm working on D400.
hi @potix2 Do still wish to work on it?
@eladkal I gave up D400, because I got too many conflicts. I'm sorry for late reply.
@eladkal sorry I give up on D205. I will let other developers tackle this one 😞
@eladkal, is somebody working here: D400(First line should end with a period)? If not, I'd like to give it a try.
feel free!
@eladkal @kaxil I took D400, there are a lot of files there 😄 . My question is in this part where we have a long first line
Pre commits (D400) is asking to add a period at the end of the line, It is not the case, I think. Should I just remove the new line and convert it in a single line to avoid have that issue?
Simply removing the newline would trigger another check complaining the line is too long. It would be best to rewrite it and split the sentence into two shorter ones. This particular exampel can be rewritten to
Check metric values are within a certain tolerance.
The metrics are given as SQL expressions, and the tolerance with parameter *days_back*.
Simply removing the newline would trigger another check complaining the line is too long. It would be best to rewrite it and split the sentence into two shorter ones. This particular exampel can be rewritten to
Check metric values are within a certain tolerance. The metrics are given as SQL expressions, and the tolerance with parameter *days_back*.
Hard to automate I am a afraid :(
@potiuk @uranusjr
I counted all the changes that are needed for D400, they are 3150 changes. Finish this in a single PR could be hard. I got an idea. How about to keep ignoring D400 in pre-commits - --add-ignore=D100,D102,D103,D104,D105,D107,D202,D400,D401,D205
and still working in minitaks (100 or 200 files) and send PR fixing what @uranusjr is saying? I could make progress in this way. What do you think?
@edithturn SGTM. The smaller the changes are, the easier it will be to review and merge.
Yep
Thanks, it is better!
@edithturn Still working on D400?
@bdsoha D400 is huge it involved many files. help is appreciated.
@eladkal I'm always glad to help, just want to make sure there is no overlap. Is there a way to concretely subdivide the work?
@bdsoha thank you for asking. I found 3150 changes to make and I tried to do it by parts. If you find another way to make this easier let me know so I can also jump in. Appreciate it 🚀
@bdsoha thank you for asking. I found 3150 changes to make and I tried to do it by parts. If you find another way to make this easier let me know so I can also jump in. Appreciate it 🚀
I want to know more or less what you are currently working on so that we don't both attempt work on the same resources.
yay! My last change was pushed :)
So I am not working on this right now.
The approach that was following was:
Add this validation D400 to the pre-commits, evaluate all the files, and start fixing them by parts. Before sending the PR, again remove D400 from pre-commits so that it lets us push the changes :)
Let me know if that helps!
@edithturn Just needed to know that you are not working on it simultaneously. Thanks!
@kaxil @potiuk @eladkal Once the above related PR are merged, it might be worth it to add the following to guarantee conforming to styles on already handled directories:
# airflow/.pre-commit-config.yaml
- repo: https://github.com/pycqa/pydocstyle
rev: 6.1.1
hooks:
- id: pydocstyle
name: Run pydocstyle
args:
- --convention=pep257
- --add-ignore=D100,D102,D103,D104,D105,D107,D205,D400,D401
include: |
^airflow/api.*\.py$|
^airflow/api_connexion.*\.py$|
^airflow/callbacks.*\.py$|
^airflow/cli.*\.py$|
^airflow/datasets.*\.py$|
^airflow/decorators.*\.py$|
^airflow/exceptions.py$|
^airflow/hooks.*\.py$|
^airflow/lineage.*\.py$|
^airflow/listeners.*\.py$|
^airflow/macros.*\.py$|
^airflow/security.*\.py$|
^airflow/sensors.*\.py$|
^airflow/tasks.*\.py$|
^airflow/timetables.*\.py$|
^airflow/triggers.*\.py$
@bdsoha Can you create a PR with your suggestion
You mark D400 as done
Given that https://github.com/apache/airflow/pull/28893 removed pydocstyle
in favor of ruff
Do we have a way to enforce styles?
D400 is completed but we have no way to make sure new code is comply with D400 convention.
D400 is completed but we have no way to make sure new code is comply with D400 convention.
We do. It is part of static checks, if your changes do not comply with D400 convention, static check will fail
There are almost a thousand D205 violations, I've taken a bite out of that. 59 down, 878 to go! :stuck_out_tongue:
Let's hope they go down faster than new are coming :D
Most are easy. Every docstring must start with a one-line summary which ends in a period. If there is more than one sentence then there must be a blank line after the first. It gets a little tricky for docstrings like Does the thing. If it fails then it does this.
because you have to merge that into one sentence which fits on one line or rephrase it and add a gap. Also, sometimes the docstring IS one sentence but its a long sentence so you either have to rephrase to shorten it or come up with a way that it has a summary and a details block without looking too redundant and silly. :shrug:
It just takes some time.
After the above batch of five commits get merged we can throw the switch on D205! Locally with all of those applied, I show zero D205 violations in the codebase.
Starting on D401. There are over 2000 violations on that one so it'll be the same many-small-bites process.
@ferruzzi was D401 completed?
I guess not yet, at least if I uncomment this line https://github.com/apache/airflow/blob/f24a03709eecbda87ed794cee567806e51c3a21f/pyproject.toml#L1276
And run check manually I've got an error:
Run 'ruff' for extremely fast Python linting.............................Failed
- hook id: ruff
- exit code: 1
airflow/auth/managers/utils/fab.py:43:5: D401 First line of docstring should be in imperative mood: "Returns the map associating a method to a FAB action."
airflow/auth/managers/utils/fab.py:48:5: D401 First line of docstring should be in imperative mood: "Returns the map associating a FAB action to a method."
... (truncated about 1000 lines)
airflow/www/auth.py:90:5: D401 First line of docstring should be in imperative mood: "This decorator is used to check permissions on views."
airflow/www/auth.py:348:5: D401 First line of docstring should be in imperative mood: "Decorator that checks current user's permissions to access the website."
airflow/www/blueprints.py:27:5: D401 First line of docstring should be in imperative mood: "Main Airflow page."
airflow/www/views.py:4067:9: D401 First line of docstring should be in imperative mood: "Action method to handle multiple records selected from a list view."
Found 1084 errors.
I assume it is pretty hard to implement it until https://github.com/astral-sh/ruff/issues/3172 not implemented.
As solution for that we might enable D401 by default in ruff and add current files/module which not yet follow D401 into the tool.ruff.per-file-ignores
. WDYT?
As solution for that we might enable D401 by default in ruff and add current files/module which not yet follow D401 into the tool.ruff.per-file-ignores. WDYT?
Probably this is the best way to avoid pile up more work.
We use PyDocStyle in pre-commit to enforce docstring styleWe use Ruff to enforce docsting style https://github.com/apache/airflow/blob/23b8e839a35e84d57c5cf38b0b21171ac3bd1ecc/pyproject.toml#L47-L74
We follow pep257 style (http://www.pydocstyle.org/en/stable/error_codes.html) for checks.
Currently, we ignore the following rules:
The task is to complete the following missing rules:
Missing Docstrings
__init__
Whitespace Issues
Docstring Content Issues
It would be good if we can enable them one by one -- separate PRs are ok
Let me know if you need any help