Closed ghost closed 4 months ago
@AFg6K7h4fhy2 are you up for doing this?
@AFg6K7h4fhy2 are you up for doing this?
Yes, sounds good.
Situation is the following:
numpydoc
numpydoc
to the ./model
poetry environment.pre-commit-config.yaml
in ./
- repo: https://github.com/numpy/numpydoc
rev: v1.6.0
hooks:
- id: numpydoc-validation
which works:
__init__.py
, but the exclude/ignore system is what I am having trouble with. The numpydoc
pre-commit link affords this starter for the pyproject.toml
file[tool.numpydoc_validation]
checks = [
"EX01",
"SA01",
"ES01",
]
exclude = [ # don't report on objects that match any of these regex
'\.undocumented_method$',
'\.__repr__$',
'\.__init__$',
]
override_SS05 = [ # override SS05 to allow docstrings starting with these words
'^Process ',
'^Assess ',
'^Access ',
]
pyproject.toml
files in both pipeline
and model
. I then added '\.__init__$',
to the exclude list in both .toml
files. I then activated the shell environments and ran pre-commit run --all-files
to see if the added directives worked. They did not seem to work in the sense I got the same numpydoc-validation
pre-commit flags. I will keep exploring getting the exclusion directives correctly before attempting to merge the branches. I will also try to address the numpydoc
suggested changes. Alternatively, I can comment out the numpydoc
section in the .pre-commit-config.yaml
file and then merge, so that @gvegayoncdc can manipulate which flags to exclude (this seems less likely to produce wasted effort).
@gvegayoncdc @dylanhmorris
This will be a nice tool, but as I said earlier, it is heavy lifting! I suggest giving priority to the sphinx website before working more in this.
While using numpydocs
in MSR is bottlenecked by 47, I tried playing around with numpydocs
in my sandbox, but was unable to properly ignore some checks, despite trying for some time. Of course, in a mature pipeline, I expect to configure numpydocs
deliberately, instead of catch-all ignoring, but I still want to understand how to do this, without using a toml
configuration file. For visibility, I will detail the process here.
See the following links for more details:
The last link provides an answer, though I expect there is still a way to do what I want without having to use configuration in my pyproject.toml
file.
The correct pre-commit
specification (as of 2024-04-12) is this:
- repo: https://github.com/numpy/numpydoc
rev: v1.7.0
hooks:
- id: numpydoc-validation
which returns the following in my sandbox (the last few lines)
Note also that for help one can run
python3 -m numpydoc.hooks.validate_docstrings --help
I am interested in the ignoring clause of this documentation:
I have tried the following (each args
constituted a different attempt) under id
in the pre-commit
specification, yet only the first pattern works (element 1 and 2).
args = ['--ignore=GL08']
args = ['--ignore=EX01']
args = ['--ignore=GL08,EX01']
args = ['--ignore=GL08', '--ignore=EX01']
args = ['--ignore', 'GL08', '--ignore', 'EX01']
This was the pre-commit
specification:
In
- repo: https://github.com/numpy/numpydoc
rev: v1.7.0
hooks:
- id: numpydoc-validation
args: ['--ignore=GL08']
So, how does one ignore multiple checks in the above specification?
Anyhow, this is what one can do within their toml
file to exclude, EX01
, GL08
, SA01
.
[tool.numpydoc_validation]
checks = [
"all",
"EX01",
"GL08",
"SA01"
]
There is no urgency or real need with answering how to ignore checks within the pre-commit specification, I am simply curious, especially since it seems so easy with other pre-commit hooks such as with ruff
, e.g.
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.3.6
hooks:
- id: ruff
args: ['--ignore=E741', '--ignore=E731']
Have you asked AI to help you with that?
Have you asked AI to help you with that?
Yes, after trying the above, I did use the CDC AI tool, but it kept wanting me to create my own pre-commit hook instead. Given numpydocs
low version and (relatively) lower star count, I think this is a situation similar to polars
, i.e. one where there is a package representation dearth in the training data. Thank you for the prompting though.
Regardless, MSR can stick with toml
configuration for numpydocs
.
Pasting this here so we don't have to look at the site as often
ERROR_MSGS = {
"GL01": "Docstring text (summary) should start in the line immediately "
"after the opening quotes (not in the same line, or leaving a "
"blank line in between)",
"GL02": "Closing quotes should be placed in the line after the last text "
"in the docstring (do not close the quotes in the same line as "
"the text, or leave a blank line between the last text and the "
"quotes)",
"GL03": "Double line break found; please use only one blank line to "
"separate sections or paragraphs, and do not leave blank lines "
"at the end of docstrings",
"GL05": 'Tabs found at the start of line "{line_with_tabs}", please use '
"whitespace only",
"GL06": 'Found unknown section "{section}". Allowed sections are: '
"{allowed_sections}",
"GL07": "Sections are in the wrong order. Correct order is: {correct_sections}",
"GL08": "The object does not have a docstring",
"GL09": "Deprecation warning should precede extended summary",
"GL10": "reST directives {directives} must be followed by two colons",
"SS01": "No summary found (a short summary in a single line should be "
"present at the beginning of the docstring)",
"SS02": "Summary does not start with a capital letter",
"SS03": "Summary does not end with a period",
"SS04": "Summary contains heading whitespaces",
"SS05": "Summary must start with infinitive verb, not third person "
'(e.g. use "Generate" instead of "Generates")',
"SS06": "Summary should fit in a single line",
"ES01": "No extended summary found",
"PR01": "Parameters {missing_params} not documented",
"PR02": "Unknown parameters {unknown_params}",
"PR03": "Wrong parameters order. Actual: {actual_params}. "
"Documented: {documented_params}",
"PR04": 'Parameter "{param_name}" has no type',
"PR05": 'Parameter "{param_name}" type should not finish with "."',
"PR06": 'Parameter "{param_name}" type should use "{right_type}" instead '
'of "{wrong_type}"',
"PR07": 'Parameter "{param_name}" has no description',
"PR08": 'Parameter "{param_name}" description should start with a '
"capital letter",
"PR09": 'Parameter "{param_name}" description should finish with "."',
"PR10": 'Parameter "{param_name}" requires a space before the colon '
"separating the parameter name and type",
"RT01": "No Returns section found",
"RT02": "The first line of the Returns section should contain only the "
"type, unless multiple values are being returned",
"RT03": "Return value has no description",
"RT04": "Return value description should start with a capital letter",
"RT05": 'Return value description should finish with "."',
"YD01": "No Yields section found",
"SA01": "See Also section not found",
"SA02": "Missing period at end of description for See Also "
'"{reference_name}" reference',
"SA03": "Description should be capitalized for See Also "
'"{reference_name}" reference',
"SA04": 'Missing description for See Also "{reference_name}" reference',
"EX01": "No examples section found",
}
For which checks to retain, I've decided to include, as a baseline, the following:
ERROR_MSGS = {
"GL03": "Double line break found; please use only one blank line to "
"separate sections or paragraphs, and do not leave blank lines "
"at the end of docstrings",
"GL08": "The object does not have a docstring",
"SS01": "No summary found (a short summary in a single line should be "
"present at the beginning of the docstring)",
"PR01": "Parameters {missing_params} not documented",
"PR02": "Unknown parameters {unknown_params}",
"PR03": "Wrong parameters order. Actual: {actual_params}. "
"Documented: {documented_params}",
"PR04": 'Parameter "{param_name}" has no type',
"PR07": 'Parameter "{param_name}" has no description',
"RT01": "No Returns section found",
}
For functions in which there is no need for such checks (e.g., some validate()
methods), I have manually flagged (e.g. via # numpydoc ignore=EX01,SA01,ES01
) not to have the checks performed.
Goal
To enforce the usage of
numpydoc
for documenting functions, we can leverage a pre-commit hook designed for this: https://numpydoc.readthedocs.io/en/latest/validation.html. This could be a heavy lift!Context
All issues tagged with
documentation
.Specifications