Open llucax opened 10 months ago
Hi @llucax , I think this is a good idea, and I'll implement it.
However, there's one nuance: pydocstyle and subsequently Ruff check the writing style of docstrings, and there will be a D
violation if the initial word of a docstring follows the "third-person singular present tense" rule. For example:
Calculates the average of a list of numbers.
(There is an "s" at the end of "Calculates".)
Instead, pydoclint requires people to write in imperative mood:
Calculate the average of a list of numbers
(Note: there is no "s" at the end of "Calculate".)
Since pydocstyle is very widely used, I think in my implementation, the criterion of omitting the Returns
/Yields
section will be: if the initial word of a docstring is return/Return/yield/Yield (note: no "s").
Yeah, Google allows for both styles:
The docstring may be descriptive-style ("""Fetches rows from a Bigtable.""") or imperative-style ("""Fetch rows from a Bigtable."""), but the style should be consistent within a file.
So I think pydoclint
should also allow for both. If you are using pydocstyle
or ruff
(we are), then you are stuck with only being able to use the "imperative-style", that's, at the end, the user's choice.
This is also why I used Raise(s)/Yield(s)
in the title :)
Again, thinking about interactions with #111 and #112, I suggest:
--allow-missing-return-if-starts-with-return
: If a Returns
section is missing, it won't complain.--allow-missing-yield-if-starts-with-yield
: If a Yields
section is missing, it won't complain.After some thought, I have decided not to pursue the proposed changes.
The main reason is similar to the other two issues (#111 and #112): this kind of ad hoc rules ("allow omitting XXX if YYY") is not very easy to remember, and is almost opaque to code maintainers (except for the actual person who set this kind of configs).
For example, if I implement this "special config" and you set it, in the near future, someone who updates the repo may update the docstring, so that its first sentence doesn't start with "Return". And all of a sudden, pydoclint starts to complain. Unless that person is yourself, it could take that person a very long time to find the root cause.
For example, if I implement this "special config" and you set it, in the near future, someone who updates the repo may update the docstring, so that its first sentence doesn't start with "Return". And all of a sudden, pydoclint starts to complain. Unless that person is yourself, it could take that person a very long time to find the root cause.
I don't agree with this, the reason should be pretty obvious if the repository is following google style. This is not a personal weird use case, is a very widely used style that is supposed to be supported by the tool, right?
So maybe the implementation (the options and semantics) I'm proposing are not the ones you like the most, but I guess properly supporting google style is a goal for this project, or is it not?
I´d be fine to just having one option --google-style
that just adapt the checks to comply with google style, that would even be the easiest option for users, at the cost of losing flexibility to support other custom mixed styles. But that's not my case, I just want to be able to comply with google style, and I was hoping for this project to help me with that :)
In my honest opinion, I don't think the Google style guide (at least the docstring part) is very well designed. It allows too many "ad hoc" exceptions, which can introduce a lot of communication overhead and decrease productivity.
Here is a concrete example.
def func_1() -> int:
"""
Return the result.
"""
pass
def func_2() -> int:
"""
Get the result.
"""
pass
def func_3() -> int:
"""
Asynchronously return the result.
"""
pass
def func_4() -> int:
"""
Wait 5 seconds and then return the result.
"""
pass
If I implement this exception that you are suggesting, func_1
will pass without any violations, but the other functions will all produce violations.
And imagine a team member needs to resolve the CI pipeline issue, but this member is not all that familiar with pydoclint nor the Google Style guide. This person would be so confused and would waste a lot of time, either asking teammates, posting an Issue here, or Google some answers.
This is not a personal weird use case, is a very widely used style that is supposed to be supported by the tool, right?
@llucax : can you find a lot of examples where people omit the return section when the docstring starts with "Return" or "Yield"?
If there are enough examples, I could still (begrudgingly) consider implementing it, even though I still think this could promote coding habits that increases communication overhead (which I outlined above).
In my honest opinion, I don't think the Google style guide (at least the docstring part) is very well designed.
I don't mean to be disrespectful, on the contrary I'm super thankful about this project and the work that you do, but my opinion is that this is not very relevant. Is like saying that you think HTTP is not well designed so you'll implement it differently. That's not helpful, and people won't be able to use your tool.
I think you either should say that Google style is not supported, or you should support it, even the parts that you don't like.
If I implement this exception that you are suggesting,
func_1
will pass without any violations, but the other functions will all produce violations.
This is incorrect, func_2
is also valid, Google style says:
It may also be omitted if the docstring starts with “Return”, “Returns”, “Yield”, or “Yields” [...] and the opening sentence is sufficient to describe the return value.
(emphasis by me)
So using "Returns ..." is optional. Well, func_2
probably should fail because it is missing Returns:
unless unchecked short docstrings is enabled. And omitting Returns:
is optional too, you can still add a Returns:
even if the docstrings starts with Returns ...
, as it says you can omit it only if the opening sentence is sufficient to describe the return value.
func_3
won't pass the check for pydocstyle
BTW, so my feeling is your example is a bit convoluted and probably not very likely in reality, and again at the end, it doesn't matter, if one is following some standard and uses a tool to check for the standard, the tool should just comply with the standard, no matter how inconsistent it might be.
@llucax : can you find a lot of examples where people omit the return section when the docstring starts with "Return" or "Yield"?
Yes, in our we use it (or would like to use it) very often. A typical example is __str__()
and the many dunder methods that don't take any arguments and return something.
If there are enough examples, I could still (begrudgingly) consider implementing it, even though I still think this could promote coding habits that increases communication overhead (which I outlined above).
I would assume most Google open source projects using Python probably use it. Just picking up the first in the list and looking for Returns:
I can find a few examples:
Returns ...
and Returns:
https://github.com/google/jax/blob/f13a795c833e8812c7ee1e27ba0bb1f0bfc26fb6/jax/_src/clusters/cluster.py#L82-L92)(I didn't check if the project really strictly follows google style guide though, I'm just assuming it is because it's a Google project, but on a quick look it seems it is)
The examples you provided are not entirely valid.
In this example, there is no documentation for the input argument either, indicating that this is only a "lazy docstring". (In a "lazy docstring", people don't need to add the Args and Returns section anyways.)
And this example actually does the opposite in proving your point -- it starts with "Returns" but also has a "return" section in the docstring.
I don't believe that a linter such as pydoclint should simply check 100% what the style guide says, especially when the style guide promotes bad coding practice (such as this "Returns/Yields" rule in the Google style guide).
I can leave this issue open for other people to easily find. If a lot of users need this feature, I can implement it.
In this example, there is no documentation for the input argument either, indicating that this is only a "lazy docstring". (In a "lazy docstring", people don't need to add the Args and Returns section anyways.)
The missing Args:
is an error, the missing Returns:
is not, that's allowed by google style. And maybe they made that mistake because they don't have a tool to check for the docstrings :wink:
And this example actually does the opposite in proving your point -- it starts with "Returns" but also has a "return" section in the docstring.
What is my point, this is also valid google style, it says you are allowed to omit the Returns:
in that case, not that you are forced to. Actually you must put the Returns:
too even if the docstring starts with Returns ...
if the summary line is not clear enough about what the return is. But of course the semantics and judging if the summary line is enough or not is very subjective and impossible to check in a linting tool.
I don't believe that a linter such as pydoclint should simply check 100% what the style guide says
Why not? I mean ideally, I agree it might not be practical to do so, as with the example I mention above about checking if a summary is enough or not to avoid the Returns:
, but why wouldn't you check everything that can be checked? Is not the goal of such a tool to make sure the standard is followed as close as possible? If that's not the goal, then what's the goal of a tool that checks for a particular documentation style?
, especially when the style guide promotes bad coding practice (such as this "Returns/Yields" rule in the Google style guide).
Again, this is your personal opinion. For example I don't agree, I think there is a balance and google style hits that balance pretty well, this is why we chose it. Again, with all due respect, I don't think you should judge if a style is good or bad and support it only partially because you personally don't like some of the rules. If the goal of your projects is to help users, you are not doing so if you do this, you are just forcing users to go find alternatives that supports google style properly.
Again, if you don't like google style and say you don't want to support it, that's fine, but IMHO you should be clear about it so users are not mislead about the tool supporting google style.
So, for me it is pretty clear that you don't want to implement this out of personal preference, and I'm fine with that. I only have a couple of final question:
Again, thanks a lot for the great tool and for having the patience to reply to all my comments, I know it must be time consuming :heart:
Suppose we are to implement this logic, here is the behavior I have in mind:
Any other situations will result in a style violation. I don't think pydoclint should just "not check something", because this kind of bad/wrong situation will go undetected:
def my_function(arg1: str) -> str:
"""
Return a string.
Parameters
-----------
arg1 : str
The argument to the function
Returns
--------
float
The value to return
"""
(The docstring starts with "Return", so we skip checking the return section, but the return section has incorrect info, leading to confusion in the code.)
Is this the same as what you have in mind?
If so, any help in submitting PRs from you is very welcome.
I agree if a Returns:
section is present, no matter if it is "optional" (when this option is enabled and the docstring starts with Return ...
) or not, then types should be checked. You don't have to skip checking the return section, you just have allow for the section to not exist at all. So you look for the return section, it is not there? Then if the google style option is off, you fail, if it is on, you succeed. If the section is there, you don't even look at the google style option, you just check it as usual. That would be my approach to the problem to be as compatible with google style as possible.
I believe there was a similar case, where you don't require types to be specified in the docs, but if they are there, they will be checked anyway, is not like you disable all checks.
If so, any help in submitting PRs from you is very welcome.
:+1:
So you look for the return section, it is not there? Then if the google style option is off, you fail, if it is on, you succeed. If the section is there, you don't even look at the google style option, you just check it as usual.
This logic is not that intuitive nor very easy to remember. But it works. Some documentation is definitely needed to explain this logic.
For me it is super clear and simple, is just making the section optional, so I guess it is a subjective matter 🙃
Google coding style says about the
Returns
section:For now this can be achieved by using the --skip-checking-short-docstrings but it is not exactly the same, as by using this option
Args
,Raises
and other sections are also not required, which an lead to a lot of unintended missing documentation.It would be good to have an option to make
pydoclint
explicitly support this exception allowed by the Google style (and make it default to enabled when google style is used).