jsh9 / pydoclint

A Python docstring linter that checks arguments, returns, yields, and raises sections
https://pypi.org/project/pydoclint/
MIT License
146 stars 15 forks source link

Google style: Allow omitting only the `Returns`/`Yields` if summary starts with `Return(s)/Yield(s)` #110

Open llucax opened 10 months ago

llucax commented 10 months ago

Google coding style says about the Returns section:

It may also be omitted if the docstring starts with Returns or Yields (e.g. """Returns row from Bigtable as a tuple of strings.""") and the opening sentence is sufficient to describe the return value.

For now this can be achieved by using the --skip-checking-short-docstrings but it is not exactly the same, as by using this option Args, Raises and other sections are also not required, which an lead to a lot of unintended missing documentation.

It would be good to have an option to make pydoclint explicitly support this exception allowed by the Google style (and make it default to enabled when google style is used).

jsh9 commented 10 months ago

Hi @llucax , I think this is a good idea, and I'll implement it.

However, there's one nuance: pydocstyle and subsequently Ruff check the writing style of docstrings, and there will be a D violation if the initial word of a docstring follows the "third-person singular present tense" rule. For example:

Calculates the average of a list of numbers.

(There is an "s" at the end of "Calculates".)

Instead, pydoclint requires people to write in imperative mood:

Calculate the average of a list of numbers

(Note: there is no "s" at the end of "Calculate".)

Since pydocstyle is very widely used, I think in my implementation, the criterion of omitting the Returns/Yields section will be: if the initial word of a docstring is return/Return/yield/Yield (note: no "s").

llucax commented 10 months ago

Yeah, Google allows for both styles:

The docstring may be descriptive-style ("""Fetches rows from a Bigtable.""") or imperative-style ("""Fetch rows from a Bigtable."""), but the style should be consistent within a file.

So I think pydoclint should also allow for both. If you are using pydocstyle or ruff (we are), then you are stuck with only being able to use the "imperative-style", that's, at the end, the user's choice.

This is also why I used Raise(s)/Yield(s) in the title :)

llucax commented 10 months ago

Again, thinking about interactions with #111 and #112, I suggest:

jsh9 commented 9 months ago

After some thought, I have decided not to pursue the proposed changes.

The main reason is similar to the other two issues (#111 and #112): this kind of ad hoc rules ("allow omitting XXX if YYY") is not very easy to remember, and is almost opaque to code maintainers (except for the actual person who set this kind of configs).

For example, if I implement this "special config" and you set it, in the near future, someone who updates the repo may update the docstring, so that its first sentence doesn't start with "Return". And all of a sudden, pydoclint starts to complain. Unless that person is yourself, it could take that person a very long time to find the root cause.

llucax commented 9 months ago

For example, if I implement this "special config" and you set it, in the near future, someone who updates the repo may update the docstring, so that its first sentence doesn't start with "Return". And all of a sudden, pydoclint starts to complain. Unless that person is yourself, it could take that person a very long time to find the root cause.

I don't agree with this, the reason should be pretty obvious if the repository is following google style. This is not a personal weird use case, is a very widely used style that is supposed to be supported by the tool, right?

So maybe the implementation (the options and semantics) I'm proposing are not the ones you like the most, but I guess properly supporting google style is a goal for this project, or is it not?

I´d be fine to just having one option --google-style that just adapt the checks to comply with google style, that would even be the easiest option for users, at the cost of losing flexibility to support other custom mixed styles. But that's not my case, I just want to be able to comply with google style, and I was hoping for this project to help me with that :)

jsh9 commented 9 months ago

In my honest opinion, I don't think the Google style guide (at least the docstring part) is very well designed. It allows too many "ad hoc" exceptions, which can introduce a lot of communication overhead and decrease productivity.

Here is a concrete example.

def func_1() -> int:
    """
    Return the result.
    """
    pass

def func_2() -> int:
    """
    Get the result. 
    """
    pass

def func_3() -> int:
    """
    Asynchronously return the result. 
    """
    pass

def func_4() -> int:
    """
    Wait 5 seconds and then return the result. 
    """
    pass

If I implement this exception that you are suggesting, func_1 will pass without any violations, but the other functions will all produce violations.

And imagine a team member needs to resolve the CI pipeline issue, but this member is not all that familiar with pydoclint nor the Google Style guide. This person would be so confused and would waste a lot of time, either asking teammates, posting an Issue here, or Google some answers.

jsh9 commented 9 months ago

This is not a personal weird use case, is a very widely used style that is supposed to be supported by the tool, right?

@llucax : can you find a lot of examples where people omit the return section when the docstring starts with "Return" or "Yield"?

If there are enough examples, I could still (begrudgingly) consider implementing it, even though I still think this could promote coding habits that increases communication overhead (which I outlined above).

llucax commented 9 months ago

In my honest opinion, I don't think the Google style guide (at least the docstring part) is very well designed.

I don't mean to be disrespectful, on the contrary I'm super thankful about this project and the work that you do, but my opinion is that this is not very relevant. Is like saying that you think HTTP is not well designed so you'll implement it differently. That's not helpful, and people won't be able to use your tool.

I think you either should say that Google style is not supported, or you should support it, even the parts that you don't like.

llucax commented 9 months ago

If I implement this exception that you are suggesting, func_1 will pass without any violations, but the other functions will all produce violations.

This is incorrect, func_2 is also valid, Google style says:

It may also be omitted if the docstring starts with “Return”, “Returns”, “Yield”, or “Yields” [...] and the opening sentence is sufficient to describe the return value.

(emphasis by me)

So using "Returns ..." is optional. Well, func_2 probably should fail because it is missing Returns: unless unchecked short docstrings is enabled. And omitting Returns: is optional too, you can still add a Returns: even if the docstrings starts with Returns ..., as it says you can omit it only if the opening sentence is sufficient to describe the return value.

func_3 won't pass the check for pydocstyle BTW, so my feeling is your example is a bit convoluted and probably not very likely in reality, and again at the end, it doesn't matter, if one is following some standard and uses a tool to check for the standard, the tool should just comply with the standard, no matter how inconsistent it might be.

llucax commented 9 months ago

@llucax : can you find a lot of examples where people omit the return section when the docstring starts with "Return" or "Yield"?

Yes, in our we use it (or would like to use it) very often. A typical example is __str__() and the many dunder methods that don't take any arguments and return something.

If there are enough examples, I could still (begrudgingly) consider implementing it, even though I still think this could promote coding habits that increases communication overhead (which I outlined above).

I would assume most Google open source projects using Python probably use it. Just picking up the first in the list and looking for Returns: I can find a few examples:

(I didn't check if the project really strictly follows google style guide though, I'm just assuming it is because it's a Google project, but on a quick look it seems it is)

jsh9 commented 9 months ago

The examples you provided are not entirely valid.

In this example, there is no documentation for the input argument either, indicating that this is only a "lazy docstring". (In a "lazy docstring", people don't need to add the Args and Returns section anyways.)

And this example actually does the opposite in proving your point -- it starts with "Returns" but also has a "return" section in the docstring.

I don't believe that a linter such as pydoclint should simply check 100% what the style guide says, especially when the style guide promotes bad coding practice (such as this "Returns/Yields" rule in the Google style guide).

I can leave this issue open for other people to easily find. If a lot of users need this feature, I can implement it.

llucax commented 9 months ago

In this example, there is no documentation for the input argument either, indicating that this is only a "lazy docstring". (In a "lazy docstring", people don't need to add the Args and Returns section anyways.)

The missing Args: is an error, the missing Returns: is not, that's allowed by google style. And maybe they made that mistake because they don't have a tool to check for the docstrings :wink:

And this example actually does the opposite in proving your point -- it starts with "Returns" but also has a "return" section in the docstring.

What is my point, this is also valid google style, it says you are allowed to omit the Returns: in that case, not that you are forced to. Actually you must put the Returns: too even if the docstring starts with Returns ... if the summary line is not clear enough about what the return is. But of course the semantics and judging if the summary line is enough or not is very subjective and impossible to check in a linting tool.

I don't believe that a linter such as pydoclint should simply check 100% what the style guide says

Why not? I mean ideally, I agree it might not be practical to do so, as with the example I mention above about checking if a summary is enough or not to avoid the Returns:, but why wouldn't you check everything that can be checked? Is not the goal of such a tool to make sure the standard is followed as close as possible? If that's not the goal, then what's the goal of a tool that checks for a particular documentation style?

, especially when the style guide promotes bad coding practice (such as this "Returns/Yields" rule in the Google style guide).

Again, this is your personal opinion. For example I don't agree, I think there is a balance and google style hits that balance pretty well, this is why we chose it. Again, with all due respect, I don't think you should judge if a style is good or bad and support it only partially because you personally don't like some of the rules. If the goal of your projects is to help users, you are not doing so if you do this, you are just forcing users to go find alternatives that supports google style properly.

Again, if you don't like google style and say you don't want to support it, that's fine, but IMHO you should be clear about it so users are not mislead about the tool supporting google style.

So, for me it is pretty clear that you don't want to implement this out of personal preference, and I'm fine with that. I only have a couple of final question:

  1. Would accept PRs to implement this or is it something you would reject even if implemented by other people?
  2. Specially if the answer to 1 is you wouldn't accept this change even if done by other people, would you consider at least clarifying in the README/docs that google style is only supported partially and will never be supported in full because you find it promote bad practices?

Again, thanks a lot for the great tool and for having the patience to reply to all my comments, I know it must be time consuming :heart:

jsh9 commented 9 months ago

Suppose we are to implement this logic, here is the behavior I have in mind:

Any other situations will result in a style violation. I don't think pydoclint should just "not check something", because this kind of bad/wrong situation will go undetected:

def my_function(arg1: str) -> str:
    """
    Return a string.

    Parameters
    -----------
    arg1 : str
        The argument to the function

    Returns
    --------
    float
        The value to return
    """

(The docstring starts with "Return", so we skip checking the return section, but the return section has incorrect info, leading to confusion in the code.)

Is this the same as what you have in mind?

If so, any help in submitting PRs from you is very welcome.

llucax commented 9 months ago

I agree if a Returns: section is present, no matter if it is "optional" (when this option is enabled and the docstring starts with Return ...) or not, then types should be checked. You don't have to skip checking the return section, you just have allow for the section to not exist at all. So you look for the return section, it is not there? Then if the google style option is off, you fail, if it is on, you succeed. If the section is there, you don't even look at the google style option, you just check it as usual. That would be my approach to the problem to be as compatible with google style as possible.

I believe there was a similar case, where you don't require types to be specified in the docs, but if they are there, they will be checked anyway, is not like you disable all checks.

If so, any help in submitting PRs from you is very welcome.

:+1:

jsh9 commented 9 months ago

So you look for the return section, it is not there? Then if the google style option is off, you fail, if it is on, you succeed. If the section is there, you don't even look at the google style option, you just check it as usual.

This logic is not that intuitive nor very easy to remember. But it works. Some documentation is definitely needed to explain this logic.

llucax commented 8 months ago

For me it is super clear and simple, is just making the section optional, so I guess it is a subjective matter 🙃