Open datapythonista opened 5 years ago
I made a pull request #28340 to solve "pandas.Series.astype: Unknown parameters {kwargs}". I'll continue working on more PR02 issues in docstrings.
I made a PR to fix pandas.Series.clip: Unknown parameters {*args, **kwargs}.
I've been working on this today, and it seems some of this is being caused by decorators. For instance, in strings.py, when the @forbid_nonstring_types decorator is commented out on one of the offending methods, it passes without the error. I've also seen this behavior on the @deprecate_kwarg decorator. I'm not sure what the fix should be yet, I'd have to look into it a bit more.
Hi @Nabel0721 . I'm interested in knowing what you find. Thanks
@WuraolaOyewusi So, it looks like the function signature_parameters
in the validation script needs some work. After inspecting one of the offending functions, I found this:
It looks like the getfullargspec
function is missing parts of the signature. I think this is because the signature
function says that it unwraps wrapping chains (layers of decorators), while getfullargspec
does not.
If this is the issue, it would solve some of the errors we have, but likely introduce more as we catch more arguments in other function.
@Nabel0721 . I will be following to learn how this is sorted. Please tag me as you make changes. Have a great day
@WuraolaOyewusi I put together a PR (#28765) that switches the script to use signature
. According to the documentation for it, signature
finds the source function by checking if a function has the __wrapped__
attribute. If it does, it contains the function that was wrapped by the current one. This is automatically set by the @wraps
decorator and other similar functions from the functools
library. So, if we're having issues with a decorator overriding the signature of a function, we need to check if it has the @wraps
decorator inside, or if it's a class-made decorator if it has something like functools.update_wrapper(self, func)
. This is only necessary if we need the signature of the underlying function. For CPython, decorators should probably have the __wrapped__
attribute assigned directly.
Ok Abel. Thanks for letting me know. I will look for the PR and continue to see how the whole process works.
On Fri, 4 Oct 2019, 6:11 PM Nathan Abel, notifications@github.com wrote:
@WuraolaOyewusi https://github.com/WuraolaOyewusi I put together a PR that switches the script to use signature. According to the documentation for it, signature finds the source function by checking if a function has the wrapped attribute. If it does, it contains the function that was wrapped by the current one. This is automatically set by the @wraps decorator and other similar functions from the functools library. So, if we're having issues with a decorator overriding the signature of a function, we need to check if it has the @wraps decorator inside, or if it's a class-made decorator if it has something like functools.update_wrapper(self, func). This is only necessary if we need the signature of the underlying function. For CPython, decorators should probably have the wrapped attribute assigned directly.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pandas-dev/pandas/issues/27976?email_source=notifications&email_token=AKDK7XA47MA5TOHPPY77RQ3QM52L7A5CNFSM4IMQON4KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAMJUCA#issuecomment-538483208, or mute the thread https://github.com/notifications/unsubscribe-auth/AKDK7XAUPDI3QQ6RJEFFLWTQM52L7ANCNFSM4IMQON4A .
@datapythonista Hey I checked out this issue and would like to help. So right now if I look at the list and say see the last entry which is - pandas.tseries.offsets.CBMonthBegin.apply_index: Unknown parameters {i} What exactly do I have to do I went to the website and found this https://pandas.pydata.org/pandas-docs/version/0.25.3/reference/api/pandas.tseries.offsets.CBMonthBegin.apply_index.html Here the parameters seem to be fine
Also for some random example like pandas.core.window.rolling.Rolling.apply: Unknown parameters {*args, *kwargs} I can't seem to understand what it's trying to say. Does it mean it had unknown Parameters earlier and the right parameters are {args, **kwargs}?
Also as far as I could see most of them have parameters already present when I open the website to look at them. Is there an updated list as well?
The link that you're sharing is for an old version of pandas. You should check the latest version at: https://pandas.pydata.org/docs/dev/
There are two parameters for any function:
def my_function(actual_parameter):
"""
Does something.
Parameters
----------
unknown_parameter : str
This is unknown, and actual_parameter is missing, there are two incorrect things here.
"""
pass
So, an unknown parameter as you can see in the example is a documented parameter that we don't know about, because it's not an actual parameter in the signature of the function.
ohkk understood thanks @datapythonista
@datapythonista Hi, I've checked out pandas.DataFrame.sparse.from_spmatrix: Unknown parameters {index, columns}
. I had a look at the code I found this:
@classmethod
def from_spmatrix(cls, data, index=None, columns=None):
"""
Create a new DataFrame from a scipy sparse matrix.
.. versionadded:: 0.25.0
Parameters
----------
data : scipy.sparse.spmatrix
Must be convertible to csc format.
index, columns : Index, optional
Row and column labels to use for the resulting DataFrame.
Defaults to a RangeIndex.
Returns
-------
DataFrame
...
etc
"""
Should I assume this is wrong and change it to a separate row for index and columns? Or is this correct and ./scripts/validate_docstrings.py --errors=PR02
should have passed this docstring?
Ideally we would like to allow document two columns at a time. So, updating the script so index, columns
is accepted and doesn't raise an error would be the preferred option. I'm not sure how often we are using that format of two columns at once. If it's very rarely, unpacking them and documenting them separately would also be fine.
pandas.DataFrame.plot.pie
requires one of either a column reference y
, or subplots=True
, calling it without returns a ValueError. However, neither appears in the parameters of its function signature def pie(self, **kwargs):
.
This requirement is mentioned in the extended summary but from the pandas docstring guide,
The extended summary provides details on what the function does. It should not go into the details of the parameters, or discuss implementation notes, which go in other sections.
How should the docstring be modified in this case or is this an issue with the validation script?
Since the signature contains only **kwargs
, so this is the only parameter that can be documented. In the description of the parameter you can explain about those accepted kwargs
.
I think in this case it could make sense to change the signature of the function to something like pie(self, y=None, subplots=True, **kwargs)
. Maybe there is a reason why it's not like this, but I think that should work and be more explicit. If you want to open a PR for that, it would be great (or if you prefer to open an issue, that would be good too).
take
When a user runs the command python ./scripts/validate_docstrings.py --errors=PR02 in a terminal they are presented with a list of errors with the docstrings for functions, such as what is shown in the errors.txt file (I created). In the errors' list, many errors point to the same line of code, such as /home/pandas/pandas/core/indexes/extension.py:92. However, when you look at the line in the file, line 92 def method(self, *args, **kwargs) it seems to just be defining a function within another wrapper function. Are these errors supposed to be at the same location or should the validate doc-strings only check the wrapper function's doc-string and not the inner function?
We did our best to point to the docstring, but many docstrings are build in a dynamic way, and it's not possible to know where the actual docstring content is defined. This seems to be the case here, so just ignore that information. You can better use grep with some sentence of the docstring you're looking for to see where it's defined.
hi @datapythonista, should this issue be closed in favor of https://github.com/pandas-dev/pandas/issues/58063, which is more up-to-date and clearer instruction about how to fix docstring errors?
Fix the docstrings where an unknown parameter is documented (likely that they are typos, errors in the parameter format...).
Current errors: