pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

DOC: Enforce Numpy Docstring Validation | pandas.ExcelFile through pandas.HDFStore #58067

Open jordan-d-murphy opened 3 months ago

jordan-d-murphy commented 3 months ago

DOC: Enforce Numpy Docstring Validation (Parent Issue) #58063

Pandas has a script for validating docstrings in code_checks.sh. Currently, some methods fail some of these checks.

pandas.ExcelFile through pandas.HDFStore

https://github.com/pandas-dev/pandas/blob/c468028f5c2398c04d355cef7a8b6a3952620de2/ci/code_checks.sh#L168-L181

The task is:

  1. take 1-5 methods

  2. run: scripts/validate_docstrings.py --format=actions <method-name>

example command: scripts/validate_docstrings.py --format=actions pandas.Categorical.__array__

example output:

################################################################################
################################## Validation ##################################
################################################################################

2 Errors found for `pandas.Categorical.__array__`:
    ES01    No extended summary found
    SA01    See Also section not found
  3. check whether docstring validation passes for those methods and, if necessary, fix the docstrings according to whatever errors are reported. Note: We've chosen to ignore ES01 errors; these are not required to be fixed.

  4. remove those methods from code_checks.sh if all errors are cleared and the docstring is correct; otherwise, remove only the specific error that was fixed from the list of errors for that method.

  5. commit, push, open pull request
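
As a rough illustration of the kind of docstring these checks look for, here is a minimal sketch of a NumPy-style docstring with Parameters, Returns, and See Also sections. The function `clip_values` is hypothetical and is not part of pandas; the mapping of codes in the comments (PR01 for undocumented parameters, RT03 for a return value without a description, SA01 for a missing See Also section) reflects my understanding of the validator and should be checked against its output.

```python
def clip_values(values, lower, upper):
    """
    Limit each value to the interval [lower, upper].

    Parameters
    ----------
    values : list of int or float
        Values to clip. Documenting every parameter addresses PR01.
    lower : int or float
        Smallest value allowed in the result.
    upper : int or float
        Largest value allowed in the result.

    Returns
    -------
    list of int or float
        New list with every value limited to the interval. Describing
        the return value addresses RT03.

    See Also
    --------
    numpy.clip : Equivalent operation for arrays (addresses SA01).

    Examples
    --------
    >>> clip_values([1, 5, 10], 2, 8)
    [2, 5, 8]
    """
    # Hypothetical helper used only to carry the docstring above.
    return [min(max(value, lower), upper) for value in values]
```

For a real pandas method, the same sections would be filled in and then re-checked with the validate_docstrings.py command shown above.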

Please don't comment "take", as multiple people can work on this issue. You also don't need to ask for permission to work on this; just comment with which methods you are going to work on : )

If you're a new contributor, please check the contributing guide.

thanks @datapythonista for the inspiration for this issue!

tuhinsharma121 commented 2 months ago

I am sorry for commenting take. I have unassigned myself. I am working on the following

 -i "pandas.ExcelFile PR01,SA01" \ 
 -i "pandas.ExcelFile.parse PR01,SA01" \ 
 -i "pandas.ExcelWriter SA01" 
tuhinsharma121 commented 2 months ago

I am working on the following now

 -i "pandas.Float32Dtype SA01" \ 
 -i "pandas.Float64Dtype SA01" \ 
tuhinsharma121 commented 2 months ago

I am working on the following

 -i "pandas.Grouper PR02,SA01" \ 
tuhinsharma121 commented 2 months ago

Working on the following

 -i "pandas.HDFStore.append PR01,SA01" \ 
 -i "pandas.HDFStore.get SA01" \ 
tuhinsharma121 commented 2 months ago

Working on the following

 -i "pandas.HDFStore.groups SA01" \ 
 -i "pandas.HDFStore.info RT03,SA01" \ 
 -i "pandas.HDFStore.keys SA01" \ 
tuhinsharma121 commented 2 months ago

Working on the following

 -i "pandas.HDFStore.put PR01,SA01" \ 
 -i "pandas.HDFStore.select SA01" \ 
 -i "pandas.HDFStore.walk SA01" \ 
tuhinsharma121 commented 2 months ago

@mroeschke Only -i "pandas.Grouper PR02" remains in this issue. I don't see how it can be solved. xref - https://github.com/pandas-dev/pandas/pull/58273/files/9e565132a73fcf365aed3435138ab25a05da8b9e#r1567818846

mroeschke commented 2 months ago

I think the appropriate change would be to make the Grouper signature not accept args and kwargs, and have permanent arguments instead
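
A minimal sketch of why this would help, assuming PR02 is raised when a docstring documents parameter names that do not appear in the signature. `LooseGrouper`, `ExplicitGrouper`, and `names_missing_from_signature` are hypothetical stand-ins, not the real pandas Grouper or the validator's own logic; the parameter names `key` and `level` are used only for illustration.

```python
import inspect


class LooseGrouper:
    """Hypothetical stand-in whose parameters hide behind *args/**kwargs."""

    def __init__(self, *args, **kwargs):
        self.key = kwargs.get("key", args[0] if args else None)
        self.level = kwargs.get("level")


class ExplicitGrouper:
    """Hypothetical stand-in with explicit, permanent arguments."""

    def __init__(self, key=None, level=None):
        self.key = key
        self.level = level


def names_missing_from_signature(cls, documented):
    """Return documented names that are absent from cls.__init__'s signature."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in documented if name not in params]


documented = ["key", "level"]
# The signature only exposes args/kwargs, so both documented names are unknown.
print(names_missing_from_signature(LooseGrouper, documented))
# The signature names match the documented names, so nothing is reported.
print(names_missing_from_signature(ExplicitGrouper, documented))
```

With explicit arguments, the documented names and the signature line up, which is the condition a PR02-style check needs.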

tuhinsharma121 commented 2 months ago

Makes sense. Let me work on that. I created an issue to track this: #58388