Open JanetMatsen opened 1 month ago
I tried to push a (one-line) fix to this, but don't seem to have permission to push.
(venv) ➜ woodwork git:(issue1872-bugfix_in_describe_percentile) ✗ git push origin issue1872-bugfix_in_describe_percentile
ERROR: Permission to alteryx/woodwork.git denied to JanetMatsen.
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
The problem
There is a .iat missing in the definition of percentile that is used by .ww.describe() (see here).
Reproducible example, using woodwork==0.31.0
This demonstrates the bug:
The error is KeyError: 1 because it's trying to look up index 1 in the 'a' Series. (Here I've set it to 3).
The bug comes from the woodwork definition of percentile for a pandas series:
Run this example by setting series = df['a'] and running percentile() directly.
Result:
Note that in cases when int(k) happens to be a value in the index it will give you an answer, but there is no reason it will actually correspond to the intended position!
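To make the label-versus-position distinction concrete, here is a small sketch using plain pandas. The Series and its index values are made up for illustration; they are not the data from this report.

```python
import pandas as pd

# Hypothetical data: sorted values with an integer index that does not start at 0.
s = pd.Series([10, 20, 30, 40, 50], index=[3, 4, 5, 6, 7])

# Position-based lookup: always returns the element at position 1.
by_position = s.iat[1]  # 20

# Label-based lookup: s[1] searches for the index label 1, which does not exist here.
try:
    s[1]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# When the integer happens to be a label, s[...] silently returns that label's value,
# which need not be the value at the same position.
value_at_label_4 = s[4]         # label 4 -> 20
value_at_position_4 = s.iat[4]  # position 4 -> 50
```

With an integer index, s[k] is always a label lookup, so it either raises KeyError or quietly returns a value from the wrong position.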
If I'd had
then percentile(N=df2['a'], percent=0.25, count=len(series)) gives 5 for the 25th percentile 😱
The fix
The fix is quite simple. Add a .iat like the lines below have.
Now it works:
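As a rough, self-contained illustration of the idea (not the actual woodwork source), a percentile helper that uses positional access everywhere might look like the sketch below. The name percentile_positional and the sample data are assumptions made for this example; the linear-interpolation logic follows the approach described in this report.

```python
import math

import numpy as np
import pandas as pd


def percentile_positional(values, percent, count):
    """Linear-interpolation percentile of a sorted Series, using positions only."""
    if count == 0:
        return np.nan
    k = (count - 1) * percent
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        # k is a whole number: read position k with .iat, not label k.
        return values.iat[int(k)]
    # Otherwise interpolate between the two neighbouring positions.
    return values.iat[f] * (c - k) + values.iat[c] * (k - f)


# Hypothetical sorted data with an integer index that does not start at 0.
s = pd.Series([10, 20, 30, 40, 50], index=[3, 4, 5, 6, 7])

q25 = percentile_positional(s, 0.25, len(s))  # k = 1.0, reads position 1
q50 = percentile_positional(s, 0.50, len(s))  # k = 2.0, reads position 2
q10 = percentile_positional(s, 0.10, len(s))  # k = 0.4, interpolates positions 0 and 1
```

Because every lookup goes through .iat, the result depends only on the order of the values and not on whatever labels the index happens to carry.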