Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import pyarrow as pa

df = pd.DataFrame(
    {
        "a": list(range(10)) * 2,
        "b": list(range(20, 30)) * 2,
        "c": list(range(50, 70)),
    },
    dtype=pd.ArrowDtype(pa.string()),
)
# Raises TypeError: rank is not supported for string[pyarrow] dtype
result = df.groupby(["a", "b"], as_index=False)["c"].rank(
    method="first", ascending=False, na_option="bottom"
)
Issue Description
This is similar to #51996. Grouping a DataFrame and calling rank on a column with dtype string[pyarrow] or large_string[pyarrow] raises one of the following errors, respectively:
TypeError: rank is not supported for string[pyarrow] dtype
TypeError: rank is not supported for large_string[pyarrow] dtype
Expected Behavior
I would expect rank to work for string[pyarrow] just as it works for the pandas "string" dtype, using lexicographic ordering.
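To make the expected result concrete, here is a minimal pure-Python sketch of a grouped, descending rank with lexicographic ordering. It is not pandas code: the helper name `grouped_rank_first_desc` is hypothetical. It also assumes that ties are ranked in order of appearance (my reading of method="first") and that missing values are ranked last (na_option="bottom").

```python
from collections import defaultdict


def grouped_rank_first_desc(keys, values):
    """Rank values within each group in descending lexicographic order.

    Hypothetical helper for illustration only. Ties keep their order of
    appearance and None values are ranked last within their group.
    """
    # Collect the row positions that belong to each group key.
    groups = defaultdict(list)
    for pos, key in enumerate(keys):
        groups[key].append(pos)

    ranks = [0.0] * len(values)
    for positions in groups.values():
        present = [p for p in positions if values[p] is not None]
        missing = [p for p in positions if values[p] is None]
        # sorted() is stable even with reverse=True, so equal strings
        # keep their original relative order.
        ordered = sorted(present, key=lambda p: values[p], reverse=True) + missing
        for rank, pos in enumerate(ordered, start=1):
            ranks[pos] = float(rank)
    return ranks


# Same data as the reproducible example, with every value as a string.
a = [str(i) for i in range(10)] * 2
b = [str(i) for i in range(20, 30)] * 2
c = [str(i) for i in range(50, 70)]

expected = grouped_rank_first_desc(list(zip(a, b)), c)
print(expected)
```

Each (a, b) group holds two rows, so under these assumptions the row with the lexicographically larger "c" gets rank 1.0 and the other gets 2.0.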
Installed Versions