For compatibility reasons, we should accept all of the kind values that Pandas accepts in DataFrame.sort_values, even if we do not actually use them. Right now, doing
import cudf
a = [0,1,2]
b = [-3, 2, 0]
df = cudf.DataFrame()
df["a"] = a
df["b"] = b
df.sort_values(by='b')
uses Quicksort by default. When I try it with a different sorting algorithm, I get:
df.sort_values(by='b', kind='mergesort')
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
/tmp/ipykernel_59179/3536691751.py in <module>
----> 1 df.sort_values(by='b', kind='mergesort')
~/miniconda3/envs/cudf_dev/lib/python3.8/site-packages/cudf/core/dataframe.py in sort_values(self, by, axis, ascending, inplace, kind, na_position, ignore_index)
3907 if kind != "quicksort":
3908 print(kind)
-> 3909 raise NotImplementedError("`kind` not currently implemented.")
3910 if axis != 0:
3911 raise NotImplementedError("`axis` not currently implemented.")
NotImplementedError: `kind` not currently implemented.
Pandas currently allows kind{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’, as found here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
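One possible shape for the requested behavior is sketched below. This is not how cudf is implemented: check_sort_kind and PANDAS_SORT_KINDS are hypothetical names made up for illustration, and warning when a non-default kind is ignored is an assumption about what would be useful, not something pandas or cudf is known to do. The idea is that sort_values would call a check like this instead of raising NotImplementedError whenever kind is not "quicksort".

```python
import warnings

# The `kind` values listed in the pandas DataFrame.sort_values documentation.
PANDAS_SORT_KINDS = ("quicksort", "mergesort", "heapsort", "stable")


def check_sort_kind(kind="quicksort"):
    """Hypothetical helper: accept every pandas `kind`, reject anything else.

    Returns the kind unchanged so the caller can pass it along or ignore it.
    """
    if kind not in PANDAS_SORT_KINDS:
        # Unknown values are still an error (assumed behavior for this sketch).
        raise ValueError(
            f"kind must be one of {PANDAS_SORT_KINDS}, got {kind!r}"
        )
    if kind != "quicksort":
        # Accepted for compatibility, but tell the user it has no effect.
        warnings.warn(
            f"kind={kind!r} is accepted for Pandas compatibility but is ignored",
            UserWarning,
            stacklevel=2,
        )
    return kind
```

With a check like this, df.sort_values(by='b', kind='mergesort') would sort as usual and emit a warning rather than fail. Whether to warn or stay silent is a design choice left open here.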