rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] get_group raises with length-1 tuple when grouping by length-1 list #17187

Closed MarcoGorelli closed 1 month ago

MarcoGorelli commented 1 month ago

Describe the bug

Calling get_group with a length-1 tuple on a groupby over a length-1 list raises a KeyError.

Steps/Code to reproduce bug

import cudf
df = cudf.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
df.groupby(['a']).get_group((1,))

This raises:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 3
      1 import cudf
      2 df =cudf.DataFrame({'a': [1,2,3], 'b': [4,5,6]})
----> 3 df.groupby(['a']).get_group((1,))

File /opt/conda/lib/python3.10/site-packages/cudf/utils/performance_tracking.py:51, in _performance_tracking.<locals>.wrapper(*args, **kwargs)
     43 if nvtx.enabled():
     44     stack.enter_context(
     45         nvtx.annotate(
     46             message=func.__qualname__,
   (...)
     49         )
     50     )
---> 51 return func(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/cudf/core/groupby/groupby.py:481, in GroupBy.get_group(self, name, obj)
    474 else:
    475     warnings.warn(
    476         "obj is deprecated and will be removed in a future version. "
    477         "Use ``df.iloc[gb.indices.get(name)]`` "
    478         "instead of ``gb.get_group(name, obj=df)``.",
    479         FutureWarning,
    480     )
--> 481 return obj.iloc[self.indices[name]]

KeyError: (1,)

Expected behavior

   a  b
0  1  4

This is what pandas returns.

In fact, when grouping by a length-1 list, pandas issues a deprecation warning if you call .get_group(1), advising you to use .get_group((1,)) instead.
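To illustrate the expected key handling without needing a GPU, here is a minimal pure-Python sketch. It assumes, based on the KeyError above, that the groupby's indices mapping is keyed by scalars when grouping by a single column. The helper name normalize_group_key and the plain dict standing in for GroupBy.indices are hypothetical, and this is not the actual fix applied in cudf.

```python
def normalize_group_key(name, n_keys):
    """Hypothetical helper: unwrap a length-1 tuple when grouping by one key."""
    if n_keys == 1 and isinstance(name, tuple) and len(name) == 1:
        return name[0]
    return name


# Stand-in for GroupBy.indices after df.groupby(['a']):
# assumed to map scalar group keys to row positions.
indices = {1: [0], 2: [1], 3: [2]}

# Looking up the tuple directly fails, mirroring the reported KeyError.
try:
    indices[(1,)]
    raw_lookup_failed = False
except KeyError:
    raw_lookup_failed = True

# Normalizing the key first lets both spellings resolve to the same rows.
rows_from_tuple = indices[normalize_group_key((1,), n_keys=1)]
rows_from_scalar = indices[normalize_group_key(1, n_keys=1)]
```

With multiple grouping keys the tuple is left untouched, so a key such as (1, 4) would still be looked up as-is.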

Spotted in narwhals: https://github.com/narwhals-dev/narwhals/pull/1259

Environment details

cudf version '24.10.01'


vyasr commented 1 month ago

Thanks for reporting this Marco!