rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] Unexpected errors for groupby agg on empty dask-cudf dataframe #14200

Closed charlesbluca closed 11 months ago

charlesbluca commented 1 year ago

Describe the bug

I'm encountering unexpected DataErrors and RuntimeErrors when attempting groupby aggregations on an empty dask-cudf dataframe; I did not encounter these with 23.08. It's not immediately obvious whether this change was intentional, as these seem like groupby aggs that should be fairly trivial.

Steps/Code to reproduce bug

import cudf
import dask_cudf

df = cudf.DataFrame(
    columns=['a', 'b', 'c'],
    dtype={
        'a': 'int64',
        'b': 'datetime64[ns]',
        'c': 'object'})

ddf = dask_cudf.from_cudf(df, npartitions=1)

ddf.groupby(['a']).agg({'b': 'count'}).compute()  # RuntimeError: CUDF failure at: /opt/conda/conda-bld/work/cpp/src/groupby/groupby.cu:166: Invalid type/aggregation combination.
ddf.groupby(['a']).agg({'c': 'count'}).compute()  # DataError: All requested aggregations are unsupported

Expected behavior

I would expect the same behavior as with dask-cudf 23.08, which in this case returned the respective empty dataframe results for the above groupby aggs.
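For reference, here is a CPU-only sketch of what an "empty dataframe result" looks like for the same two aggregations. It uses pandas as a stand-in, which is an assumption on my part: it illustrates the general shape of the expected result and is not a claim that the dask-cudf 23.08 output matches it exactly (for example in index or column dtypes).

```python
import pandas as pd

# Assumption: pandas is used here only as a CPU stand-in for cudf/dask-cudf.
# Build an empty frame with the same column names and dtypes as the reproducer.
df = pd.DataFrame(
    {
        "a": pd.Series([], dtype="int64"),
        "b": pd.Series([], dtype="datetime64[ns]"),
        "c": pd.Series([], dtype="object"),
    }
)

# The same two aggregations as in the reproducer, run on the empty frame.
count_b = df.groupby(["a"]).agg({"b": "count"})
count_c = df.groupby(["a"]).agg({"c": "count"})

# Both calls complete without raising and produce zero-row results.
print(len(count_b), list(count_b.columns))
print(len(count_c), list(count_c.columns))
```

Each result has zero rows and a single column named after the aggregated column.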

Environment details

     ***git***
print_env.sh: 11: [: true: unexpected operator
     Not inside a git repository

     ***OS Information***
     DGX_NAME="DGX Server"
     DGX_PRETTY_NAME="NVIDIA DGX Server"
     DGX_SWBUILD_DATE="2023-03-27-13-31-04"
     DGX_SWBUILD_VERSION="5.5.0"
     DGX_COMMIT_ID="b2e06e0"
     DGX_PLATFORM="DGX Server for DGX-1"
     DGX_SERIAL_NUMBER="QTFCOU8220020"
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=20.04
     DISTRIB_CODENAME=focal
     DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
     NAME="Ubuntu"
     VERSION="20.04.6 LTS (Focal Fossa)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 20.04.6 LTS"
     VERSION_ID="20.04"
     HOME_URL="https://www.ubuntu.com/"
     SUPPORT_URL="https://help.ubuntu.com/"
     BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
     PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
     VERSION_CODENAME=focal
     UBUNTU_CODENAME=focal
     Linux dgx13 5.4.0-159-generic #176-Ubuntu SMP Mon Aug 14 12:04:20 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

     ***GPU Information***
     Tue Sep 26 11:13:57 2023
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |                               |                      |               MIG M. |
     |===============================+======================+======================|
     |   0  Tesla V100-SXM2...  On   | 00000000:06:00.0 Off |                    0 |
     | N/A   32C    P0    42W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   1  Tesla V100-SXM2...  On   | 00000000:07:00.0 Off |                    0 |
     | N/A   30C    P0    41W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   2  Tesla V100-SXM2...  On   | 00000000:0A:00.0 Off |                    0 |
     | N/A   30C    P0    41W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   3  Tesla V100-SXM2...  On   | 00000000:0B:00.0 Off |                    0 |
     | N/A   28C    P0    41W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   4  Tesla V100-SXM2...  On   | 00000000:85:00.0 Off |                    0 |
     | N/A   31C    P0    42W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   5  Tesla V100-SXM2...  On   | 00000000:86:00.0 Off |                    0 |
     | N/A   29C    P0    41W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   6  Tesla V100-SXM2...  On   | 00000000:89:00.0 Off |                    0 |
     | N/A   33C    P0    42W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+
     |   7  Tesla V100-SXM2...  On   | 00000000:8A:00.0 Off |                    0 |
     | N/A   29C    P0    40W / 300W |      0MiB / 32768MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+

     +-----------------------------------------------------------------------------+
     | Processes:                                                                  |
     |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
     |        ID   ID                                                   Usage      |
     |=============================================================================|
     |  No running processes found                                                 |
     +-----------------------------------------------------------------------------+

     ***CPU***
     Architecture:                       x86_64
     CPU op-mode(s):                     32-bit, 64-bit
     Byte Order:                         Little Endian
     Address sizes:                      46 bits physical, 48 bits virtual
     CPU(s):                             80
     On-line CPU(s) list:                0-79
     Thread(s) per core:                 2
     Core(s) per socket:                 20
     Socket(s):                          2
     NUMA node(s):                       2
     Vendor ID:                          GenuineIntel
     CPU family:                         6
     Model:                              79
     Model name:                         Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
     Stepping:                           1
     CPU MHz:                            3338.660
     CPU max MHz:                        3600.0000
     CPU min MHz:                        1200.0000
     BogoMIPS:                           4389.94
     Virtualization:                     VT-x
     L1d cache:                          1.3 MiB
     L1i cache:                          1.3 MiB
     L2 cache:                           10 MiB
     L3 cache:                           100 MiB
     NUMA node0 CPU(s):                  0-19,40-59
     NUMA node1 CPU(s):                  20-39,60-79
     Vulnerability Gather data sampling: Not affected
     Vulnerability Itlb multihit:        KVM: Mitigation: Split huge pages
     Vulnerability L1tf:                 Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
     Vulnerability Mds:                  Mitigation; Clear CPU buffers; SMT vulnerable
     Vulnerability Meltdown:             Mitigation; PTI
     Vulnerability Mmio stale data:      Mitigation; Clear CPU buffers; SMT vulnerable
     Vulnerability Retbleed:             Not affected
     Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl and seccomp
     Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
     Vulnerability Spectre v2:           Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
     Vulnerability Srbds:                Not affected
     Vulnerability Tsx async abort:      Mitigation; Clear CPU buffers; SMT vulnerable
     Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d

     ***CMake***
     /usr/bin/cmake
     cmake version 3.16.3

     CMake suite maintained and supported by Kitware (kitware.com/cmake).

     ***g++***
     /usr/bin/g++
     g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
     Copyright (C) 2019 Free Software Foundation, Inc.
     This is free software; see the source for copying conditions.  There is NO
     warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

     ***nvcc***
     /usr/local/cuda/bin/nvcc
     nvcc: NVIDIA (R) Cuda compiler driver
     Copyright (c) 2005-2022 NVIDIA Corporation
     Built on Wed_Sep_21_10:33:58_PDT_2022
     Cuda compilation tools, release 11.8, V11.8.89
     Build cuda_11.8.r11.8/compiler.31833905_0

     ***Python***
     /datasets/charlesb/micromamba/envs/dask-sql-gpuci-py39-1/bin/python
     Python 3.9.18

     ***Environment Variables***
     PATH                            : /datasets/charlesb/micromamba/envs/dask-sql-gpuci-py39-1/bin:/home/nfs/charlesb/.local/bin:/home/nfs/charlesb/.vscode-server/bin/abd2f3db4bdb28f9e95536dfa84d8479f1eb312d/bin/remote-cli:/datasets/charlesb/micromamba/condabin:/home/nfs/charlesb/.local/bin:/usr/local/cuda/bin:/opt/bin/:/home/nfs/charlesb/.cargo/bin:/usr/local/cuda/bin:/opt/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin
     LD_LIBRARY_PATH                 :
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    : /datasets/charlesb/micromamba/envs/dask-sql-gpuci-py39-1
     PYTHON_PATH                     :

     ***conda packages***
     conda is /datasets/charlesb/micromamba/condabin/conda
     /datasets/charlesb/micromamba/condabin/conda
     # packages in environment at /datasets/charlesb/micromamba/envs/dask-sql-gpuci-py39-1:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                 conda_forge    conda-forge
     _openmp_mutex             4.5                       2_gnu    conda-forge
     adagio                    0.2.4              pyhd8ed1ab_0    conda-forge
     alabaster                 0.7.13             pyhd8ed1ab_0    conda-forge
     alembic                   1.12.0             pyhd8ed1ab_0    conda-forge
     annotated-types           0.5.0              pyhd8ed1ab_0    conda-forge
     antlr-python-runtime      4.11.1             pyhd8ed1ab_0    conda-forge
     antlr4-python3-runtime    4.11.1             pyh1a96a4e_0    conda-forge
     anyio                     3.7.1              pyhd8ed1ab_0    conda-forge
     appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
     asttokens                 2.4.0              pyhd8ed1ab_0    conda-forge
     attrs                     23.1.0             pyh71513ae_1    conda-forge
     aws-c-auth                0.7.3                he2921ad_3    conda-forge
     aws-c-cal                 0.6.2                hc309b26_1    conda-forge
     aws-c-common              0.9.0                hd590300_0    conda-forge
     aws-c-compression         0.2.17               h4d4d85c_2    conda-forge
     aws-c-event-stream        0.3.2                h2e3709c_0    conda-forge
     aws-c-http                0.7.12               hc865f51_1    conda-forge
     aws-c-io                  0.13.32              h1a03231_3    conda-forge
     aws-c-mqtt                0.9.6                h3a0376c_0    conda-forge
     aws-c-s3                  0.3.17               h1678ad6_0    conda-forge
     aws-c-sdkutils            0.1.12               h4d4d85c_1    conda-forge
     aws-checksums             0.1.17               h4d4d85c_1    conda-forge
     aws-crt-cpp               0.23.1               hf7d0843_2    conda-forge
     aws-sdk-cpp               1.11.156             he6c2984_2    conda-forge
     babel                     2.12.1             pyhd8ed1ab_1    conda-forge
     backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
     backports                 1.0                pyhd8ed1ab_3    conda-forge
     backports.functools_lru_cache 1.6.5              pyhd8ed1ab_0    conda-forge
     bcrypt                    4.0.1            py39h9fdd4d6_1    conda-forge
     binutils                  2.40                 hdd6e379_0    conda-forge
     binutils_impl_linux-64    2.40                 hf600244_0    conda-forge
     binutils_linux-64         2.40                 hbdbef99_2    conda-forge
     blinker                   1.6.2              pyhd8ed1ab_0    conda-forge
     bokeh                     3.2.2              pyhd8ed1ab_0    conda-forge
     brotli                    1.1.0                hd590300_0    conda-forge
     brotli-bin                1.1.0                hd590300_0    conda-forge
     brotli-python             1.1.0            py39h3d6467e_0    conda-forge
     bzip2                     1.0.8                h7f98852_4    conda-forge
     c-ares                    1.19.1               hd590300_0    conda-forge
     c-compiler                1.6.0                hd590300_0    conda-forge
     ca-certificates           2023.7.22            hbcca054_0    conda-forge
     cachetools                5.3.1              pyhd8ed1ab_0    conda-forge
     certifi                   2023.7.22          pyhd8ed1ab_0    conda-forge
     cffi                      1.15.1           py39h7a31438_5    conda-forge
     cfgv                      3.3.1              pyhd8ed1ab_0    conda-forge
     charset-normalizer        3.2.0              pyhd8ed1ab_0    conda-forge
     click                     8.1.7           unix_pyh707e725_0    conda-forge
     cloudpickle               2.2.1              pyhd8ed1ab_0    conda-forge
     colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
     configparser              5.3.0              pyhd8ed1ab_0    conda-forge
     contourpy                 1.1.1            py39h7633fee_1    conda-forge
     coverage                  7.3.1            py39hd1e30aa_1    conda-forge
     cryptography              41.0.4           py39hd4f0224_0    conda-forge
     cubinlinker               0.3.0            py39hac6bf05_0    rapidsai
     cuda-profiler-api         11.8.86                       0    nvidia
     cuda-python               11.8.2           py39h2405124_0    conda-forge
     cuda-version              11.5                 h6c6c5af_2    conda-forge
     cudatoolkit               11.5.2              hbdc67f6_12    conda-forge
     cudf                      23.08.00        cuda11_py39_230809_g8150d38e08_0    rapidsai
     cuml                      23.08.00        cuda11_py39_230809_gd7162cdea_0    rapidsai
     cupy                      12.2.0           py39he1ee4c9_1    conda-forge
     cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
     cytoolz                   0.12.2           py39hd1e30aa_1    conda-forge
     dask                      2023.7.1           pyhd8ed1ab_0    conda-forge
     dask-core                 2023.7.1           pyhd8ed1ab_0    conda-forge
     dask-cuda                 23.08.00        py39_230809_gefbd6ca_0    rapidsai
     dask-cudf                 23.08.00        cuda11_py39_230809_g8150d38e08_0    rapidsai
     databricks-cli            0.17.7             pyhd8ed1ab_0    conda-forge
     deap                      1.4.1            py39hddac248_1    conda-forge
     decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
     distlib                   0.3.7              pyhd8ed1ab_0    conda-forge
     distributed               2023.7.1           pyhd8ed1ab_0    conda-forge
     dlpack                    0.5                  h9c3ff4c_0    conda-forge
     docker-py                 6.1.3              pyhd8ed1ab_0    conda-forge
     docutils                  0.20.1           py39hf3d152e_2    conda-forge
     entrypoints               0.4                pyhd8ed1ab_0    conda-forge
     exceptiongroup            1.1.3              pyhd8ed1ab_0    conda-forge
     execnet                   2.0.2              pyhd8ed1ab_0    conda-forge
     executing                 1.2.0              pyhd8ed1ab_0    conda-forge
     fastapi                   0.103.1            pyhd8ed1ab_0    conda-forge
     fastrlock                 0.8.2            py39h3d6467e_1    conda-forge
     filelock                  3.12.4             pyhd8ed1ab_0    conda-forge
     flask                     2.3.3              pyhd8ed1ab_0    conda-forge
     fmt                       9.1.0                h924138e_0    conda-forge
     fonttools                 4.42.1           py39hd1e30aa_0    conda-forge
     freetype                  2.12.1               h267a509_2    conda-forge
     fs                        2.4.16             pyhd8ed1ab_0    conda-forge
     fsspec                    2023.9.2           pyh1a96a4e_0    conda-forge
     fugue                     0.8.6              pyhd8ed1ab_0    conda-forge
     fugue-sql-antlr           0.1.7              pyhd8ed1ab_0    conda-forge
     future                    0.18.3             pyhd8ed1ab_0    conda-forge
     gcc                       12.3.0               h8d2909c_2    conda-forge
     gcc_impl_linux-64         12.3.0               he2b93b0_2    conda-forge
     gcc_linux-64              12.3.0               h76fc315_2    conda-forge
     gflags                    2.2.2             he1b5a44_1004    conda-forge
     gitdb                     4.0.10             pyhd8ed1ab_0    conda-forge
     gitpython                 3.1.37             pyhd8ed1ab_0    conda-forge
     glog                      0.6.0                h6f12383_0    conda-forge
     gmock                     1.14.0               ha770c72_1    conda-forge
     greenlet                  2.0.2            py39h3d6467e_1    conda-forge
     gtest                     1.14.0               h00ab1b0_1    conda-forge
     gunicorn                  20.1.0           py39hf3d152e_3    conda-forge
     h11                       0.14.0             pyhd8ed1ab_0    conda-forge
     h2                        4.1.0              pyhd8ed1ab_0    conda-forge
     hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
     httpcore                  0.18.0             pyhd8ed1ab_0    conda-forge
     httpx                     0.25.0             pyhd8ed1ab_0    conda-forge
     hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
     identify                  2.5.29             pyhd8ed1ab_0    conda-forge
     idna                      3.4                pyhd8ed1ab_0    conda-forge
     imagesize                 1.4.1              pyhd8ed1ab_0    conda-forge
     importlib-metadata        6.8.0              pyha770c72_0    conda-forge
     importlib-resources       6.1.0              pyhd8ed1ab_0    conda-forge
     importlib_metadata        6.8.0                hd8ed1ab_0    conda-forge
     importlib_resources       6.1.0              pyhd8ed1ab_0    conda-forge
     iniconfig                 2.0.0              pyhd8ed1ab_0    conda-forge
     intake                    0.7.0              pyhd8ed1ab_0    conda-forge
     ipython                   8.15.0             pyh0d859eb_0    conda-forge
     itsdangerous              2.1.2              pyhd8ed1ab_0    conda-forge
     jedi                      0.19.0             pyhd8ed1ab_0    conda-forge
     jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
     joblib                    1.3.2              pyhd8ed1ab_0    conda-forge
     jsonschema                4.19.1             pyhd8ed1ab_0    conda-forge
     jsonschema-specifications 2023.7.1           pyhd8ed1ab_0    conda-forge
     kernel-headers_linux-64   2.6.32              he073ed8_16    conda-forge
     keyutils                  1.6.1                h166bdaf_0    conda-forge
     kiwisolver                1.4.5            py39h7633fee_1    conda-forge
     krb5                      1.21.2               h659d440_0    conda-forge
     lcms2                     2.15                 h7f713cb_2    conda-forge
     ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
     lerc                      4.0.0                h27087fc_0    conda-forge
     libabseil                 20230802.1      cxx17_h59595ed_0    conda-forge
     libarrow                  11.0.0          h1935d02_38_cpu    conda-forge
     libblas                   3.9.0           18_linux64_openblas    conda-forge
     libbrotlicommon           1.1.0                hd590300_0    conda-forge
     libbrotlidec              1.1.0                hd590300_0    conda-forge
     libbrotlienc              1.1.0                hd590300_0    conda-forge
     libcblas                  3.9.0           18_linux64_openblas    conda-forge
     libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
     libcublas                 11.11.3.6                     0    nvidia
     libcublas-dev             11.11.3.6                     0    nvidia
     libcudf                   23.08.00        cuda11_230809_g8150d38e08_0    rapidsai
     libcufft                  10.9.0.58                     0    nvidia
     libcufile                 1.4.0.31                      0    nvidia
     libcufile-dev             1.4.0.31                      0    nvidia
     libcuml                   23.08.00        cuda11_230809_gd7162cdea_0    rapidsai
     libcumlprims              23.08.00        cuda11_230809_g71c0a86_0    nvidia
     libcurand                 10.3.0.86                     0    nvidia
     libcurand-dev             10.3.0.86                     0    nvidia
     libcurl                   8.3.0                hca28451_0    conda-forge
     libcusolver               11.4.1.48                     0    nvidia
     libcusolver-dev           11.4.1.48                     0    nvidia
     libcusparse               11.7.5.86                     0    nvidia
     libcusparse-dev           11.7.5.86                     0    nvidia
     libdeflate                1.19                 hd590300_0    conda-forge
     libedit                   3.1.20191231         he28a2e2_2    conda-forge
     libev                     4.33                 h516909a_1    conda-forge
     libevent                  2.1.12               hf998b51_1    conda-forge
     libffi                    3.4.2                h7f98852_5    conda-forge
     libgcc-devel_linux-64     12.3.0               h8bca6fd_2    conda-forge
     libgcc-ng                 13.2.0               h807b86a_2    conda-forge
     libgfortran-ng            13.2.0               h69a702a_2    conda-forge
     libgfortran5              13.2.0               ha4646dd_2    conda-forge
     libgomp                   13.2.0               h807b86a_2    conda-forge
     libgoogle-cloud           2.12.0               h8d7e28b_2    conda-forge
     libgrpc                   1.57.0               ha4d0f93_1    conda-forge
     libjpeg-turbo             2.1.5.1              hd590300_1    conda-forge
     libkvikio                 23.08.00        cuda11_230809_g51a9036_0    rapidsai
     liblapack                 3.9.0           18_linux64_openblas    conda-forge
     libllvm14                 14.0.6               hcd5def8_4    conda-forge
     libnghttp2                1.52.0               h61bc06f_0    conda-forge
     libnsl                    2.0.0                h7f98852_0    conda-forge
     libnuma                   2.0.16               h0b41bf4_1    conda-forge
     libopenblas               0.3.24          pthreads_h413a1c8_0    conda-forge
     libpng                    1.6.39               h753d276_0    conda-forge
     libpq                     15.4                 hfc447b1_1    conda-forge
     libprotobuf               4.23.4               hf27288f_6    conda-forge
     libraft                   23.08.00        cuda11_230809_ge588d7b5_0    rapidsai
     libraft-headers           23.08.00        cuda11_230809_ge588d7b5_0    rapidsai
     libraft-headers-only      23.08.00        cuda11_230809_ge588d7b5_0    rapidsai
     librmm                    23.08.00        cuda11_230809_gf3af0e8d_0    rapidsai
     libsanitizer              12.3.0               h0f45ef3_2    conda-forge
     libsodium                 1.0.18               h36c2ea0_1    conda-forge
     libsqlite                 3.43.0               h2797004_0    conda-forge
     libssh2                   1.11.0               h0841786_0    conda-forge
     libstdcxx-ng              13.2.0               h7e041cc_2    conda-forge
     libthrift                 0.19.0               h8fd135c_0    conda-forge
     libtiff                   4.6.0                h29866fb_1    conda-forge
     libutf8proc               2.8.0                h166bdaf_0    conda-forge
     libuuid                   2.38.1               h0b41bf4_0    conda-forge
     libwebp-base              1.3.2                hd590300_0    conda-forge
     libxcb                    1.15                 h0b41bf4_0    conda-forge
     libxgboost                1.7.4           rapidsai_ha9c50b3_6    rapidsai
     libzlib                   1.2.13               hd590300_5    conda-forge
     lightgbm                  4.0.0            py39h3d6467e_0    conda-forge
     llvmlite                  0.40.1           py39h174d805_0    conda-forge
     locket                    1.0.0              pyhd8ed1ab_0    conda-forge
     lz4                       4.3.2            py39h79d96da_1    conda-forge
     lz4-c                     1.9.4                hcb278e6_0    conda-forge
     mako                      1.2.4              pyhd8ed1ab_0    conda-forge
     markdown                  3.4.4              pyhd8ed1ab_0    conda-forge
     markupsafe                2.1.3            py39hd1e30aa_1    conda-forge
     matplotlib-base           3.8.0            py39he9076e7_1    conda-forge
     matplotlib-inline         0.1.6              pyhd8ed1ab_0    conda-forge
     maturin                   1.1.0            py39hd4f0224_0    conda-forge
     mlflow                    2.5.0            py39ha39b057_0    conda-forge
     mock                      5.1.0              pyhd8ed1ab_0    conda-forge
     msgpack-python            1.0.6            py39h7633fee_0    conda-forge
     munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
     nccl                      2.18.5.1             h0800d71_1    conda-forge
     ncurses                   6.4                  hcb278e6_0    conda-forge
     nodeenv                   1.8.0              pyhd8ed1ab_0    conda-forge
     numba                     0.57.1           py39hb75a051_0    conda-forge
     numpy                     1.24.4           py39h6183b62_0    conda-forge
     nvcomp                    2.6.1                h0800d71_2    conda-forge
     nvtx                      0.2.8            py39hd1e30aa_0    conda-forge
     oauthlib                  3.2.2              pyhd8ed1ab_0    conda-forge
     openjpeg                  2.5.0                h488ebb8_3    conda-forge
     openssl                   3.1.3                hd590300_0    conda-forge
     orc                       1.9.0                h52d3b3c_2    conda-forge
     packaging                 23.1               pyhd8ed1ab_0    conda-forge
     pandas                    1.5.3            py39h2ad29b5_1    conda-forge
     paramiko                  3.3.1              pyhd8ed1ab_0    conda-forge
     parso                     0.8.3              pyhd8ed1ab_0    conda-forge
     partd                     1.4.1              pyhd8ed1ab_0    conda-forge
     pexpect                   4.8.0              pyh1a96a4e_2    conda-forge
     pickleshare               0.7.5                   py_1003    conda-forge
     pillow                    10.0.1           py39h444a776_1    conda-forge
     pip                       23.2.1             pyhd8ed1ab_0    conda-forge
     pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
     platformdirs              3.10.0             pyhd8ed1ab_0    conda-forge
     pluggy                    1.3.0              pyhd8ed1ab_0    conda-forge
     pre-commit                3.4.0              pyha770c72_1    conda-forge
     prometheus_client         0.17.1             pyhd8ed1ab_0    conda-forge
     prometheus_flask_exporter 0.22.4             pyhd8ed1ab_0    conda-forge
     prompt-toolkit            3.0.39             pyha770c72_0    conda-forge
     prompt_toolkit            3.0.39               hd8ed1ab_0    conda-forge
     protobuf                  4.23.4           py39h60f6b12_3    conda-forge
     psutil                    5.9.5            py39hd1e30aa_1    conda-forge
     psycopg2                  2.9.7            py39ha29b39e_0    conda-forge
     pthread-stubs             0.4               h36c2ea0_1001    conda-forge
     ptxcompiler               0.8.1            py39h2405124_0    conda-forge
     ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
     pure-sasl                 0.6.2              pyhd8ed1ab_0    conda-forge
     pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
     py-xgboost                1.7.4           rapidsai_py39h5088e0a_6    rapidsai
     pyarrow                   11.0.0          py39h1cb0ea7_38_cpu    conda-forge
     pycparser                 2.21               pyhd8ed1ab_0    conda-forge
     pydantic                  2.4.0              pyhd8ed1ab_0    conda-forge
     pydantic-core             2.10.0           py39h9fdd4d6_0    conda-forge
     pygments                  2.16.1             pyhd8ed1ab_0    conda-forge
     pyhive                    0.7.0              pyhd8ed1ab_0    conda-forge
     pyjwt                     2.8.0              pyhd8ed1ab_0    conda-forge
     pylibraft                 23.08.00        cuda11_py39_230809_ge588d7b5_0    rapidsai
     pynacl                    1.5.0            py39hd1e30aa_3    conda-forge
     pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
     pyparsing                 3.1.1              pyhd8ed1ab_0    conda-forge
     pysocks                   1.7.1              pyha2e5f31_6    conda-forge
     pytest                    7.4.2              pyhd8ed1ab_0    conda-forge
     pytest-cov                4.1.0              pyhd8ed1ab_0    conda-forge
     pytest-rerunfailures      12.0               pyhd8ed1ab_0    conda-forge
     pytest-xdist              3.3.1              pyhd8ed1ab_0    conda-forge
     python                    3.9.18          h0755675_0_cpython    conda-forge
     python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
     python_abi                3.9                      4_cp39    conda-forge
     pytz                      2023.3.post1       pyhd8ed1ab_0    conda-forge
     pywin32-on-windows        0.1.0              pyh1179c8e_3    conda-forge
     pyyaml                    6.0.1            py39hd1e30aa_1    conda-forge
     qpd                       0.4.4              pyhd8ed1ab_1    conda-forge
     querystring_parser        1.2.4                      py_0    conda-forge
     raft-dask                 23.08.00        cuda11_py39_230809_ge588d7b5_0    rapidsai
     rdma-core                 28.9                 h59595ed_1    conda-forge
     re2                       2023.03.02           h8c504da_0    conda-forge
     readline                  8.2                  h8228510_1    conda-forge
     referencing               0.30.2             pyhd8ed1ab_0    conda-forge
     requests                  2.31.0             pyhd8ed1ab_0    conda-forge
     rmm                       23.08.00        cuda11_py39_230809_gf3af0e8d_0    rapidsai
     rpds-py                   0.10.3           py39h9fdd4d6_0    conda-forge
     s2n                       1.3.51               h06160fa_0    conda-forge
     scikit-learn              1.3.1            py39ha22ef79_0    conda-forge
     scipy                     1.11.2           py39h474f0d3_1    conda-forge
     setuptools                68.2.2             pyhd8ed1ab_0    conda-forge
     six                       1.16.0             pyh6c4a22f_0    conda-forge
     smmap                     3.0.5              pyh44b312d_0    conda-forge
     snappy                    1.1.10               h9fff704_0    conda-forge
     sniffio                   1.3.0              pyhd8ed1ab_0    conda-forge
     snowballstemmer           2.2.0              pyhd8ed1ab_0    conda-forge
     sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
     spdlog                    1.11.0               h9b3ece8_1    conda-forge
     sphinx                    7.2.6              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-applehelp   1.0.7              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-devhelp     1.0.5              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-htmlhelp    2.0.4              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-jsmath      1.0.1              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-qthelp      1.0.6              pyhd8ed1ab_0    conda-forge
     sphinxcontrib-serializinghtml 1.1.9              pyhd8ed1ab_0    conda-forge
     sqlalchemy                1.4.49           py39hd1e30aa_0    conda-forge
     sqlglot                   18.7.0             pyhd8ed1ab_0    conda-forge
     sqlparse                  0.4.4              pyhd8ed1ab_0    conda-forge
     stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
     starlette                 0.27.0             pyhd8ed1ab_0    conda-forge
     stopit                    1.1.2                      py_0    conda-forge
     sysroot_linux-64          2.12                he073ed8_16    conda-forge
     tabulate                  0.9.0              pyhd8ed1ab_1    conda-forge
     tblib                     2.0.0              pyhd8ed1ab_0    conda-forge
     threadpoolctl             3.2.0              pyha21a80b_0    conda-forge
     thrift                    0.19.0           py39h3d6467e_1    conda-forge
     thrift_sasl               0.4.3              pyhd8ed1ab_2    conda-forge
     tk                        8.6.13               h2797004_0    conda-forge
     toml                      0.10.2             pyhd8ed1ab_0    conda-forge
     tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
     toolz                     0.12.0             pyhd8ed1ab_0    conda-forge
     tornado                   6.3.3            py39hd1e30aa_1    conda-forge
     tpot                      0.12.1             pyhd8ed1ab_0    conda-forge
     tqdm                      4.66.1             pyhd8ed1ab_0    conda-forge
     traitlets                 5.10.1             pyhd8ed1ab_0    conda-forge
     treelite                  3.2.0            py39h5f9d723_0    conda-forge
     treelite-runtime          3.2.0                    pypi_0    pypi
     triad                     0.9.1              pyhd8ed1ab_0    conda-forge
     typing-extensions         4.8.0                hd8ed1ab_0    conda-forge
     typing_extensions         4.8.0              pyha770c72_0    conda-forge
     tzdata                    2023c                h71feb2d_0    conda-forge
     tzlocal                   5.0.1            py39hf3d152e_1    conda-forge
     ucx                       1.14.1               h64cca9d_5    conda-forge
     ucx-proc                  1.0.0                       gpu    rapidsai
     ucx-py                    0.33.00         py39_230809_gea1eb8f_0    rapidsai
     ukkonen                   1.0.1            py39h7633fee_4    conda-forge
     unicodedata2              15.0.0           py39hd1e30aa_1    conda-forge
     update_checker            0.18.0             pyh9f0ad1d_0    conda-forge
     urllib3                   2.0.5              pyhd8ed1ab_0    conda-forge
     uvicorn                   0.23.2           py39hf3d152e_1    conda-forge
     virtualenv                20.24.4            pyhd8ed1ab_0    conda-forge
     wcwidth                   0.2.6              pyhd8ed1ab_0    conda-forge
     websocket-client          1.6.3              pyhd8ed1ab_0    conda-forge
     werkzeug                  2.3.7              pyhd8ed1ab_0    conda-forge
     wheel                     0.41.2             pyhd8ed1ab_0    conda-forge
     xgboost                   1.7.4           rapidsai_py39h5088e0a_6    rapidsai
     xorg-libxau               1.0.11               hd590300_0    conda-forge
     xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
     xyzservices               2023.7.0           pyhd8ed1ab_0    conda-forge
     xz                        5.2.6                h166bdaf_0    conda-forge
     yaml                      0.2.5                h7f98852_2    conda-forge
     zict                      3.0.0              pyhd8ed1ab_0    conda-forge
     zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
     zlib                      1.2.13               hd590300_5    conda-forge
     zstd                      1.5.5                hfc55251_0    conda-forge

Additional context Noticed this while working on https://github.com/dask-contrib/dask-sql/pull/1220

shwina commented 1 year ago

@charlesbluca any chance of a cuDF-only reproducer?

wence- commented 11 months ago

The bug is that the dtype of the column representing the count aggregation is wrong if the dataframe is empty:

import cudf
df = cudf.DataFrame({"a": [1], "c": ["foo"]}, dtype={"a": "int64", "c": "object"})
df.groupby('a').agg({'c': "count"}).dtypes
# c    int64
# dtype: object

# But on an empty frame, the count column keeps the input column's dtype:
df.iloc[:0].groupby('a').agg({'c': "count"}).dtypes
# c    object
# dtype: object
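
For comparison, here is a minimal CPU-only sketch of the dtype one would expect, assuming pandas is the reference behavior cuDF aims to match here. The pandas code and the helper name `count_dtypes` are illustrative additions, not part of the original report, and nothing below was run against cuDF.

```python
import pandas as pd


def count_dtypes():
    """Return the dtype of the count column for a one-row and a zero-row frame."""
    # Build each column explicitly so the input dtypes are unambiguous.
    # pandas is used as a stand-in for cudf (assumption for illustration).
    df = pd.DataFrame(
        {
            "a": pd.Series([1], dtype="int64"),
            "c": pd.Series(["foo"], dtype="object"),
        }
    )
    # Count on the frame with one row.
    non_empty = df.groupby("a").agg({"c": "count"}).dtypes["c"]
    # Count on the same frame sliced down to zero rows.
    empty = df.iloc[:0].groupby("a").agg({"c": "count"}).dtypes["c"]
    return non_empty, empty


non_empty, empty = count_dtypes()
print(non_empty, empty)
```

If cuDF matched this, both dtypes would be an integer type; the problem described above is that the empty case comes back as `object` instead.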