rapidsai / cudf

cuDF - GPU DataFrame Library
https://docs.rapids.ai/api/cudf/stable/
Apache License 2.0

[BUG] MultiIndex loc expects an iterable when passed Timestamp #8585

Open pbruneau opened 3 years ago

pbruneau commented 3 years ago

Describe the bug

cuDF DataFrames indexed by a Timestamp range can be accessed with .loc[] without any problem. However, when a cuDF DataFrame is indexed by a MultiIndex whose first level holds timestamps, .loc[] fails when passed a Timestamp, whereas the same lookup works in pandas.

Steps/Code to reproduce bug

The following gist holds a self-contained example. The last line of the code fails with the error:

TypeError: 'Timestamp' object is not iterable

Expected behavior

I would expect the pandas and cuDF snippets to behave similarly.
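Since the gist itself is not reproduced in this report, here is a minimal sketch of the kind of lookup described above. The data, column name, and level names are hypothetical and are not taken from the original gist. The pandas part shows the expected behavior. The cuDF part only runs if cuDF can be imported, and on the version reported here it is expected to raise the TypeError quoted above (other versions may behave differently).

```python
import pandas as pd

# Hypothetical data: a MultiIndex whose first level holds timestamps.
timestamps = pd.date_range("2021-01-01", periods=3, freq="D")
index = pd.MultiIndex.from_product(
    [timestamps, ["a", "b"]], names=["time", "key"]
)
pdf = pd.DataFrame({"value": list(range(6))}, index=index)

# pandas: a single Timestamp selects all rows under that first-level key.
ts = pd.Timestamp("2021-01-02")
pandas_result = pdf.loc[ts]
print(pandas_result)

# cuDF: attempt the same lookup, but only when cuDF is usable here.
try:
    import cudf
except Exception:  # cuDF not installed, or no usable GPU
    cudf = None

if cudf is not None:
    gdf = cudf.from_pandas(pdf)
    try:
        print(gdf.loc[ts])
    except TypeError as exc:
        # Behavior described in this report.
        print(f"cuDF raised TypeError: {exc}")
```

In pandas, the result is indexed by the remaining level (`key`), which is the behavior the cuDF lookup would be expected to match.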


Environment details


     ***git***
     commit 2cda39b34197c60614186ec51106d8254e5f7b05 (grafted, HEAD, origin/branch-0.16)
     Author: Ray Douglass <3107146+raydouglass@users.noreply.github.com>
     Date:   Wed Oct 21 10:31:49 2020 -0400

     Update CHANGELOG.md
     ***git submodules***

     ***OS Information***
     DISTRIB_ID=Ubuntu
     DISTRIB_RELEASE=18.04
     DISTRIB_CODENAME=bionic
     DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
     NAME="Ubuntu"
     VERSION="18.04.5 LTS (Bionic Beaver)"
     ID=ubuntu
     ID_LIKE=debian
     PRETTY_NAME="Ubuntu 18.04.5 LTS"
     VERSION_ID="18.04"
     HOME_URL="https://www.ubuntu.com/"
     SUPPORT_URL="https://help.ubuntu.com/"
     BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
     PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
     VERSION_CODENAME=bionic
     UBUNTU_CODENAME=bionic
     Linux fe1b5c84b917 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

     ***GPU Information***
     Tue Jun 22 15:01:51 2021
     +-----------------------------------------------------------------------------+
     | NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
     |-------------------------------+----------------------+----------------------+
     | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
     | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
     |                               |                      |               MIG M. |
     |===============================+======================+======================|
     |   0  GeForce GTX 1080    On   | 00000000:05:00.0 Off |                  N/A |
     | 28%   43C    P8     7W / 180W |   1504MiB /  8114MiB |      0%      Default |
     |                               |                      |                  N/A |
     +-------------------------------+----------------------+----------------------+

     +-----------------------------------------------------------------------------+
     | Processes:                                                                  |
     |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
     |        ID   ID                                                   Usage      |
     |=============================================================================|
     +-----------------------------------------------------------------------------+

     ***CPU***
     Architecture:        x86_64
     CPU op-mode(s):      32-bit, 64-bit
     Byte Order:          Little Endian
     CPU(s):              12
     On-line CPU(s) list: 0-11
     Thread(s) per core:  2
     Core(s) per socket:  6
     Socket(s):           1
     NUMA node(s):        1
     Vendor ID:           GenuineIntel
     CPU family:          6
     Model:               79
     Model name:          Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz
     Stepping:            1
     CPU MHz:             1200.861
     CPU max MHz:         3800.0000
     CPU min MHz:         1200.0000
     BogoMIPS:            6800.53
     Virtualization:      VT-x
     L1d cache:           32K
     L1i cache:           32K
     L2 cache:            256K
     L3 cache:            15360K
     NUMA node0 CPU(s):   0-11
     Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d

     ***CMake***

     ***g++***

     ***nvcc***

     ***Python***
     /opt/conda/envs/rapids/bin/python
     Python 3.7.10

     ***Environment Variables***
     PATH                            : /opt/conda/envs/rapids/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
     LD_LIBRARY_PATH                 : /usr/local/nvidia/lib:/usr/local/nvidia/lib64
     NUMBAPRO_NVVM                   :
     NUMBAPRO_LIBDEVICE              :
     CONDA_PREFIX                    : /opt/conda/envs/rapids
     PYTHON_PATH                     :

     ***conda packages***
     /opt/conda/condabin/conda
     # packages in environment at /opt/conda/envs/rapids:
     #
     # Name                    Version                   Build  Channel
     _libgcc_mutex             0.1                 conda_forge    conda-forge
     _openmp_mutex             4.5                       1_gnu    conda-forge
     abseil-cpp                20200225.2           he1b5a44_2    conda-forge
     aiobotocore               1.2.1              pyhd8ed1ab_0    conda-forge
     aiohttp                   3.7.4            py37h5e8e339_0    conda-forge
     aioitertools              0.7.1              pyhd8ed1ab_0    conda-forge
     appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
     argon2-cffi               20.1.0           py37h5e8e339_2    conda-forge
     arrow-cpp                 1.0.1           py37h2318771_14_cuda    conda-forge
     arrow-cpp-proc            3.0.0                      cuda    conda-forge
     async-timeout             3.0.1                   py_1000    conda-forge
     async_generator           1.10                       py_0    conda-forge
     attrs                     20.3.0             pyhd3deb0d_0    conda-forge
     aws-c-common              0.4.59               h36c2ea0_1    conda-forge
     aws-c-event-stream        0.1.6                had2084c_6    conda-forge
     aws-checksums             0.1.10               h4e93380_0    conda-forge
     aws-sdk-cpp               1.8.63               h9b98462_0    conda-forge
     backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
     backports                 1.0                        py_2    conda-forge
     backports.functools_lru_cache 1.6.1                      py_0    conda-forge
     blas                      2.14                   openblas    conda-forge
     blazingsql                0.18.0                   pypi_0    pypi
     bleach                    3.3.0              pyh44b312d_0    conda-forge
     bokeh                     2.2.3            py37h89c1867_0    conda-forge
     boost                     1.72.0           py37h48f8a5e_1    conda-forge
     boost-cpp                 1.72.0               h9d3c048_4    conda-forge
     botocore                  1.19.52            pyhd8ed1ab_0    conda-forge
     brotli                    1.0.9                h9c3ff4c_4    conda-forge
     brotlipy                  0.7.0           py37h5e8e339_1001    conda-forge
     bzip2                     1.0.8                h7f98852_4    conda-forge
     c-ares                    1.17.1               h36c2ea0_0    conda-forge
     ca-certificates           2020.12.5            ha878542_0    conda-forge
     cairo                     1.16.0            h6cf1ce9_1008    conda-forge
     certifi                   2020.12.5        py37h89c1867_1    conda-forge
     cffi                      1.14.5           py37hc58025e_0    conda-forge
     cfitsio                   3.470                h2e3daa1_7    conda-forge
     cftime                    1.5.0                    pypi_0    pypi
     chardet                   4.0.0            py37h89c1867_1    conda-forge
     click                     7.1.2              pyh9f0ad1d_0    conda-forge
     click-plugins             1.1.1                      py_0    conda-forge
     cligj                     0.7.1              pyhd8ed1ab_0    conda-forge
     cloudpickle               1.6.0                      py_0    conda-forge
     colorcet                  2.0.6              pyhd8ed1ab_0    conda-forge
     convertdate               2.3.2                    pypi_0    pypi
     cryptography              3.4.4            py37hf1a17b8_0    conda-forge
     cudatoolkit               10.1.243             h036e899_8    nvidia
     cudf                      0.18.0          cuda_10.1_py37_g20778e5ddb_0    rapidsai
     cudf_kafka                0.18.0          py37_g20778e5ddb_0    rapidsai
     cudnn                     7.6.0                cuda10.1_0    nvidia
     cugraph                   0.18.0          py37_g65ec965f_0    rapidsai
     cuml                      0.18.0          cuda10.1_py37_gb5f59e005_0    rapidsai
     cupy                      8.0.0            py37h0632833_0    conda-forge
     curl                      7.71.1               he644dc0_8    conda-forge
     cusignal                  0.18.0          py38_g42899d2_0    rapidsai
     cuspatial                 0.18.0a210212   py37_g3045c48_21    rapidsai-nightly
     custreamz                 0.18.0          py37_g20778e5ddb_0    rapidsai
     cuxfilter                 0.18.0          py37_gac6f488_0    rapidsai
     cycler                    0.10.0                     py_2    conda-forge
     cyrus-sasl                2.1.27               h3274739_1    conda-forge
     cython                    0.29.22          py37hcd2ae1e_0    conda-forge
     cytoolz                   0.11.0           py37h5e8e339_3    conda-forge
     dask                      2021.2.0           pyhd8ed1ab_0    conda-forge
     dask-core                 2021.2.0           pyhd8ed1ab_0    conda-forge
     dask-cuda                 0.18.0                   py37_0    rapidsai
     dask-cudf                 0.18.0          py37_g20778e5ddb_0    rapidsai
     dask-glm                  0.2.0                      py_1    conda-forge
     dask-labextension         4.0.1              pyhd8ed1ab_0    conda-forge
     dask-ml                   1.8.0              pyhd8ed1ab_0    conda-forge
     datashader                0.11.1             pyh9f0ad1d_0    conda-forge
     datashape                 0.5.4                      py_1    conda-forge
     decorator                 4.4.2                      py_0    conda-forge
     defusedxml                0.6.0                      py_0    conda-forge
     distlib                   0.3.2                    pypi_0    pypi
     distributed               2021.2.0         py37h89c1867_0    conda-forge
     dlpack                    0.3                  he1b5a44_1    conda-forge
     ecmwf-api-client          1.6.1                    pypi_0    pypi
     entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
     ephem                     4.0.0.2                  pypi_0    pypi
     expat                     2.2.10               h9c3ff4c_0    conda-forge
     fa2                       0.3.5            py37h8f50634_0    conda-forge
     faiss-proc                1.0.0                      cuda    conda-forge
     fastavro                  1.3.4            py37h5e8e339_0    conda-forge
     fastrlock                 0.5              py37hcd2ae1e_2    conda-forge
     filelock                  3.0.12                   pypi_0    pypi
     filterpy                  1.4.5                      py_1    conda-forge
     fiona                     1.8.18           py37h527b4ca_0    conda-forge
     flask                     2.0.1                    pypi_0    pypi
     flask-wtf                 0.15.1                   pypi_0    pypi
     fontconfig                2.13.1            hba837de_1004    conda-forge
     freetype                  2.10.4               h0708190_1    conda-forge
     freexl                    1.0.6                h7f98852_0    conda-forge
     fsspec                    0.8.7              pyhd8ed1ab_0    conda-forge
     future                    0.18.2           py37h89c1867_3    conda-forge
     gdal                      3.1.4            py37h2ec2946_2    conda-forge
     geopandas                 0.8.1                      py_0    conda-forge
     geos                      3.8.1                he1b5a44_0    conda-forge
     geotiff                   1.6.0                h5d11630_3    conda-forge
     gettext                   0.19.8.1          h0b5b191_1005    conda-forge
     gflags                    2.2.2             he1b5a44_1004    conda-forge
     giflib                    5.2.1                h36c2ea0_2    conda-forge
     git                       2.30.1          pl5320h6697202_1    conda-forge
     glog                      0.4.0                h49b9bf7_3    conda-forge
     gluonts                   0.7.6                    pypi_0    pypi
     google-cloud-cpp          1.16.0               he4a878c_2    conda-forge
     google-cloud-cpp-common   0.25.0               he83eced_7    conda-forge
     googleapis-cpp            0.10.0               h6b1abdc_4    conda-forge
     gpuci-tools               0.3.1                         0    gpuci
     greenlet                  1.0.0            py37hcd2ae1e_0    conda-forge
     grpc-cpp                  1.32.0               h7997a97_1    conda-forge
     gunicorn                  20.1.0                   pypi_0    pypi
     hdf4                      4.2.13            h10796ff_1004    conda-forge
     hdf5                      1.10.6          nompi_h7c3c948_1111    conda-forge
     heapdict                  1.0.1                      py_0    conda-forge
     hijri-converter           2.1.2                    pypi_0    pypi
     holidays                  0.11.1                   pypi_0    pypi
     holoviews                 1.14.2             pyhd8ed1ab_0    conda-forge
     icu                       68.1                 h58526e2_0    conda-forge
     idna                      2.10               pyh9f0ad1d_0    conda-forge
     importlib-metadata        3.7.0            py37h89c1867_0    conda-forge
     importlib_metadata        3.7.0                hd8ed1ab_0    conda-forge
     iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
     ipykernel                 5.5.0            py37h888b3d9_1    conda-forge
     ipython                   7.15.0           py37hc8dfbb8_0    conda-forge
     ipython_genutils          0.2.0                      py_1    conda-forge
     ipywidgets                7.6.3              pyhd3deb0d_0    conda-forge
     itsdangerous              2.0.1                    pypi_0    pypi
     jedi                      0.17.2           py37h89c1867_1    conda-forge
     jinja2                    3.0.1                    pypi_0    pypi
     jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
     joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
     jpeg                      9d                   h36c2ea0_0    conda-forge
     jpype1                    1.2.1            py37h2527ec5_0    conda-forge
     json-c                    0.13.1            hbfbb72e_1002    conda-forge
     json5                     0.9.5              pyh9f0ad1d_0    conda-forge
     jsonschema                3.2.0                      py_2    conda-forge
     jupyter-server-proxy      1.6.0              pyhd8ed1ab_0    conda-forge
     jupyter_client            6.1.11             pyhd8ed1ab_1    conda-forge
     jupyter_core              4.7.1            py37h89c1867_0    conda-forge
     jupyterlab                2.1.5                      py_0    conda-forge
     jupyterlab-nvdashboard    0.1.11200212              py_12    rapidsai-nightly
     jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
     jupyterlab_server         1.2.0                      py_0    conda-forge
     jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
     kealib                    1.4.14               hcc255d8_2    conda-forge
     kiwisolver                1.3.1            py37h2527ec5_1    conda-forge
     korean-lunar-calendar     0.2.1                    pypi_0    pypi
     krb5                      1.17.2               h926e7f8_0    conda-forge
     lcms2                     2.12                 hddcbb42_0    conda-forge
     ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
     libblas                   3.8.0               14_openblas    conda-forge
     libcblas                  3.8.0               14_openblas    conda-forge
     libcrc32c                 1.1.1                h9c3ff4c_2    conda-forge
     libcudf                   0.18.1          cuda10.1_g999be56c80_0    rapidsai
     libcudf_kafka             0.18.0a210226   g1544474166_254    rapidsai-nightly
     libcugraph                0.18.0          cuda10.1_g65ec965f_0    rapidsai
     libcuml                   0.18.0          cuda10.1_gb5f59e005_0    rapidsai
     libcumlprims              0.18.0a210211   cuda10.1_gff080f3_0    rapidsai-nightly
     libcurl                   7.71.1               hcdd3856_8    conda-forge
     libcuspatial              0.18.0          cuda10.1_gf4da460_0    rapidsai
     libdap4                   3.20.6               hd7c4107_1    conda-forge
     libedit                   3.1.20191231         he28a2e2_2    conda-forge
     libev                     4.33                 h516909a_1    conda-forge
     libevent                  2.1.10               hcdb4288_3    conda-forge
     libfaiss                  1.6.3           he68dc02_3_cuda    conda-forge
     libffi                    3.3                  h58526e2_2    conda-forge
     libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
     libgcrypt                 1.9.2                h7f98852_0    conda-forge
     libgdal                   3.1.4                h02eeb80_2    conda-forge
     libgfortran-ng            7.5.0               h14aa051_18    conda-forge
     libgfortran4              7.5.0               h14aa051_18    conda-forge
     libglib                   2.68.0               h3e27bee_2    conda-forge
     libgomp                   9.3.0               h2828fa1_18    conda-forge
     libgpg-error              1.42                 h9c3ff4c_0    conda-forge
     libgsasl                  1.8.0                         2    conda-forge
     libhwloc                  2.3.0                h5e5b7d1_1    conda-forge
     libiconv                  1.16                 h516909a_0    conda-forge
     libkml                    1.3.0             hd79254b_1012    conda-forge
     liblapack                 3.8.0               14_openblas    conda-forge
     liblapacke                3.8.0               14_openblas    conda-forge
     libllvm10                 10.0.1               he513fc3_3    conda-forge
     libnetcdf                 4.7.4           nompi_h56d31a8_107    conda-forge
     libnghttp2                1.43.0               h812cca2_0    conda-forge
     libntlm                   1.4               h7f98852_1002    conda-forge
     libopenblas               0.3.7                h5ec1e0e_6    conda-forge
     libpng                    1.6.37               h21135ba_2    conda-forge
     libpq                     12.3                 h255efa7_3    conda-forge
     libprotobuf               3.13.0.1             h8b12597_0    conda-forge
     librdkafka                1.5.3                h54cafa9_0    conda-forge
     librmm                    0.18.0          cuda10.1_ga4ee6b7_0    rapidsai
     librttopo                 1.1.0                hb271727_4    conda-forge
     libsodium                 1.0.18               h36c2ea0_1    conda-forge
     libspatialindex           1.9.3                h9c3ff4c_3    conda-forge
     libspatialite             5.0.1                h6ec7341_0    conda-forge
     libssh2                   1.9.0                hab1572f_5    conda-forge
     libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
     libthrift                 0.13.0               h5aa387f_6    conda-forge
     libtiff                   4.2.0                hdc55705_0    conda-forge
     libutf8proc               2.6.1                h7f98852_0    conda-forge
     libuuid                   2.32.1            h7f98852_1000    conda-forge
     libuv                     1.41.0               h7f98852_0    conda-forge
     libwebp                   1.2.0                h3452ae3_0    conda-forge
     libwebp-base              1.2.0                h7f98852_0    conda-forge
     libxcb                    1.13              h7f98852_1003    conda-forge
     libxgboost                1.3.3dev.rapidsai0.18      cuda10.1_0    rapidsai-nightly
     libxml2                   2.9.10               h72842e0_3    conda-forge
     line-profiler             3.3.0                    pypi_0    pypi
     llvmlite                  0.35.0           py37h9d7f4d0_1    conda-forge
     locket                    0.2.0                      py_2    conda-forge
     lz4-c                     1.9.2                he1b5a44_3    conda-forge
     markdown                  3.3.4              pyhd8ed1ab_0    conda-forge
     markupsafe                2.0.1                    pypi_0    pypi
     matplotlib-base           3.3.4            py37h0c9df89_0    conda-forge
     mistune                   0.8.4           py37h5e8e339_1003    conda-forge
     more-itertools            8.7.0              pyhd8ed1ab_0    conda-forge
     msgpack-python            1.0.2            py37h2527ec5_1    conda-forge
     multidict                 5.1.0            py37h5e8e339_1    conda-forge
     multipledispatch          0.6.0                      py_0    conda-forge
     munch                     2.5.0                      py_0    conda-forge
     mxnet-cu101               1.8.0                    pypi_0    pypi
     nbclient                  0.5.3              pyhd8ed1ab_0    conda-forge
     nbconvert                 6.0.7            py37h89c1867_3    conda-forge
     nbformat                  5.1.2              pyhd8ed1ab_1    conda-forge
     nccl                      2.8.4.1              h8b44402_3    conda-forge
     ncurses                   6.2                  h58526e2_4    conda-forge
     nest-asyncio              1.4.3              pyhd8ed1ab_0    conda-forge
     netcdf4                   1.5.6                    pypi_0    pypi
     netifaces                 0.10.9          py37h5e8e339_1003    conda-forge
     networkx                  2.5                        py_0    conda-forge
     nodejs                    14.15.4              h92b4a50_1    conda-forge
     notebook                  6.2.0            py37h89c1867_0    conda-forge
     numba                     0.52.0           py37hdc94413_0    conda-forge
     numpy                     1.19.5           py37haa41c4c_1    conda-forge
     nvtx                      0.2.3            py37h5e8e339_0    conda-forge
     olefile                   0.46               pyh9f0ad1d_1    conda-forge
     openjdk                   11.0.1            h516909a_1016    conda-forge
     openjpeg                  2.4.0                hf7af979_0    conda-forge
     openssl                   1.1.1k               h7f98852_0    conda-forge
     orc                       1.6.5                hd3605a7_0    conda-forge
     packaging                 20.9               pyh44b312d_0    conda-forge
     pandas                    1.1.5            py37hdc94413_0    conda-forge
     pandoc                    2.11.4               h7f98852_0    conda-forge
     pandocfilters             1.4.2                      py_1    conda-forge
     panel                     0.10.3             pyhd8ed1ab_0    conda-forge
     param                     1.10.1             pyhd3deb0d_0    conda-forge
     parquet-cpp               1.5.1                         2    conda-forge
     parso                     0.7.1              pyh9f0ad1d_0    conda-forge
     partd                     1.1.0                      py_0    conda-forge
     patsy                     0.5.1                      py_0    conda-forge
     pcre                      8.44                 he1b5a44_0    conda-forge
     perl                      5.32.0               h36c2ea0_0    conda-forge
     pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
     pickle5                   0.0.11           py37h8f50634_0    conda-forge
     pickleshare               0.7.5                   py_1003    conda-forge
     pillow                    8.1.1            py37h4600e1f_0    conda-forge
     pip                       21.0.1             pyhd8ed1ab_0    conda-forge
     pipenv                    2021.5.29                pypi_0    pypi
     pixman                    0.40.0               h36c2ea0_0    conda-forge
     pluggy                    0.13.1           py37h89c1867_4    conda-forge
     poppler                   0.89.0               h2de54a5_5    conda-forge
     poppler-data              0.4.10                        0    conda-forge
     postgresql                12.3                 hc2f5b80_3    conda-forge
     proj                      7.1.1                h966b41f_3    conda-forge
     prometheus_client         0.9.0              pyhd3deb0d_0    conda-forge
     prompt-toolkit            3.0.16             pyha770c72_0    conda-forge
     protobuf                  3.13.0.1         py37h745909e_1    conda-forge
     psutil                    5.8.0            py37h5e8e339_1    conda-forge
     pthread-stubs             0.4               h36c2ea0_1001    conda-forge
     ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
     pvlib                     0.8.1                    pypi_0    pypi
     py                        1.10.0             pyhd3deb0d_0    conda-forge
     py-xgboost                1.3.3dev.rapidsai0.18  cuda10.1py37_0    rapidsai-nightly
     pyarrow                   1.0.1           py37hbeecfa9_14_cuda    conda-forge
     pycparser                 2.20               pyh9f0ad1d_2    conda-forge
     pyct                      0.4.6                      py_0    conda-forge
     pyct-core                 0.4.6                      py_0    conda-forge
     pydantic                  1.8.2                    pypi_0    pypi
     pydeck                    0.5.0              pyh9f0ad1d_0    conda-forge
     pyee                      7.0.4              pyh9f0ad1d_0    conda-forge
     pyephem                   9.99                     pypi_0    pypi
     pygments                  2.8.0              pyhd8ed1ab_0    conda-forge
     pyhive                    0.6.3              pyhd3deb0d_0    conda-forge
     pymeeus                   0.5.11                   pypi_0    pypi
     pynndescent               0.5.2              pyh44b312d_0    conda-forge
     pynvml                    8.0.4                      py_1    conda-forge
     pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
     pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
     pyppeteer                 0.2.2                      py_1    conda-forge
     pyproj                    2.6.1.post1      py37h6415a23_3    conda-forge
     pyrsistent                0.17.3           py37h5e8e339_2    conda-forge
     pysocks                   1.7.1            py37h89c1867_3    conda-forge
     pytest                    6.2.2            py37h89c1867_0    conda-forge
     python                    3.7.10          hffdb5ce_100_cpython    conda-forge
     python-confluent-kafka    1.5.0            py37h8f50634_0    conda-forge
     python-dateutil           2.8.1                      py_0    conda-forge
     python-graphviz           0.8.4                    pypi_0    pypi
     python_abi                3.7                     1_cp37m    conda-forge
     pytz                      2021.1             pyhd8ed1ab_0    conda-forge
     pyviz_comms               2.0.1              pyhd3deb0d_0    conda-forge
     pyyaml                    5.4.1            py37h5e8e339_0    conda-forge
     pyzmq                     22.0.3           py37h336d617_1    conda-forge
     rapids                    0.18.0a210302   cuda10.1_py37_g58c5d18_220    rapidsai-nightly
     rapids-blazing            0.18.0a210302   cuda10.1_py37_g58c5d18_220    rapidsai-nightly
     rapids-xgboost            0.18.0a210302   cuda10.1_py37_g58c5d18_220    rapidsai-nightly
     re2                       2020.10.01           he1b5a44_0    conda-forge
     readline                  8.0                  he28a2e2_2    conda-forge
     requests                  2.25.1             pyhd3deb0d_0    conda-forge
     rmm                       0.18.0          cuda_10.1_py37_ga4ee6b7_0    rapidsai
     rtree                     0.9.7            py37h0b55af0_1    conda-forge
     s2n                       1.0.0                h9b69904_0    conda-forge
     s3fs                      0.5.2              pyhd8ed1ab_0    conda-forge
     sasl                      0.2.1           py37h3340039_1002    conda-forge
     scikit-learn              0.23.1           py37h8a51577_0    conda-forge
     scipy                     1.5.3            py37h8911b10_0    conda-forge
     seaborn                   0.11.1               hd8ed1ab_1    conda-forge
     seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
     send2trash                1.5.0                      py_0    conda-forge
     setuptools                49.6.0           py37h89c1867_3    conda-forge
     shapely                   1.7.1            py37hba0730f_1    conda-forge
     simpervisor               0.4                pyhd8ed1ab_0    conda-forge
     six                       1.15.0             pyh9f0ad1d_0    conda-forge
     snappy                    1.1.8                he1b5a44_3    conda-forge
     sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
     spdlog                    1.7.0                hc9558a2_2    conda-forge
     sqlalchemy                1.4.3            py37h5e8e339_0    conda-forge
     sqlite                    3.34.0               h74cdb3f_0    conda-forge
     statsmodels               0.12.2           py37h902c9e0_0    conda-forge
     streamz                   0.6.2              pyh44b312d_0    conda-forge
     tbb                       2020.2               h4bd325d_3    conda-forge
     tblib                     1.6.0                      py_0    conda-forge
     terminado                 0.9.2            py37h89c1867_0    conda-forge
     testpath                  0.4.4                      py_0    conda-forge
     threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
     thrift                    0.13.0           py37hcd2ae1e_2    conda-forge
     thrift_sasl               0.4.2            py37h8f50634_0    conda-forge
     tiledb                    2.1.6                h1022b9d_0    conda-forge
     tk                        8.6.10               h21135ba_1    conda-forge
     toml                      0.10.2             pyhd8ed1ab_0    conda-forge
     toolz                     0.11.1                     py_0    conda-forge
     tornado                   6.1              py37h5e8e339_1    conda-forge
     tqdm                      4.58.0             pyhd8ed1ab_0    conda-forge
     traitlets                 5.0.5                      py_0    conda-forge
     treelite                  1.0.0            py37hc731546_0    conda-forge
     treelite-runtime          1.0.0                    pypi_0    pypi
     typing-extensions         3.7.4.3                       0    conda-forge
     typing_extensions         3.7.4.3                    py_0    conda-forge
     tzcode                    2021a                h7f98852_1    conda-forge
     ucx                       1.9.0+gcd9efd3       cuda10.1_0    rapidsai-nightly
     ucx-proc                  1.0.0                       gpu    rapidsai-nightly
     ucx-py                    0.18.0a210323   py37_gcd9efd3_19    rapidsai-nightly
     umap-learn                0.5.1            py37h89c1867_0    conda-forge
     urllib3                   1.26.3             pyhd8ed1ab_0    conda-forge
     virtualenv                20.4.7                   pypi_0    pypi
     virtualenv-clone          0.5.4                    pypi_0    pypi
     wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
     webencodings              0.5.1                      py_1    conda-forge
     websockets                8.1              py37h5e8e339_3    conda-forge
     werkzeug                  2.0.1                    pypi_0    pypi
     wheel                     0.36.2             pyhd3deb0d_0    conda-forge
     widgetsnbextension        3.5.1            py37h89c1867_4    conda-forge
     wrapt                     1.12.1           py37h5e8e339_3    conda-forge
     wtforms                   2.3.3                    pypi_0    pypi
     xarray                    0.17.0             pyhd8ed1ab_0    conda-forge
     xerces-c                  3.2.3                h9d8b166_2    conda-forge
     xgboost                   1.3.3dev.rapidsai0.18  cuda10.1py37_0    rapidsai-nightly
     xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
     xorg-libice               1.0.10               h7f98852_0    conda-forge
     xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
     xorg-libx11               1.7.0                h7f98852_0    conda-forge
     xorg-libxau               1.0.9                h7f98852_0    conda-forge
     xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
     xorg-libxext              1.3.4                h7f98852_1    conda-forge
     xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
     xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
     xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
     xorg-xproto               7.0.31            h7f98852_1007    conda-forge
     xz                        5.2.5                h516909a_1    conda-forge
     yaml                      0.2.5                h516909a_0    conda-forge
     yarl                      1.6.3            py37h5e8e339_1    conda-forge
     zeromq                    4.3.4                h9c3ff4c_0    conda-forge
     zict                      2.0.0                      py_0    conda-forge
     zipp                      3.4.0                      py_0    conda-forge
     zlib                      1.2.11            h516909a_1010    conda-forge
     zstd                      1.4.8                hdf46e1d_0    conda-forge

beckernick commented 3 years ago

Thanks for including a simple reproducer gist. I've included it below for ease of access.

from datetime import datetime
import pandas as pd
import cudf
import numpy as np
start = pd.Timestamp(datetime.strptime('2021-03-12 00:00+0000',  '%Y-%m-%d %H:%M%z'))
end = pd.Timestamp(datetime.strptime('2021-03-12 03:00+0000',  '%Y-%m-%d %H:%M%z'))
timestamps = pd.date_range(start, end, freq='1H')
labels = ['A', 'B', 'C']
index = pd.MultiIndex.from_product([timestamps, labels], names=["timestamp", "label"])
value = np.random.normal(size=12)
df = pd.DataFrame(value, index=index, columns=['value'])
df_gpu = cudf.from_pandas(df)

stamp = pd.Timestamp(datetime.strptime('2021-03-12 02:00+0000',  '%Y-%m-%d %H:%M%z'))

print(df.loc[stamp]) # SUCCEEDS
print(df_gpu.loc[stamp]) # FAILS
          value
label          
A      1.184793
B     -0.253166
C     -0.790236
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/indexing.py in __getitem__(self, arg)
    234             try:
--> 235                 return self._getitem_tuple_arg(arg)
    236             except (TypeError, KeyError, IndexError, ValueError):

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/contextlib.py in inner(*args, **kwds)
     74             with self._recreate_cm():
---> 75                 return func(*args, **kwds)
     76         return inner

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/indexing.py in _getitem_tuple_arg(self, arg)
    360                 else:
--> 361                     return columns_df.index._get_row_major(columns_df, arg)
    362         else:

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/multiindex.py in _get_row_major(self, df, row_tuple)
    926                 row_tuple = slice(row_tuple.start, self[-1], row_tuple.step)
--> 927         self._validate_indexer(row_tuple)
    928         valid_indices = self._get_valid_indices_by_tuple(

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/multiindex.py in _validate_indexer(self, indexer)
    958         else:
--> 959             for i in indexer:
    960                 self._validate_indexer(i)

TypeError: 'Timestamp' object is not iterable

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-87-fe779946243b> in <module>
     15 
     16 print(df.loc[stamp]) # SUCCEEDS
---> 17 print(df_gpu.loc[stamp]) # FAILS

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/indexing.py in __getitem__(self, arg)
    235                 return self._getitem_tuple_arg(arg)
    236             except (TypeError, KeyError, IndexError, ValueError):
--> 237                 return self._getitem_tuple_arg((arg, slice(None)))
    238         else:
    239             if not isinstance(arg, tuple):

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/contextlib.py in inner(*args, **kwds)
     73         def inner(*args, **kwds):
     74             with self._recreate_cm():
---> 75                 return func(*args, **kwds)
     76         return inner
     77 

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/indexing.py in _getitem_tuple_arg(self, arg)
    357             else:
    358                 if isinstance(arg, tuple):
--> 359                     return columns_df.index._get_row_major(columns_df, arg[0])
    360                 else:
    361                     return columns_df.index._get_row_major(columns_df, arg)

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/multiindex.py in _get_row_major(self, df, row_tuple)
    925             if row_tuple.stop is None:
    926                 row_tuple = slice(row_tuple.start, self[-1], row_tuple.step)
--> 927         self._validate_indexer(row_tuple)
    928         valid_indices = self._get_valid_indices_by_tuple(
    929             df.index, row_tuple, len(df.index)

/raid/nicholasb/miniconda3/envs/rapids-21.08/lib/python3.8/site-packages/cudf/core/multiindex.py in _validate_indexer(self, indexer)
    957             self._validate_indexer(indexer.stop)
    958         else:
--> 959             for i in indexer:
    960                 self._validate_indexer(i)
    961 

TypeError: 'Timestamp' object is not iterable

It looks like we go down a codepath that expects an iterable, which explains why wrapping with a tuple works (and may resolve your problem in the short term):

print(df_gpu.loc[(stamp,)]) # SUCCEEDS
          value
label          
A      1.184793
B     -0.253166
C     -0.790236
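
One reading of the traceback, consistent with the tuple workaround, is that the validator recurses into slices, only length-checks tuples, and treats every other key as list-like. Below is a minimal pure-Python sketch of that kind of logic. It is a hypothetical simplification, not cuDF's actual code: the function `validate_indexer`, the helper `outcome`, and the `nlevels` default are all assumptions, and `datetime.datetime` stands in for `pd.Timestamp`.

```python
import numbers
from datetime import datetime

N_LEVELS = 2  # assumed: a two-level index (timestamp, label)


def validate_indexer(indexer, nlevels=N_LEVELS):
    """Hypothetical, simplified validator; NOT cuDF's actual code."""
    if isinstance(indexer, numbers.Number):
        return  # plain numbers are accepted as scalar keys
    if isinstance(indexer, tuple):
        # Tuples are only checked for length, never walked element by element.
        if len(indexer) > nlevels:
            raise IndexError("Indexer size exceeds number of levels")
    elif isinstance(indexer, slice):
        # Each endpoint of a slice is validated as if it were a key itself.
        validate_indexer(indexer.start, nlevels)
        validate_indexer(indexer.stop, nlevels)
    else:
        # Everything else is assumed to be list-like and is iterated.
        for item in indexer:
            validate_indexer(item, nlevels)


def outcome(key):
    """Return 'ok' or the error raised while validating `key`."""
    try:
        validate_indexer(key)
    except (TypeError, IndexError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


stamp = datetime(2021, 3, 12, 2, 0)
later = datetime(2021, 3, 12, 3, 0)

bare_result = outcome(stamp)                 # scalar falls into the list-like branch
tuple_result = outcome((stamp,))             # tuple branch: length check only
slice_result = outcome(slice(stamp, later))  # recurses into the scalar endpoints
print(bare_result, tuple_result, slice_result, sep="\n")
```

In this model the bare timestamp fails because it is iterated, the one-element tuple passes because tuples are never iterated, and a slice with timestamp endpoints fails the same way as the bare timestamp.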

pbruneau commented 3 years ago

Hi @beckernick, thanks for the answer!

The "tuple trick" above does the job for accessing a single value. However, I run into trouble again if I want to fetch values for a timestamp range.

Building on my previous gist example, if I type:

start = pd.Timestamp(datetime.strptime('2021-03-12 01:00+0000',  '%Y-%m-%d %H:%M%z'))
end = pd.Timestamp(datetime.strptime('2021-03-12 02:00+0000',  '%Y-%m-%d %H:%M%z'))
print(df.loc[start:end])

I get the expected result:

                                    value
timestamp                 label          
2021-03-12 01:00:00+00:00 A     -0.466112
                          B     -0.781473
                          C     -1.010174
2021-03-12 02:00:00+00:00 A      0.160179
                          B      1.007183
                          C     -1.053772

With cuDF, the following raises the usual TypeError: 'Timestamp' object is not iterable:

print(df_gpu.loc[start:end])

Alternatively, trying:

print(df_gpu.loc[(start:end,)])

raises a SyntaxError: invalid syntax. Using an explicit Timestamp range with:

start = pd.Timestamp(datetime.strptime('2021-03-12 01:00+0000',  '%Y-%m-%d %H:%M%z'))
end = pd.Timestamp(datetime.strptime('2021-03-12 02:00+0000',  '%Y-%m-%d %H:%M%z'))
timestamps = pd.date_range(start, end, freq='1H')
print(df_gpu.loc[(timestamps,)])

I get ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). Any idea for circumventing this issue?
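
The SyntaxError is a Python-level restriction rather than a cuDF one: the `start:end` shorthand is only valid directly inside square brackets (optionally separated by commas), not inside parentheses. A slice can still be placed in a tuple by constructing it explicitly with `slice(start, end)`. The sketch below uses a hypothetical `KeyEcho` class to show which key object each spelling produces. It says nothing about whether cuDF accepts those keys.

```python
class KeyEcho:
    """Hypothetical helper: returns whatever key it is indexed with."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()
start, end = "2021-03-12 01:00", "2021-03-12 02:00"

# Colon shorthand directly inside the brackets becomes a slice object.
plain = echo[start:end]

# Inside parentheses the colon is illegal, so build the slice explicitly.
in_tuple = echo[(slice(start, end),)]

# Colons are allowed between top-level commas; the key is a tuple of slices.
two_axis = echo[start:end, :]

print(plain)
print(in_tuple)
print(two_axis)
```

Given the slice failure reported above, writing the key as `(slice(start, end),)` should only be expected to remove the SyntaxError, not the underlying limitation.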

pbruneau commented 3 years ago

Hi,

I'm following up on the bug reported above. As noted in my last answer, using a tuple to access a Timestamp first level of a MultiIndex works around the issue pointed out initially, but the proposed solution fails if one wants to access a Timestamp range.

I realize that the title does not accurately reflect the remaining bug: should I create a new issue that singles out the Timestamp range bug, or rename this one?

beckernick commented 2 years ago

We've been refactoring our MultiIndex implementation to help make it more efficient and maintainable. Is the pandas snippet in the comment above a minimal example of the desired behavior @pbruneau ?

pbruneau commented 2 years ago

Hi @beckernick,

Here is an updated minimal gist which lists in detail what works, what does not work, and the workarounds (as of version 21.08.02, installed on my side).

In a nutshell (please refer to the gist for details): with a MultiIndex and timestamps as primary key, pandas allows this kind of operation:

df.loc[stamp]
df.loc[timestamps]

where stamp and timestamps are a valid timestamp and a valid timestamp range, respectively. I would like to do the same with cuDF, but as of v21.08.02 it is not possible.
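
One possible interim approach, not confirmed anywhere in this thread, is to select rows with a boolean mask built from the level values instead of passing the timestamps to `.loc[]`. The sketch below uses pandas only, with made-up data and timezone-naive timestamps. Whether the same calls (`get_level_values`, `isin`, boolean masking) behave identically on a given cuDF release is an assumption that has not been verified here.

```python
import numpy as np
import pandas as pd

# Illustrative data: four hourly timestamps crossed with three labels.
timestamps = pd.to_datetime([f"2021-03-12 0{h}:00" for h in range(4)])
labels = ["A", "B", "C"]
index = pd.MultiIndex.from_product(
    [timestamps, labels], names=["timestamp", "label"]
)
df = pd.DataFrame({"value": np.arange(12.0)}, index=index)

wanted = timestamps[1:3]  # 01:00 and 02:00

# Pull out the first level once, then build boolean masks from it.
level = df.index.get_level_values("timestamp")

# Equivalent in intent to df.loc[timestamps]: membership in a set of stamps.
by_membership = df[level.isin(wanted)]

# Equivalent in intent to df.loc[start:end]: an inclusive range test.
by_range = df[(level >= wanted[0]) & (level <= wanted[-1])]

print(by_membership)
```

Both selections keep the full two-level index, as the pandas `.loc[]` results shown earlier in the thread do.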

github-actions[bot] commented 2 years ago

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

pbruneau commented 2 years ago

The problems reported above and highlighted in this minimal gist still occur with v21.12, with exactly the same error messages.

wence- commented 1 year ago

An update here (I am working through many indexing corner cases), apologies for the very slow responses.

In 23.06 (the current development version) there is an error constructing time-zone-aware timestamps (previously they were accepted but handled incorrectly; now they are not accepted; soon they will be accepted and handled correctly). However, if I remove the timezone portion of the timestamps, then only your last example now fails (I reproduce it here for posterity):

import pandas as pd
from datetime import datetime
import cudf
import numpy as np

start = pd.Timestamp(datetime.strptime('2021-03-12 00:00',  '%Y-%m-%d %H:%M'))
end = pd.Timestamp(datetime.strptime('2021-03-12 03:00',  '%Y-%m-%d %H:%M'))
timestamps = pd.date_range(start, end, freq='1H')
labels = ['A', 'B', 'C']
index = pd.MultiIndex.from_product([timestamps, labels], names=["timestamp", "label"])
value = np.random.normal(size=12)
df = pd.DataFrame(value, index=index, columns=['value'])

df_gpu = cudf.from_pandas(df)

start = pd.Timestamp(datetime.strptime('2021-03-12 01:00',  '%Y-%m-%d %H:%M'))
end = pd.Timestamp(datetime.strptime('2021-03-12 02:00',  '%Y-%m-%d %H:%M'))
timestamps = pd.date_range(start, end, freq='1H')

# SUCCEEDS
print(df.loc[timestamps])

# FAILS
print(df_gpu.loc[timestamps])

# indexing with a slice range also fails in this case.
df_gpu.loc[start:end] # Fails

pbruneau commented 1 year ago

Hi @wence-,

I'm installing via Docker, so I can't check this myself (23.06 does not seem to be available that way).

If I get it right, this means that:

print(df_gpu.loc[timestamps])

works fine? I would already have a workaround, then!

wence- commented 1 year ago

If your dataframe has a multiindex, that example does not yet work. If you just have a normal index, it does work.

pbruneau commented 1 year ago

OK! Good luck with the development then (even if luck has nothing to do with it :)