rapidsai / dask-cuda

Utilities for Dask and CUDA interactions
https://docs.rapids.ai/api/dask-cuda/stable/
Apache License 2.0

[BUG] ImportError: cannot import name 'parse_memory_limit' from 'distributed.worker' #1078

Closed: randerzander closed this issue 1 year ago

randerzander commented 1 year ago

After creating a new rapids 23.02 environment, I can't import dask_cuda:

mamba create -y --name pynds -c conda-forge -c rapidsai-nightly -c nvidia python=3.9 cudatoolkit=11.8 cudf=23.02 dask-cudf dask-cuda 'ucx-proc=*=gpu' ucx-py 'rust>=1.59.0' 'setuptools-rust>=1.4.1' dask/label/dev::dask-sql requests

(pynds) rgelhausen@ipp1-3303:~/projects/t$ conda list | grep dask
dask                      2022.12.1          pyhd8ed1ab_0    conda-forge
dask-core                 2022.12.1          pyhd8ed1ab_0    conda-forge
dask-cuda                 0.19.0             pyhd8ed1ab_0    conda-forge
dask-cudf                 23.02.00a230110 cuda_11_py39_g66b846a01d_241    rapidsai-nightly
dask-sql                  2022.12.1a230110 py39_g07db18f_26    dask/label/dev
(pynds) rgelhausen@ipp1-3303:~/projects/$ python
Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 15:55:03) 
[GCC 10.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dask_cuda
/raid/rgelhausen/conda/envs/pynds/lib/python3.9/site-packages/dask_cuda/cuda_worker.py:18: FutureWarning: parse_bytes is deprecated and will be removed in a future release. Please use dask.utils.parse_bytes instead.
  from distributed.utils import parse_bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/raid/rgelhausen/conda/envs/pynds/lib/python3.9/site-packages/dask_cuda/__init__.py", line 5, in <module>
    from .cuda_worker import CUDAWorker
  File "/raid/rgelhausen/conda/envs/pynds/lib/python3.9/site-packages/dask_cuda/cuda_worker.py", line 19, in <module>
    from distributed.worker import parse_memory_limit
ImportError: cannot import name 'parse_memory_limit' from 'distributed.worker' (/raid/rgelhausen/conda/envs/pynds/lib/python3.9/site-packages/distributed/worker.py)
>>>
charlesbluca commented 1 year ago

I think the issue here is that you're pulling in a pretty old version of dask-cuda from conda-forge:

dask-cuda                 0.19.0             pyhd8ed1ab_0    conda-forge

I think this is a product of the older conda-forge dask-cuda packages not carrying the same Dask pinnings as their rapidsai counterparts.

I think in your case, either switching out the channel priority or pinning dask-cuda should be sufficient to resolve this error, though I wonder if this indicates that we should submit a repodata patch for the older dask-cuda packages (cc @jakirkham @pentschev)
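
For example, a minimal sketch of the pinning approach (the package list is trimmed from the original command, and it assumes the 23.02 nightly build is the one wanted):

# Pin dask-cuda explicitly so the solver cannot fall back to the old
# conda-forge 0.19 build; listing rapidsai-nightly first also covers the
# channel-priority side of the suggestion.
mamba create -y --name pynds -c rapidsai-nightly -c conda-forge -c nvidia \
    python=3.9 cudatoolkit=11.8 cudf=23.02 dask-cudf \
    rapidsai-nightly::dask-cuda=23.02 'ucx-proc=*=gpu' ucx-py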

EDIT:

Yeah, it looks like that's the issue; the older conda-forge dask-cuda packages have different Dask pinnings from their rapidsai equivalents:

→ mamba create -n dask-cuda-1078 python=3.9 dask=2022.12.1 rapidsai::dask-cuda=0.19

Looking for: ['python=3.9', 'dask=2022.12.1', 'rapidsai::dask-cuda=0.19']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
rapidsai/linux-64                                             No change
rapidsai/noarch                                               No change
Encountered problems while solving:
  - package dask-cuda-0.19.0-py38_0 requires dask >=2.22.0,<=2021.4.0, but none of the providers can be installed

It also looks like they should be restricting the Python version, as IIUC dask-cuda 0.19 shouldn't support Python 3.9.
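
The published metadata can also be inspected directly; one possible command (assuming conda's channel::package spec syntax and the --info flag, with 0.19.0 as the build in question):

# Show the dependency constraints recorded for the conda-forge build; per the
# discussion above it is expected to carry a looser dask pin than the rapidsai
# build and no upper bound on the Python version.
conda search 'conda-forge::dask-cuda=0.19.0' --info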

pentschev commented 1 year ago

As per Charles' https://github.com/rapidsai/dask-cuda/issues/1078#issuecomment-1377727114, we believe the old Dask-CUDA packages in conda-forge are buggy in the sense that they do not pin to any Dask/Distributed versions. IMO, the best course of action is to delete the older packages and keep only the latest one (22.12); given that the most recent release before 22.12 was 22.04, those older versions were probably not being used anyway.

@jakirkham you mentioned offline that conda-forge's policy is to do a repodata patch, where's the right place to discuss that? Would it be best in https://github.com/conda-forge/dask-cuda-feedstock or elsewhere?

jakirkham commented 1 year ago

@charlesbluca and I chatted offline about how to handle the repodata patch. As part of this we diffed the repodata for dask-cuda in both channels to find where any constraints were missing. Charles took the results from this effort and submitted PR ( https://github.com/conda-forge/conda-forge-repodata-patches-feedstock/pull/384 ). @pentschev, if you have time to review, would appreciate having your feedback in that PR :)
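
Roughly, such a comparison can be reproduced with standard conda tooling (an illustrative sketch, not necessarily the workflow behind the PR):

# Dump each channel's published metadata for the same dask-cuda release and
# diff them; differences in the dependency sections are the constraints a
# repodata patch would need to add to the conda-forge records.
conda search 'conda-forge::dask-cuda=0.19.0' --info > dask-cuda-conda-forge.txt
conda search 'rapidsai::dask-cuda=0.19.0' --info > dask-cuda-rapidsai.txt
diff dask-cuda-conda-forge.txt dask-cuda-rapidsai.txt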

jakirkham commented 1 year ago

The repodata patch is now in. Would be good to check if that fixed the issue for you, @randerzander
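
One way to check (a sketch reusing the environment name from the report, with the package list trimmed):

# Re-create the environment with no explicit dask-cuda pin and confirm that a
# 23.02 build is now selected and that the import succeeds.
mamba create -y --name pynds -c conda-forge -c rapidsai-nightly -c nvidia \
    python=3.9 cudatoolkit=11.8 cudf=23.02 dask-cudf dask-cuda
conda list -n pynds dask-cuda
conda run -n pynds python -c "import dask_cuda; print(dask_cuda.__version__)"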

pentschev commented 1 year ago

I tried running the command from the issue description and I see the following package being picked:

  + dask-cuda                           23.02.00a230116  py39_g1149257_32              rapidsai-nightly/linux-64      182kB

With that, I believe this is indeed fixed now. Closing, but please reopen if any side effects still exist.