pydata / parallel-tutorial

Parallel computing in Python tutorial materials

Error when installing prep.py - ImportError: No module named '_pandasujson' #20

Closed by ahmadia 7 years ago

ahmadia commented 7 years ago

Traceback (most recent call last):
  File "prep.py", line 64, in <module>
    dask.compute(values)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/dask/base.py", line 204, in compute
    results = get(dsk, keys, **kwargs)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/dask/multiprocessing.py", line 177, in get
    raise_exception=reraise, **kwargs)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/dask/local.py", line 521, in get_async
    raise_exception(exc, tb)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/dask/compatibility.py", line 59, in reraise
    raise exc.with_traceback(tb)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/dask/local.py", line 289, in execute_task
    task, data = loads(task_info)
  File "/Users/aron/anaconda3/envs/parallel/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 840, in subimport
    __import__(name)
ImportError: No module named '_pandasujson'

ahmadia commented 7 years ago

The problem appears to affect only the latest package versions on conda-forge. Removing the conda-forge channel from the environment worked for me.

ahmadia commented 7 years ago

I can confirm the regression is in cloudpickle versions >= 0.3.0; version 0.2.2, from January 2017, is fine. I'm going to land a commit pinning cloudpickle back to 0.2.2, since dask currently seems to be happy with >= 0.2.1.
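
For anyone who wants to check whether their environment falls in the affected range, here is a minimal sketch. It assumes only what is reported above: 0.2.2 works and anything >= 0.3.0 shows the regression. The helper names `parse_version` and `is_affected` are hypothetical, not part of cloudpickle or dask.

```python
import re

# Assumption taken from this thread: cloudpickle 0.2.2 works,
# and versions >= 0.3.0 trigger the ImportError.
FIRST_AFFECTED = (0, 3, 0)


def parse_version(version):
    """Turn a string such as '0.2.2' into a 3-tuple of ints.

    Only the first three numeric components are kept; missing
    components are padded with zeros ('0.3' -> (0, 3, 0)).
    """
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version):
    """Return True if the version is in the range reported as broken."""
    return parse_version(version) >= FIRST_AFFECTED


if __name__ == "__main__":
    try:
        import cloudpickle
    except ImportError:
        print("cloudpickle is not installed in this environment")
    else:
        installed = cloudpickle.__version__
        status = "affected" if is_affected(installed) else "not affected"
        print("cloudpickle", installed, "is", status)
```

Running this inside the tutorial environment before `prep.py` should show whether the installed cloudpickle is one of the versions reported as broken here. It does not check which channel the package came from.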