sample_data try/except import wrapper fails

michaelaye commented 2 years ago

ALL software version info

hvplot: 0.7.3

Description of expected behavior and the observed behavior

The following import fails, despite the all-catching except in the code?? (Honestly stumped)

from hvplot.sample_data import us_crime, airline_flights

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_3185062/1788543639.py in <module>
----> 1 from hvplot.sample_data import us_crime, airline_flights

~/miniconda3/envs/py39/lib/python3.9/site-packages/hvplot/sample_data.py in <module>
     23 # Add catalogue entries to namespace
     24 for _c in catalogue:
---> 25     globals()[_c] = catalogue[_c]

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/base.py in __getitem__(self, key)
    398             if e.container == 'catalog':
    399                 return e(name=key)
--> 400             return e()
    401         if isinstance(key, str) and '.' in key:
    402             key = key.split('.')

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/entry.py in __call__(self, persist, **kwargs)
     75             raise ValueError('Persist value (%s) not understood' % persist)
     76         persist = persist or self._pmode
---> 77         s = self.get(**kwargs)
     78         if persist != 'never' and isinstance(s, PersistMixin) and s.has_been_persisted:
     79             from ..container.persist import store

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/local.py in get(self, **user_parameters)
    287             return self._default_source
    288 
--> 289         plugin, open_args = self._create_open_args(user_parameters)
    290         data_source = plugin(**open_args)
    291         data_source.catalog_object = self._catalog

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/local.py in _create_open_args(self, user_parameters)
    261 
    262         if len(self._plugin) == 0:
--> 263             raise ValueError('No plugins loaded for this entry: %s\n'
    264                              'A listing of installable plugins can be found '
    265                              'at https://intake.readthedocs.io/en/latest/plugin'

ValueError: No plugins loaded for this entry: parquet
A listing of installable plugins can be found at https://intake.readthedocs.io/en/latest/plugin-directory.html .

For reference, this is the code in 0.7.3:

import os

try:
    from intake import open_catalog
except:
    raise ImportError('Loading hvPlot sample data requires intake '
                      'and intake-parquet. Install it using conda or '
                      'pip before loading data.')

How can intake throw a ValueError??

Complete, minimal, self-contained example code that reproduces the issue

Have only the package intake installed, no other intake-subpackages.
Execute : from hvplot.sample_data import us_crime, airline_flights

from hvplot.sample_data import us_crime, airline_flights

Stack traceback and/or browser JavaScript console output

Same traceback as shown in the description above.

Additional info

The list of required packages is now this:

intake-parquet
intake-xarray
s3fs

michaelaye commented 2 years ago

Ah, doh, it's not the import that fails, but the code after the import:

24 for _c in catalogue:
---> 25     globals()[_c] = catalogue[_c]
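To spell out why the bare except never fires: a try block only guards the statements inside it, and the loop that populates the namespace runs after the block has finished. Below is a minimal sketch of that behaviour. `FakeCatalogue`, `load_unguarded` and `load_guarded` are hypothetical names made up for illustration; they are not part of hvplot or intake, and a stdlib import stands in for `from intake import open_catalog`.

```python
class FakeCatalogue(dict):
    """Hypothetical stand-in for a catalogue whose driver plugin is missing."""

    def __getitem__(self, key):
        # Mimics the error message seen in the traceback of this issue.
        raise ValueError("No plugins loaded for this entry: parquet")


def load_unguarded(catalogue):
    """Same shape as the 0.7.3 code: only the import is inside the try block."""
    namespace = {}
    try:
        import json  # stand-in for `from intake import open_catalog`; succeeds
    except Exception:
        raise ImportError("Loading hvPlot sample data requires intake.")
    # The try block is already over here, so nothing catches the ValueError.
    for name in catalogue:
        namespace[name] = catalogue[name]
    return namespace


def load_guarded(catalogue):
    """One possible variant: turn the lookup failure into an ImportError."""
    namespace = {}
    try:
        for name in catalogue:
            namespace[name] = catalogue[name]
    except ValueError as exc:
        raise ImportError(
            "Loading hvPlot sample data requires intake and intake-parquet."
        ) from exc
    return namespace
```

Calling `load_unguarded(FakeCatalogue(us_crime=None))` lets the ValueError escape, while `load_guarded` reports the same failure as an ImportError with the original error attached as its cause.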

michaelaye commented 2 years ago

I'm having a hard time reproducing this in a notebook because the code relies on __file__. Would it be okay for a PR to use importlib.resources to find the path to the datasets.yaml file?
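For reference, a rough sketch of what such a lookup could look like. `package_file` is a hypothetical helper, not existing hvplot code, and it assumes Python 3.9+ (on 3.8 the `importlib_resources` backport offers the same `files` function). The demonstration uses a standard-library package so that it runs without hvplot installed.

```python
from importlib.resources import files  # Python 3.9+


def package_file(package, filename):
    """Return a path-like object for `filename` shipped inside `package`."""
    return files(package).joinpath(filename)


# Runnable demonstration against a standard-library package.
init_file = package_file("json", "__init__.py")
print(init_file.is_file())

# For hvplot the call would presumably be
#   package_file("hvplot", "datasets.yaml")
# assuming the YAML file sits at the top level of the installed package.
```

Because the lookup goes through the import system rather than `__file__`, the same call works from a notebook cell.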

jbednar commented 2 years ago

Maybe add import intake_parquet within the try section to be sure to raise the exception when that plugin is not installed?

michaelaye commented 2 years ago

yes, but if that trial import is acceptable (wasn't sure about performance), then I'd add intake-xarray and s3fs as well, as those are also required? (Which makes this bug annoying, as one needs to try it 3 times before learning about all 3 missing packages ;)

jbednar commented 2 years ago

I would have thought those would be recursive subdependencies, but if not, then yes, import all those in the try block as well. To make it fail more quickly when it will fail, the first import should be the one most likely to fail (i.e. least likely to be installed in a typical environment), which I'd guess here would be intake_parquet.
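As a variation on the trial-import idea, here is a sketch that checks for all required modules up front and reports every missing one in a single error, so nobody has to retry three times. The helper names are hypothetical, and the import names `intake_xarray` and `s3fs` are assumed from the package names mentioned in this thread. `importlib.util.find_spec` only locates a top-level module without importing it, which should keep the cost low.

```python
import importlib.util

# Import names assumed for the packages named in this thread, with the one
# least likely to be installed listed first.
SAMPLE_DATA_MODULES = ("intake_parquet", "intake_xarray", "s3fs", "intake")


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def require_modules(names=SAMPLE_DATA_MODULES):
    """Raise one ImportError that names every missing module."""
    missing = missing_modules(names)
    if missing:
        raise ImportError(
            "Loading hvPlot sample data requires these missing modules: "
            + ", ".join(missing)
        )
```

Since all names are checked before raising, the order of the tuple no longer affects how quickly the user learns the full list.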

michaelaye commented 2 years ago

They seem to be independent packages:

❯ mamba info intake-xarray=0.5

intake-xarray 0.5.0 pyhd8ed1ab_0
--------------------------------
file name   : intake-xarray-0.5.0-pyhd8ed1ab_0.tar.bz2
name        : intake-xarray
version     : 0.5.0
build string: pyhd8ed1ab_0
build number: 0
channel     : https://conda.anaconda.org/conda-forge/noarch
size        : 1.4 MB
arch        : None
constrains  : ()
license     : BSD-2-Clause
license_family: BSD
md5         : 43d9d1c90da0b2b28cc16e58a52a0f2b
noarch      : python
package_type: noarch_python
platform    : None
sha256      : 91a388e5eb015b192bc17de04c55b102576d1c1b08571a80a1a9a1bc6c878f91
subdir      : noarch
timestamp   : 1616085245631
url         : https://conda.anaconda.org/conda-forge/noarch/intake-xarray-0.5.0-pyhd8ed1ab_0.tar.bz2
dependencies:
    dask >=2.2
    intake >=0.5.2
    netcdf4
    python >=3.5
    xarray >=0.12.0
    zarr
WARNING: 'conda info package_name' is deprecated.
          Use 'conda search package_name --info'.

site-packages/hvplot/examples via 🐍 v3.9.9 via 🅒 py39 took 5s 
❯ mamba search intake-parquet
Loading channels: done
# Name                       Version           Build  Channel             
intake-parquet                 0.2.1            py_0  conda-forge         
intake-parquet                 0.2.2            py_0  conda-forge         
intake-parquet                 0.2.3            py_0  conda-forge         

site-packages/hvplot/examples via 🐍 v3.9.9 via 🅒 py39 took 5s 
❯ mamba info intake-parquet=0.2.3

intake-parquet 0.2.3 py_0
-------------------------
file name   : intake-parquet-0.2.3-py_0.tar.bz2
name        : intake-parquet
version     : 0.2.3
build string: py_0
build number: 0
channel     : https://conda.anaconda.org/conda-forge/noarch
size        : 10 KB
arch        : None
constrains  : ()
license     : BSD-2-Clause
license_family: BSD
md5         : b7d04be2fb7b43946cf06dc5f7f04ad1
noarch      : python
package_type: noarch_python
platform    : None
sha256      : 2981d0998aa3e30713c6b2012a4557e77b70ed6e04778f9365c4fdeb593576ca
subdir      : noarch
timestamp   : 1573509119874
url         : https://conda.anaconda.org/conda-forge/noarch/intake-parquet-0.2.3-py_0.tar.bz2
dependencies:
    dask
    fastparquet
    intake >=0.3
    jinja2
    pandas
    pyarrow
    python >=3.5
WARNING: 'conda info package_name' is deprecated.
          Use 'conda search package_name --info'.

and s3fs is obviously unrelated. Will play with it and then submit a PR.

hoxbro commented 2 years ago

Saw the same thing in #562.

I still feel like it is a lot to install just to run the second page in a user guide. Why not just download the data with requests or urllib like e.g. bokeh does?
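To make the suggestion concrete, a download-and-cache helper using only the standard library could look roughly like the sketch below. `fetch` is a hypothetical name; this is not how hvplot or bokeh actually implement it.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url, dest):
    """Download `url` to `dest` unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Stream the response to disk instead of holding it all in memory.
        with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return dest
```

A second call with the same `dest` returns immediately, which is the only caching this sketch does; it has no checksum or retry handling.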

michaelaye commented 2 years ago

well, b/c in the old way, basically everybody is writing a mini-download manager, as one can see from your link. I think relying on intake for data-management is a good thing that should be pushed further. However, I agree this needs to be carefully balanced with tutorial hurdles, which should always be minimized, which is why I reported this as a bug. One should never have a user guide step fail 3 times. Possibly one should simply add the above 4 packages to a user guide prep section?

hoxbro commented 2 years ago

I just think it is a lot to ask for new users to download 4 packages just to get access to an 8 KB file (us_crime) and a 15 MB file (airline_flights). I just tried to see if I could run the Plotting page from a clean environment:

Created the environment with mamba create -n hvplot_example python=3.8 hvplot jupyterlab

First cell needed to install dask.

Second cell needed to install intake intake-parquet intake-xarray s3fs.

Third cell needed to install IProgress; afterwards it raises a FileNotFoundError? Then I tried changing the cell to:

import dask.dataframe as dd
flights = dd.read_parquet("s3://assets.holoviews.org/data/airline_flights.parq").persist()
print(type(flights))
flights.head()

But this gives a NoCredentialsError: Unable to locate credentials. Got this to work by changing s3 to http.
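The two ways around the credentials error can be sketched as small helpers. Both function names are hypothetical. The scheme swap only makes sense when the bucket name also resolves as a host name, which appears to be the case here since the change worked. `anon=True` is the s3fs option for unsigned requests; whether this bucket permits anonymous reads is an assumption that was not tested in this thread.

```python
def s3_to_http(url):
    """Rewrite s3://bucket/key as http://bucket/key (the substitution described above)."""
    prefix = "s3://"
    if url.startswith(prefix):
        return "http://" + url[len(prefix):]
    return url


def anonymous_s3_options():
    """storage_options asking s3fs to send unsigned (anonymous) requests."""
    return {"anon": True}


url = "s3://assets.holoviews.org/data/airline_flights.parq"
print(s3_to_http(url))

# With dask and s3fs installed, either of these could then be tried:
#   dd.read_parquet(s3_to_http(url))
#   dd.read_parquet(url, storage_options=anonymous_s3_options())
```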

To run the bivariate plot I needed to install scipy.

For the section Large Data to run I needed datashader.

Other things I noticed when trying to get the notebook to work:

1. The links in the hvplot namespace do not work; they should be lowercase.
2. I don't think you explicitly need to run .compute on a dask dataframe anymore to use hvplot (but I could be wrong).
3. The import error for datashader references the name datashading instead of datashader.

I will probably make a PR for 1 and 3 today.

hoxbro commented 2 years ago

@michaelaye could you get this to work?

flights = airline_flights.to_dask().persist()
print(type(flights))
flights.head()

michaelaye commented 2 years ago

yes. which step fails for you? Did you install the missing libraries?

intake-parquet
intake-xarray
s3fs

hoxbro commented 2 years ago

Installed all of them. It just fails with the following message:

Error log

```python
OSError                                   Traceback (most recent call last)
Input In [3], in <module>
----> 1 flights = airline_flights.to_dask().persist()
      2 print(type(flights))
      3 flights.head()

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:99, in ParquetSource.to_dask(self)
     98 def to_dask(self):
---> 99     self._load_metadata()
    100     return self._df

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake/source/base.py:236, in DataSourceBase._load_metadata(self)
    234 """load metadata only if needed"""
    235 if self._schema is None:
--> 236     self._schema = self._get_schema()
    237     self.dtype = self._schema.dtype
    238     self.shape = self._schema.shape

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:60, in ParquetSource._get_schema(self)
     58 def _get_schema(self):
     59     if self._df is None:
---> 60         self._df = self._to_dask()
     61     dtypes = {k: str(v) for k, v in self._df._meta.dtypes.items()}
     62     self._schema = base.Schema(datashape=None,
     63                                dtype=dtypes,
     64                                shape=(None, len(self._df.columns)),
     65                                npartitions=self._df.npartitions,
     66                                extra_metadata={})

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:108, in ParquetSource._to_dask(self)
    106 import dask.dataframe as dd
    107 urlpath = self._get_cache(self._urlpath)[0]
--> 108 self._df = dd.read_parquet(urlpath,
    109                            storage_options=self._storage_options, **self._kwargs)
    110 self._load_metadata()
    111 return self._df

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py:400, in read_parquet(path, columns, filters, categories, index, storage_options, engine, gather_statistics, ignore_metadata_file, metadata_task_size, split_row_groups, chunksize, aggregate_files, **kwargs)
    397     raise ValueError("read_parquet options require gather_statistics=True")
    398 gather_statistics = True
--> 400 read_metadata_result = engine.read_metadata(
    401     fs,
    402     paths,
    403     categories=categories,
    404     index=index,
    405     gather_statistics=gather_statistics,
    406     filters=filters,
    407     split_row_groups=split_row_groups,
    408     chunksize=chunksize,
    409     aggregate_files=aggregate_files,
    410     ignore_metadata_file=ignore_metadata_file,
    411     metadata_task_size=metadata_task_size,
    412     **kwargs,
    413 )
    415 # In the future, we may want to give the engine the
    416 # option to return a dedicated element for `common_kwargs`.
    417 # However, to avoid breaking the API, we just embed this
    418 # data in the first element of `parts` for now.
    419 # The logic below is inteded to handle backward and forward
    420 # compatibility with a user-defined engine.
    421 meta, statistics, parts, index = read_metadata_result[:4]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/fastparquet.py:862, in FastParquetEngine.read_metadata(cls, fs, paths, categories, index, gather_statistics, filters, split_row_groups, chunksize, aggregate_files, ignore_metadata_file, metadata_task_size, **kwargs)
    844 @classmethod
    845 def read_metadata(
    846     cls,
   (...)
    860 
    861     # Stage 1: Collect general dataset information
--> 862     dataset_info = cls._collect_dataset_info(
    863         paths,
    864         fs,
    865         categories,
    866         index,
    867         gather_statistics,
    868         filters,
    869         split_row_groups,
    870         chunksize,
    871         aggregate_files,
    872         ignore_metadata_file,
    873         metadata_task_size,
    874         kwargs,
    875     )
    877     # Stage 2: Generate output `meta`
    878     meta = cls._create_dd_meta(dataset_info)

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/fastparquet.py:473, in FastParquetEngine._collect_dataset_info(cls, paths, fs, categories, index, gather_statistics, filters, split_row_groups, chunksize, aggregate_files, ignore_metadata_file, metadata_task_size, kwargs)
    469 else:
    470     # Rely on metadata for 0th file.
    471     # Will need to pass a list of paths to read_partition
    472     scheme = get_file_scheme(fns)
--> 473     pf = ParquetFile(
    474         paths[:1], open_with=fs.open, root=base, **dataset_kwargs
    475     )
    476     pf.file_scheme = scheme
    477     pf.cats = paths_to_cats(fns, scheme)

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/api.py:113, in ParquetFile.__init__(self, fn, verify, open_with, root, sep, fs, pandas_nulls)
    111 fs = getattr(open_with, "__self__", None)
    112 if isinstance(fn, (tuple, list)):
--> 113     basepath, fmd = metadata_from_many(fn, verify_schema=verify,
    114                                        open_with=open_with, root=root,
    115                                        fs=fs)
    116     self.fn = join_path(basepath, '_metadata') if basepath \
    117         else '_metadata'
    118     self.fmd = fmd

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/util.py:179, in metadata_from_many(file_list, verify_schema, open_with, root, fs)
    176 elif all(not isinstance(pf, api.ParquetFile) for pf in file_list):
    178     if verify_schema or fs is None or len(file_list) < 3:
--> 179         pfs = [api.ParquetFile(fn, open_with=open_with) for fn in file_list]
    180     else:
    181         # activate new code path here
    182         f0 = file_list[0]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/util.py:179, in <listcomp>(.0)
    176 elif all(not isinstance(pf, api.ParquetFile) for pf in file_list):
    178     if verify_schema or fs is None or len(file_list) < 3:
--> 179         pfs = [api.ParquetFile(fn, open_with=open_with) for fn in file_list]
    180     else:
    181         # activate new code path here
    182         f0 = file_list[0]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/api.py:137, in ParquetFile.__init__(self, fn, verify, open_with, root, sep, fs, pandas_nulls)
    135     self.fn = join_path(fn)
    136     with open_with(fn, 'rb') as f:
--> 137         self._parse_header(f, verify)
    138 elif "*" in fn or fs.isdir(fn):
    139     fn2 = join_path(fn, '_metadata')

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/api.py:179, in ParquetFile._parse_header(self, f, verify)
    177 if verify:
    178     assert f.read(4) == b'PAR1'
--> 179 f.seek(-8, 2)
    180 head_size = struct.unpack('
```

My environment

```log
# packages in environment at /home/shh/miniconda3/envs/tmp:
#
# Name                         Version        Build                    Channel
_libgcc_mutex                  0.1            conda_forge              conda-forge
_openmp_mutex                  4.5            1_gnu                    conda-forge
aiobotocore                    2.1.0          pyhd8ed1ab_0             conda-forge
aiohttp                        3.8.1          py38h497a2fe_0           conda-forge
aioitertools                   0.9.0          pyhd8ed1ab_0             conda-forge
aiosignal                      1.2.0          pyhd8ed1ab_0             conda-forge
anyio                          3.5.0          py38h578d9bd_0           conda-forge
appdirs                        1.4.4          pyh9f0ad1d_0             conda-forge
argon2-cffi                    21.3.0         pyhd8ed1ab_0             conda-forge
argon2-cffi-bindings           21.2.0         py38h497a2fe_1           conda-forge
arrow-cpp                      2.0.0          py38h496fee2_15_cpu      conda-forge
asciitree                      0.3.3          py_2                     conda-forge
asttokens                      2.0.5          pyhd8ed1ab_0             conda-forge
async-timeout                  4.0.2          pyhd8ed1ab_0             conda-forge
attrs                          21.4.0         pyhd8ed1ab_0             conda-forge
aws-c-common                   0.4.59         h36c2ea0_1               conda-forge
aws-c-event-stream             0.1.6          had2084c_6               conda-forge
aws-checksums                  0.1.10         h4e93380_0               conda-forge
aws-sdk-cpp                    1.8.70         h57dc084_1               conda-forge
babel                          2.9.1          pyh44b312d_0             conda-forge
backcall                       0.2.0          pyh9f0ad1d_0             conda-forge
backports                      1.0            py_2                     conda-forge
backports.functools_lru_cache  1.6.4          pyhd8ed1ab_0             conda-forge
black                          22.1.0         pyhd8ed1ab_0             conda-forge
bleach                         4.1.0          pyhd8ed1ab_0             conda-forge
bokeh                          2.4.2          py38h578d9bd_0           conda-forge
botocore                       1.23.24        pyhd8ed1ab_0             conda-forge
brotli                         1.0.9          h7f98852_6               conda-forge
brotli-bin                     1.0.9          h7f98852_6               conda-forge
brotlipy                       0.7.0          py38h497a2fe_1003        conda-forge
bzip2                          1.0.8          h7f98852_4               conda-forge
c-ares                         1.18.1         h7f98852_0               conda-forge
ca-certificates                2021.10.8      ha878542_0               conda-forge
certifi                        2021.10.8      py38h578d9bd_1           conda-forge
cffi                           1.15.0         py38h3931269_0           conda-forge
cftime                         1.5.2          py38h6c62de6_0           conda-forge
charset-normalizer             2.0.12         pyhd8ed1ab_0             conda-forge
click                          8.0.4          py38h578d9bd_0           conda-forge
cloudpickle                    2.0.0          pyhd8ed1ab_0             conda-forge
colorama                       0.4.4          pyh9f0ad1d_0             conda-forge
colorcet                       3.0.0          py_0                     pyviz
cramjam                        2.5.0          py38ha8db356_0           conda-forge
cryptography                   36.0.0         py38h9ce1e76_0
curl                           7.81.0         h494985f_0               conda-forge
cycler                         0.11.0         pyhd8ed1ab_0             conda-forge
cytoolz                        0.11.2         py38h497a2fe_1           conda-forge
dask                           2022.2.0       pyhd8ed1ab_0             conda-forge
dask-core                      2022.2.0       pyhd8ed1ab_0             conda-forge
dataclasses                    0.8            pyhc8e2a94_3             conda-forge
datashader                     0.13.0         py_0                     pyviz
datashape                      0.5.4          py_1                     conda-forge
debugpy                        1.5.1          py38h709712a_0           conda-forge
decorator                      5.1.1          pyhd8ed1ab_0             conda-forge
defusedxml                     0.7.1          pyhd8ed1ab_0             conda-forge
distributed                    2022.2.0       py38h578d9bd_0           conda-forge
entrypoints                    0.4            pyhd8ed1ab_0             conda-forge
executing                      0.8.2          pyhd8ed1ab_0             conda-forge
fasteners                      0.17.3         pyhd8ed1ab_0             conda-forge
fastparquet                    0.8.0          py38h6c62de6_1           conda-forge
flit-core                      3.6.0          pyhd8ed1ab_0             conda-forge
freetype                       2.10.4         h0708190_1               conda-forge
frozenlist                     1.3.0          py38h497a2fe_0           conda-forge
fsspec                         2022.1.0       pyhd8ed1ab_0             conda-forge
gflags                         2.2.2          he1b5a44_1004            conda-forge
glog                           0.4.0          h49b9bf7_3               conda-forge
grpc-cpp                       1.34.1         h2157cd5_4
hdf4                           4.2.15         h10796ff_3               conda-forge
hdf5                           1.12.1         nompi_h7f166f4_103       conda-forge
heapdict                       1.0.1          py_0                     conda-forge
holoviews                      1.14.8         py_0                     pyviz
hvplot                         0.7.3          py_0                     pyviz
idna                           3.3            pyhd8ed1ab_0             conda-forge
importlib-metadata             4.11.1         py38h578d9bd_0           conda-forge
importlib_metadata             4.11.1         hd8ed1ab_0               conda-forge
importlib_resources            5.4.0          pyhd8ed1ab_0             conda-forge
intake                         0.6.5          pyhd8ed1ab_0             conda-forge
intake-parquet                 0.2.3          py_0                     conda-forge
intake-xarray                  0.6.0          pyhd8ed1ab_0             conda-forge
iprogress                      0.4            py_0                     conda-forge
ipykernel                      6.9.1          py38he5a9106_0           conda-forge
ipython                        8.0.1          py38h578d9bd_2           conda-forge
ipython_genutils               0.2.0          py_1                     conda-forge
jedi                           0.18.1         py38h578d9bd_0           conda-forge
jinja2                         3.0.3          pyhd8ed1ab_0             conda-forge
jmespath                       0.10.0         pyh9f0ad1d_0             conda-forge
jpeg                           9e             h7f98852_0               conda-forge
json5                          0.9.5          pyh9f0ad1d_0             conda-forge
jsonschema                     4.4.0          pyhd8ed1ab_0             conda-forge
jupyter_client                 7.1.2          pyhd8ed1ab_0             conda-forge
jupyter_core                   4.9.2          py38h578d9bd_0           conda-forge
jupyter_server                 1.13.5         pyhd8ed1ab_1             conda-forge
jupyterlab                     3.2.9          pyhd8ed1ab_0             conda-forge
jupyterlab_pygments            0.1.2          pyh9f0ad1d_0             conda-forge
jupyterlab_server              2.10.3         pyhd8ed1ab_0             conda-forge
kiwisolver                     1.3.2          py38h1fd1430_1           conda-forge
krb5                           1.19.2         h48eae69_3               conda-forge
lcms2                          2.12           hddcbb42_0               conda-forge
ld_impl_linux-64               2.36.1         hea4e1c9_2               conda-forge
libblas                        3.9.0          13_linux64_openblas      conda-forge
libbrotlicommon                1.0.9          h7f98852_6               conda-forge
libbrotlidec                   1.0.9          h7f98852_6               conda-forge
libbrotlienc                   1.0.9          h7f98852_6               conda-forge
libcblas                       3.9.0          13_linux64_openblas      conda-forge
libcurl                        7.81.0         h494985f_0               conda-forge
libedit                        3.1.20191231   he28a2e2_2               conda-forge
libev                          4.33           h516909a_1               conda-forge
libevent                       2.1.10         h28343ad_4               conda-forge
libffi                         3.4.2          h7f98852_5               conda-forge
libgcc-ng                      11.2.0         h1d223b6_12              conda-forge
libgfortran-ng                 11.2.0         h69a702a_12              conda-forge
libgfortran5                   11.2.0         h5c6108e_12              conda-forge
libgomp                        11.2.0         h1d223b6_12              conda-forge
liblapack                      3.9.0          13_linux64_openblas      conda-forge
libllvm10                      10.0.1         he513fc3_3               conda-forge
libnetcdf                      4.8.1          nompi_hb3fd0d9_101       conda-forge
libnghttp2                     1.46.0         ha19adfc_0               conda-forge
libnsl                         2.0.0          h7f98852_0               conda-forge
libopenblas                    0.3.18         pthreads_h8fe5266_0      conda-forge
libpng                         1.6.37         h21135ba_2               conda-forge
libprotobuf                    3.14.0         h780b84a_0               conda-forge
libsodium                      1.0.18         h36c2ea0_1               conda-forge
libssh2                        1.10.0         ha35d2d1_2               conda-forge
libstdcxx-ng                   11.2.0         he4da1e4_12              conda-forge
libthrift                      0.13.0         hfb8234f_6
libtiff                        4.2.0          hbd63e13_2               conda-forge
libutf8proc                    2.7.0          h7f98852_0               conda-forge
libwebp-base                   1.2.2          h7f98852_1               conda-forge
libzip                         1.8.0          h1c5bbd1_1               conda-forge
libzlib                        1.2.11         h36c2ea0_1013            conda-forge
llvmlite                       0.36.0         py38h4630a5e_0           conda-forge
locket                         0.2.0          py_2                     conda-forge
lz4-c                          1.9.3          h9c3ff4c_1               conda-forge
markdown                       3.3.6          pyhd8ed1ab_0             conda-forge
markupsafe                     2.1.0          py38h0a891b7_0           conda-forge
matplotlib                     3.3.2          0                        conda-forge
matplotlib-base                3.3.2          py38h5c7f4ab_1           conda-forge
matplotlib-inline              0.1.3          pyhd8ed1ab_0             conda-forge
mistune                        0.8.4          py38h497a2fe_1005        conda-forge
msgpack-python                 1.0.3          py38h1fd1430_0           conda-forge
multidict                      6.0.2          py38h497a2fe_0           conda-forge
multipledispatch               0.6.0          py_0                     conda-forge
mypy_extensions                0.4.3          py38h578d9bd_4           conda-forge
nbclassic                      0.3.5          pyhd8ed1ab_0             conda-forge
nbclient                       0.5.11         pyhd8ed1ab_0             conda-forge
nbconvert                      6.4.2          py38h578d9bd_0           conda-forge
nbformat                       5.1.3          pyhd8ed1ab_0             conda-forge
ncurses                        6.3            h9c3ff4c_0               conda-forge
nest-asyncio                   1.5.4          pyhd8ed1ab_0             conda-forge
netcdf4                        1.5.8          nompi_py38h2823cc8_101   conda-forge
notebook                       6.4.8          pyha770c72_0             conda-forge
numba                          0.53.1         py38h8b71fd7_1           conda-forge
numcodecs                      0.9.1          py38h709712a_2           conda-forge
numpy                          1.22.2         py38h6ae9a64_0           conda-forge
olefile                        0.46           pyh9f0ad1d_1             conda-forge
openjpeg                       2.4.0          hb52868f_1               conda-forge
openssl                        3.0.0          h7f98852_2               conda-forge
orc                            1.6.6          h7950760_1               conda-forge
packaging                      21.3           pyhd8ed1ab_0             conda-forge
pandas                         1.4.1          py38h43a58ef_0           conda-forge
pandoc                         2.17.1.1       ha770c72_0               conda-forge
pandocfilters                  1.5.0          pyhd8ed1ab_0             conda-forge
panel                          0.12.6         py_0                     pyviz
param                          1.12.0         py_0                     pyviz
parquet-cpp                    1.5.1          2                        conda-forge
parso                          0.8.3          pyhd8ed1ab_0             conda-forge
partd                          1.2.0          pyhd8ed1ab_0             conda-forge
pathspec                       0.9.0          pyhd8ed1ab_0             conda-forge
pexpect                        4.8.0          pyh9f0ad1d_2             conda-forge
pickleshare                    0.7.5          py_1003                  conda-forge
pillow                         8.2.0          py38ha0e1e83_1           conda-forge
pip                            22.0.3         pyhd8ed1ab_0             conda-forge
platformdirs                   2.5.1          pyhd8ed1ab_0             conda-forge
prometheus_client              0.13.1         pyhd8ed1ab_0             conda-forge
prompt-toolkit                 3.0.27         pyha770c72_0             conda-forge
psutil                         5.9.0          py38h497a2fe_0           conda-forge
ptyprocess                     0.7.0          pyhd3deb0d_0             conda-forge
pure_eval                      0.2.2          pyhd8ed1ab_0             conda-forge
pyarrow                        2.0.0          py38h842ea0c_15_cpu      conda-forge
pycparser                      2.21           pyhd8ed1ab_0             conda-forge
pyct                           0.4.8          py_0                     pyviz
pyct-core                      0.4.8          py_0                     pyviz
pygments                       2.11.2         pyhd8ed1ab_0             conda-forge
pyopenssl                      22.0.0         pyhd8ed1ab_0             conda-forge
pyparsing                      3.0.7          pyhd8ed1ab_0             conda-forge
pyrsistent                     0.18.1         py38h497a2fe_0           conda-forge
pysocks                        1.7.1          py38h578d9bd_4           conda-forge
python                         3.8.12         h0744224_3_cpython       conda-forge
python-dateutil                2.8.2          pyhd8ed1ab_0             conda-forge
python_abi                     3.8            2_cp38                   conda-forge
pytz                           2021.3         pyhd8ed1ab_0             conda-forge
pyviz_comms                    2.1.0          py_0                     pyviz
pyyaml                         6.0            py38h497a2fe_3           conda-forge
pyzmq                          22.3.0         py38h2035c66_1           conda-forge
re2                            2020.11.01     h58526e2_0               conda-forge
readline                       8.1            h46c0cb4_0               conda-forge
requests                       2.27.1         pyhd8ed1ab_0             conda-forge
s3fs                           2022.1.0       pyhd8ed1ab_0             conda-forge
scipy                          1.8.0          py38h56a6a73_1           conda-forge
send2trash                     1.8.0          pyhd8ed1ab_0             conda-forge
setuptools                     59.8.0         py38h578d9bd_0           conda-forge
six                            1.16.0         pyh6c4a22f_0             conda-forge
snappy                         1.1.8          he1b5a44_3               conda-forge
sniffio                        1.2.0          py38h578d9bd_2           conda-forge
sortedcontainers               2.4.0          pyhd8ed1ab_0             conda-forge
sqlite                         3.37.0         h9cd32fc_0               conda-forge
stack_data                     0.2.0          pyhd8ed1ab_0             conda-forge
tblib                          1.7.0          pyhd8ed1ab_0             conda-forge
terminado                      0.13.1         py38h578d9bd_0           conda-forge
testpath                       0.5.0          pyhd8ed1ab_0             conda-forge
tk                             8.6.12         h27826a3_0               conda-forge
tomli                          2.0.1          pyhd8ed1ab_0             conda-forge
toolz                          0.11.2         pyhd8ed1ab_0             conda-forge
tornado                        6.1            py38h497a2fe_2           conda-forge
tqdm                           4.62.3         pyhd8ed1ab_0             conda-forge
traitlets                      5.1.1          pyhd8ed1ab_0             conda-forge
typed-ast                      1.5.2          py38h497a2fe_0           conda-forge
typing-extensions              4.1.1          hd8ed1ab_0               conda-forge
typing_extensions              4.1.1          pyha770c72_0             conda-forge
urllib3                        1.26.8         pyhd8ed1ab_1             conda-forge
wcwidth                        0.2.5          pyh9f0ad1d_2             conda-forge
webencodings                   0.5.1          py_1                     conda-forge
websocket-client               1.2.3          pyhd8ed1ab_0             conda-forge
wheel                          0.37.1         pyhd8ed1ab_0             conda-forge
wrapt                          1.13.3         py38h497a2fe_1           conda-forge
xarray                         0.21.1         pyhd8ed1ab_0             conda-forge
xz                             5.2.5          h516909a_1               conda-forge
yaml                           0.2.5          h7f98852_2               conda-forge
yarl                           1.7.2          py38h497a2fe_1           conda-forge
zarr                           2.11.0         pyhd8ed1ab_0             conda-forge
zeromq                         4.3.4          h9c3ff4c_1               conda-forge
zict                           2.0.0          py_0                     conda-forge
zipp                           3.7.0          pyhd8ed1ab_1             conda-forge
zlib                           1.2.11         h36c2ea0_1013            conda-forge
zstd                           1.4.9          ha95c52a_0               conda-forge
```

Created with mamba create -n tmp python=3.8 hvplot jupyterlab dask intake intake-parquet intake-xarray s3fs IProgress scipy datashader

michaelaye commented 2 years ago

Your information is inconsistent. An environment created with that mamba command does not end up having pyviz/holoviews packages from the pyviz channel, unless you put the pyviz channel in the base conda config higher in conda priority than conda-forge? I wouldn't do that as I believe packages in conda-forge receive a better fit-to-all check than packages from outside entities.

What is the output of conda config --show-sources ? What is your OS?

I just did mamba create -n tmp python=3.8 hvplot jupyterlab dask intake intake-parquet intake-xarray s3fs IProgress scipy datashader on my linux machine, conda activate tmp, and then

from hvplot.sample_data import airline_flights
flights = airline_flights.to_dask().persist()
print(flights.compute().head())

without any issue.

I noticed that the environment created by above mamba command had several newer packages than yours (libthrift and grpc-cpp) but also some older (pyct and pyct-core are 0.4.6 in my tmp env). I'll try Mac now to see what I get there.
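Comparing two environments by eye is error-prone, so here is a small sketch that diffs two `conda list` outputs. Both function names are hypothetical, and the parser assumes the usual one-package-per-line layout (name, version, build, optional channel), for example as saved with `conda list -n tmp > env.txt`. The sample rows are taken from the two listings in this thread.

```python
def parse_conda_list(text):
    """Map package name -> version from `conda list` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


def diff_environments(left, right):
    """Return {name: (left_version, right_version)} for packages that differ."""
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


theirs = parse_conda_list(
    "libthrift 0.13.0 hfb8234f_6\n"
    "grpc-cpp 1.34.1 h2157cd5_4\n"
    "dask 2022.2.0 pyhd8ed1ab_0 conda-forge\n"
)
mine = parse_conda_list(
    "libthrift 0.15.0 he6d91bd_1 conda-forge\n"
    "grpc-cpp 1.43.2 h9e046d8_1 conda-forge\n"
    "dask 2022.2.0 pyhd8ed1ab_0 conda-forge\n"
)
for name, versions in diff_environments(theirs, mine).items():
    print(name, versions)
```

A package present in only one environment shows up with `None` on the other side.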

michaelaye commented 2 years ago

wow, just adding conda config --add channels pyviz makes mamba not even be able to find a working set, that's how bad it is to add the pyviz channel. I cleanly work from conda-forge only, with very very few exceptions for things only being on pypi, and have never (or very rarely) seen mamba fail to resolve things.

michaelaye commented 2 years ago

No issues on my Mac either. Here's what I get when I do:

 conda config --show-sources
==> /home/maye/miniconda3/.condarc <==
channels:
  - conda-forge

==> /home/maye/.condarc <==
channel_priority: strict
channels: []
report_errors: True

michaelaye commented 2 years ago

Ok, I managed to create an environment very similar to yours by adding pyviz as a channel and setting my channel_priority to flexible. However, everything still works fine with the hvplot airline data.

It might be time to blast your whole miniconda folder for having some rotten libraries somewhere. I need to do that myself maybe once a year when something gets corrupted for being too adventurous with installing stuff.

michaelaye commented 2 years ago

Here's my env, for cross-checking:

My tmp env

```
# packages in environment at /home/maye/miniconda3/envs/tmp:
#
# Name                         Version        Build                    Channel
_libgcc_mutex                  0.1            conda_forge              conda-forge
_openmp_mutex                  4.5            1_gnu                    conda-forge
abseil-cpp                     20210324.2     h9c3ff4c_0               conda-forge
aiobotocore                    2.1.0          pyhd8ed1ab_0             conda-forge
aiohttp                        3.8.1          py38h497a2fe_0           conda-forge
aioitertools                   0.9.0          pyhd8ed1ab_0             conda-forge
aiosignal                      1.2.0          pyhd8ed1ab_0             conda-forge
alsa-lib                       1.2.3          h516909a_0               conda-forge
anyio                          3.5.0          py38h578d9bd_0           conda-forge
appdirs                        1.4.4          pyh9f0ad1d_0             conda-forge
argon2-cffi                    21.3.0         pyhd8ed1ab_0             conda-forge
argon2-cffi-bindings           21.2.0         py38h497a2fe_1           conda-forge
arrow-cpp                      7.0.0          py38hdbd6c21_2_cpu       conda-forge
asciitree                      0.3.3          py_2                     conda-forge
asttokens                      2.0.5          pyhd8ed1ab_0             conda-forge
async-timeout                  4.0.2          pyhd8ed1ab_0             conda-forge
attrs                          21.4.0         pyhd8ed1ab_0             conda-forge
aws-c-cal                      0.5.11         h95a6274_0               conda-forge
aws-c-common                   0.6.2          h7f98852_0               conda-forge
aws-c-event-stream             0.2.7          h3541f99_13              conda-forge
aws-c-io                       0.10.5         hfb6a706_0               conda-forge
aws-checksums                  0.1.11         ha31a3da_7               conda-forge
aws-sdk-cpp                    1.8.186        hb4091e7_3               conda-forge
babel                          2.9.1          pyh44b312d_0             conda-forge
backcall                       0.2.0          pyh9f0ad1d_0             conda-forge
backports                      1.0            py_2                     conda-forge
backports.functools_lru_cache  1.6.4          pyhd8ed1ab_0             conda-forge
black                          22.1.0         pyhd8ed1ab_0             conda-forge
bleach                         4.1.0          pyhd8ed1ab_0             conda-forge
bokeh                          2.4.2          py38h578d9bd_0           conda-forge
botocore                       1.23.24        pyhd8ed1ab_0             conda-forge
brotli                         1.0.9          h7f98852_6               conda-forge
brotli-bin                     1.0.9          h7f98852_6               conda-forge
brotlipy                       0.7.0          py38h497a2fe_1003        conda-forge
bzip2                          1.0.8          h7f98852_4               conda-forge
c-ares                         1.18.1         h7f98852_0               conda-forge
ca-certificates                2021.10.8      ha878542_0               conda-forge
certifi                        2021.10.8      py38h578d9bd_1           conda-forge
cffi                           1.15.0         py38h3931269_0           conda-forge
cftime                         1.5.2          py38h6c62de6_0           conda-forge
charset-normalizer             2.0.12         pyhd8ed1ab_0             conda-forge
click                          8.0.4          py38h578d9bd_0           conda-forge
cloudpickle                    2.0.0          pyhd8ed1ab_0             conda-forge
colorama                       0.4.4          pyh9f0ad1d_0             conda-forge
colorcet                       3.0.0          py_0                     pyviz
cramjam                        2.5.0          py38ha8db356_0           conda-forge
cryptography                   36.0.1         py38h3e25421_0           conda-forge
curl                           7.81.0         h2574ce0_0               conda-forge
cycler                         0.11.0         pyhd8ed1ab_0             conda-forge
cytoolz                        0.11.2         py38h497a2fe_1           conda-forge
dask                           2022.2.0       pyhd8ed1ab_0             conda-forge
dask-core                      2022.2.0       pyhd8ed1ab_0             conda-forge
dataclasses                    0.8            pyhc8e2a94_3             conda-forge
datashader                     0.13.0         py_0                     pyviz
datashape                      0.5.4          py_1                     conda-forge
dbus                           1.13.6         h5008d03_3               conda-forge
debugpy                        1.5.1          py38h709712a_0           conda-forge
decorator                      5.1.1          pyhd8ed1ab_0             conda-forge
defusedxml                     0.7.1          pyhd8ed1ab_0             conda-forge
distributed                    2022.2.0       py38h578d9bd_0           conda-forge
entrypoints                    0.4            pyhd8ed1ab_0             conda-forge
executing                      0.8.2          pyhd8ed1ab_0             conda-forge
expat                          2.4.4          h9c3ff4c_0               conda-forge
fasteners                      0.17.3         pyhd8ed1ab_0             conda-forge
fastparquet                    0.8.0          py38h6c62de6_1           conda-forge
flit-core                      3.6.0          pyhd8ed1ab_0             conda-forge
font-ttf-dejavu-sans-mono      2.37           hab24e00_0               conda-forge
font-ttf-inconsolata           3.000          h77eed37_0               conda-forge
font-ttf-source-code-pro       2.038          h77eed37_0               conda-forge
font-ttf-ubuntu                0.83           hab24e00_0               conda-forge
fontconfig                     2.13.96        ha180cfb_0               conda-forge
fonts-conda-ecosystem          1              0                        conda-forge
fonts-conda-forge              1              0                        conda-forge
fonttools                      4.29.1         py38h497a2fe_0           conda-forge
freetype                       2.10.4         h0708190_1               conda-forge
frozenlist                     1.3.0          py38h497a2fe_0           conda-forge
fsspec                         2022.1.0       pyhd8ed1ab_0             conda-forge
gettext                        0.19.8.1       h73d1719_1008            conda-forge
gflags                         2.2.2          he1b5a44_1004            conda-forge
giflib                         5.2.1          h36c2ea0_2               conda-forge
glog                           0.5.0          h48cff8f_0               conda-forge
grpc-cpp                       1.43.2         h9e046d8_1               conda-forge
gst-plugins-base               1.18.5         hf529b03_3               conda-forge
gstreamer                      1.18.5         h9f60fe5_3               conda-forge
hdf4                           4.2.15         h10796ff_3               conda-forge
hdf5                           1.12.1         nompi_h2750804_103       conda-forge
heapdict                       1.0.1          py_0                     conda-forge
holoviews                      1.14.8         py_0                     pyviz
hvplot                         0.7.3          py_0                     pyviz
icu                            69.1           h9c3ff4c_0               conda-forge
idna                           3.3            pyhd8ed1ab_0             conda-forge
importlib-metadata             4.11.1         py38h578d9bd_0           conda-forge
importlib_metadata             4.11.1         hd8ed1ab_0               conda-forge
importlib_resources            5.4.0          pyhd8ed1ab_0             conda-forge
intake                         0.6.5          pyhd8ed1ab_0             conda-forge
intake-parquet                 0.2.3          py_0                     conda-forge
intake-xarray                  0.6.0          pyhd8ed1ab_0             conda-forge
iprogress                      0.4            py_0                     conda-forge
ipykernel                      6.9.1          py38he5a9106_0           conda-forge
ipython                        8.0.1          py38h578d9bd_2           conda-forge
ipython_genutils               0.2.0          py_1                     conda-forge
jbig                           2.1            h7f98852_2003            conda-forge
jedi                           0.18.1         py38h578d9bd_0           conda-forge
jinja2                         3.0.3          pyhd8ed1ab_0             conda-forge
jmespath                       0.10.0         pyh9f0ad1d_0             conda-forge
jpeg                           9e             h7f98852_0               conda-forge
json5                          0.9.5          pyh9f0ad1d_0             conda-forge
jsonschema                     4.4.0          pyhd8ed1ab_0             conda-forge
jupyter_client                 7.1.2          pyhd8ed1ab_0             conda-forge
jupyter_core                   4.9.2          py38h578d9bd_0           conda-forge
jupyter_server                 1.13.5         pyhd8ed1ab_1             conda-forge
jupyterlab                     3.2.9          pyhd8ed1ab_0             conda-forge
jupyterlab_pygments            0.1.2          pyh9f0ad1d_0             conda-forge
jupyterlab_server              2.10.3         pyhd8ed1ab_0             conda-forge
kiwisolver                     1.3.2          py38h1fd1430_1           conda-forge
krb5                           1.19.2         hcc1bbae_3               conda-forge
lcms2                          2.12           hddcbb42_0               conda-forge
ld_impl_linux-64               2.36.1         hea4e1c9_2               conda-forge
lerc                           3.0            h9c3ff4c_0               conda-forge
libblas                        3.9.0          13_linux64_openblas      conda-forge
libbrotlicommon                1.0.9          h7f98852_6               conda-forge
libbrotlidec                   1.0.9          h7f98852_6               conda-forge
libbrotlienc                   1.0.9          h7f98852_6               conda-forge
libcblas                       3.9.0          13_linux64_openblas      conda-forge
libclang                       13.0.1         default_hc23dcda_0       conda-forge
libcrc32c                      1.1.2          h9c3ff4c_0               conda-forge
libcurl                        7.81.0         h2574ce0_0               conda-forge
libdeflate                     1.10           h7f98852_0               conda-forge
libedit                        3.1.20191231   he28a2e2_2               conda-forge
libev                          4.33           h516909a_1               conda-forge
libevent                       2.1.10         h9b69904_4               conda-forge
libffi                         3.4.2          h7f98852_5               conda-forge
libgcc-ng                      11.2.0         h1d223b6_12              conda-forge
libgfortran-ng                 11.2.0         h69a702a_12              conda-forge
libgfortran5                   11.2.0         h5c6108e_12              conda-forge
libglib                        2.70.2         h174f98d_4               conda-forge
libgomp                        11.2.0         h1d223b6_12              conda-forge
libgoogle-cloud                1.35.0         h6945097_2               conda-forge
libiconv                       1.16           h516909a_0               conda-forge
liblapack                      3.9.0          13_linux64_openblas      conda-forge
libllvm10                      10.0.1         he513fc3_3               conda-forge
libllvm13                      13.0.1         hf817b99_0               conda-forge
libnetcdf                      4.8.1          nompi_hb3fd0d9_101       conda-forge
libnghttp2                     1.46.0         h812cca2_0               conda-forge
libnsl                         2.0.0          h7f98852_0               conda-forge
libogg                         1.3.4          h7f98852_1               conda-forge
libopenblas                    0.3.18         pthreads_h8fe5266_0      conda-forge
libopus                        1.3.1          h7f98852_1               conda-forge
libpng                         1.6.37         h21135ba_2               conda-forge
libpq                          14.2           hd57d9b9_0               conda-forge
libprotobuf                    3.19.4         h780b84a_0               conda-forge
libsodium                      1.0.18         h36c2ea0_1               conda-forge
libssh2                        1.10.0         ha56f1ee_2               conda-forge
libstdcxx-ng                   11.2.0         he4da1e4_12              conda-forge
libthrift                      0.15.0         he6d91bd_1               conda-forge
libtiff                        4.3.0          h542a066_3               conda-forge
libutf8proc                    2.7.0          h7f98852_0               conda-forge
libuuid                        2.32.1         h7f98852_1000            conda-forge
libvorbis                      1.3.7          h9c3ff4c_0               conda-forge
libwebp                        1.2.2          h3452ae3_0               conda-forge
libwebp-base                   1.2.2          h7f98852_1               conda-forge
libxcb                         1.13           h7f98852_1004            conda-forge
libxkbcommon                   1.0.3          he3ba5ed_0               conda-forge
libxml2                        2.9.12         h885dcf4_1               conda-forge
libzip                         1.8.0          h4de3113_1               conda-forge
libzlib                        1.2.11         h36c2ea0_1013            conda-forge
llvmlite                       0.36.0         py38h4630a5e_0           conda-forge
locket                         0.2.0          py_2                     conda-forge
lz4-c                          1.9.3          h9c3ff4c_1               conda-forge
markdown                       3.3.6          pyhd8ed1ab_0             conda-forge
markupsafe                     2.1.0          py38h0a891b7_0           conda-forge
matplotlib                     3.5.1          py38h578d9bd_0           conda-forge
matplotlib-base                3.5.1          py38hf4fb855_0           conda-forge
matplotlib-inline              0.1.3          pyhd8ed1ab_0             conda-forge
mistune                        0.8.4          py38h497a2fe_1005        conda-forge
msgpack-python                 1.0.3          py38h1fd1430_0           conda-forge
multidict                      6.0.2          py38h497a2fe_0           conda-forge
multipledispatch               0.6.0          py_0                     conda-forge
munkres                        1.1.4          pyh9f0ad1d_0             conda-forge
mypy_extensions                0.4.3          py38h578d9bd_4           conda-forge
mysql-common                   8.0.28         ha770c72_0               conda-forge
mysql-libs                     8.0.28         hfa10184_0               conda-forge
nbclassic                      0.3.5          pyhd8ed1ab_0             conda-forge
nbclient                       0.5.11         pyhd8ed1ab_0             conda-forge
nbconvert                      6.4.2          py38h578d9bd_0           conda-forge
nbformat                       5.1.3          pyhd8ed1ab_0             conda-forge
ncurses                        6.3            h9c3ff4c_0               conda-forge
nest-asyncio                   1.5.4          pyhd8ed1ab_0             conda-forge
netcdf4                        1.5.8          nompi_py38h2823cc8_101   conda-forge
notebook                       6.4.8          pyha770c72_0             conda-forge
nspr                           4.32           h9c3ff4c_1               conda-forge
nss                            3.74           hb5efdd6_0               conda-forge
numba                          0.53.1         py38h8b71fd7_1           conda-forge
numcodecs 0.9.1 py38h709712a_2 conda-forge numpy 1.22.2 py38h6ae9a64_0 conda-forge openjpeg 2.4.0 hb52868f_1 conda-forge openssl 1.1.1l h7f98852_0 conda-forge orc 1.7.3 h1be678f_0 conda-forge packaging 21.3 pyhd8ed1ab_0 conda-forge pandas 1.4.1 py38h43a58ef_0 conda-forge pandoc 2.17.1.1 ha770c72_0 conda-forge pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge panel 0.12.6 py_0 pyviz param 1.12.0 py_0 pyviz parquet-cpp 1.5.1 2 conda-forge parso 0.8.3 pyhd8ed1ab_0 conda-forge partd 1.2.0 pyhd8ed1ab_0 conda-forge pathspec 0.9.0 pyhd8ed1ab_0 conda-forge pcre 8.45 h9c3ff4c_0 conda-forge pexpect 4.8.0 pyh9f0ad1d_2 conda-forge pickleshare 0.7.5 py_1003 conda-forge pillow 9.0.1 py38h0ee0e06_2 conda-forge pip 22.0.3 pyhd8ed1ab_0 conda-forge platformdirs 2.5.1 pyhd8ed1ab_0 conda-forge prometheus_client 0.13.1 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.27 pyha770c72_0 conda-forge psutil 5.9.0 py38h497a2fe_0 conda-forge pthread-stubs 0.4 h36c2ea0_1001 conda-forge ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge pyarrow 7.0.0 py38he7e5f7d_2_cpu conda-forge pycparser 2.21 pyhd8ed1ab_0 conda-forge pyct 0.4.8 py_0 pyviz pyct-core 0.4.8 py_0 pyviz pygments 2.11.2 pyhd8ed1ab_0 conda-forge pyopenssl 22.0.0 pyhd8ed1ab_0 conda-forge pyparsing 3.0.7 pyhd8ed1ab_0 conda-forge pyqt 5.12.3 py38h578d9bd_8 conda-forge pyqt-impl 5.12.3 py38h0ffb2e6_8 conda-forge pyqt5-sip 4.19.18 py38h709712a_8 conda-forge pyqtchart 5.12 py38h7400c14_8 conda-forge pyqtwebengine 5.12.1 py38h7400c14_8 conda-forge pyrsistent 0.18.1 py38h497a2fe_0 conda-forge pysocks 1.7.1 py38h578d9bd_4 conda-forge python 3.8.12 ha38a3c6_3_cpython conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python_abi 3.8 2_cp38 conda-forge pytz 2021.3 pyhd8ed1ab_0 conda-forge pyviz_comms 2.1.0 py_0 pyviz pyyaml 6.0 py38h497a2fe_3 conda-forge pyzmq 22.3.0 py38h2035c66_1 conda-forge qt 5.12.9 ha98a1a1_5 conda-forge re2 2022.02.01 h9c3ff4c_0 conda-forge readline 8.1 h46c0cb4_0 conda-forge requests 2.27.1 
pyhd8ed1ab_0 conda-forge s2n 1.0.10 h9b69904_0 conda-forge s3fs 2022.1.0 pyhd8ed1ab_0 conda-forge scipy 1.8.0 py38h56a6a73_1 conda-forge send2trash 1.8.0 pyhd8ed1ab_0 conda-forge setuptools 59.8.0 py38h578d9bd_0 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge snappy 1.1.8 he1b5a44_3 conda-forge sniffio 1.2.0 py38h578d9bd_2 conda-forge sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge sqlite 3.37.0 h9cd32fc_0 conda-forge stack_data 0.2.0 pyhd8ed1ab_0 conda-forge tblib 1.7.0 pyhd8ed1ab_0 conda-forge terminado 0.13.1 py38h578d9bd_0 conda-forge testpath 0.5.0 pyhd8ed1ab_0 conda-forge tk 8.6.12 h27826a3_0 conda-forge tomli 2.0.1 pyhd8ed1ab_0 conda-forge toolz 0.11.2 pyhd8ed1ab_0 conda-forge tornado 6.1 py38h497a2fe_2 conda-forge tqdm 4.62.3 pyhd8ed1ab_0 conda-forge traitlets 5.1.1 pyhd8ed1ab_0 conda-forge typed-ast 1.5.2 py38h497a2fe_0 conda-forge typing-extensions 4.1.1 hd8ed1ab_0 conda-forge typing_extensions 4.1.1 pyha770c72_0 conda-forge unicodedata2 14.0.0 py38h497a2fe_0 conda-forge urllib3 1.26.8 pyhd8ed1ab_1 conda-forge wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 1.2.3 pyhd8ed1ab_0 conda-forge wheel 0.37.1 pyhd8ed1ab_0 conda-forge wrapt 1.13.3 py38h497a2fe_1 conda-forge xarray 0.21.1 pyhd8ed1ab_0 conda-forge xorg-libxau 1.0.9 h7f98852_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xz 5.2.5 h516909a_1 conda-forge yaml 0.2.5 h7f98852_2 conda-forge yarl 1.7.2 py38h497a2fe_1 conda-forge zarr 2.11.0 pyhd8ed1ab_0 conda-forge zeromq 4.3.4 h9c3ff4c_1 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.7.0 pyhd8ed1ab_1 conda-forge zlib 1.2.11 h36c2ea0_1013 conda-forge zstd 1.5.2 ha95c52a_0 conda-forge ```

Some observations:

You seem to be on Linux as well (path: /home/shh/...). So why did my env get the abseil-cpp package and yours didn't, with apparently the same mamba command? (There might be other differences; I didn't compare the whole list.)
If a seek on a file structure fails (the last error detail in your traceback), doesn't that usually mean that the file is corrupt? So my question is: how can one force a refresh of a cached file that might be corrupt?

hoxbro commented 2 years ago

Thank you for the investigation.

Yes, you are correct that I was not "telling the truth" about my setup: I forgot to include my .condarc file. And as you concluded, it did contain pyviz in my channels.

I have removed pyviz and default from my .condarc and have removed and reinstalled miniconda, but I still get a FileNotFoundError (like I originally did).

➜ conda config --show-sources
==> /home/shh/miniconda3/.condarc <==
channels:
  - conda-forge

==> /home/shh/.condarc <==
changeps1: False
ssl_verify: True
channel_priority: strict
channels:
  - conda-forge

Environment

``` yaml # packages in environment at /home/shh/miniconda3/envs/tmp: # # Name Version Build Channel _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_gnu conda-forge abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge aiobotocore 2.1.0 pyhd8ed1ab_0 conda-forge aiohttp 3.8.1 py38h497a2fe_0 conda-forge aioitertools 0.9.0 pyhd8ed1ab_0 conda-forge aiosignal 1.2.0 pyhd8ed1ab_0 conda-forge anyio 3.5.0 py38h578d9bd_0 conda-forge appdirs 1.4.4 pyh9f0ad1d_0 conda-forge argon2-cffi 21.3.0 pyhd8ed1ab_0 conda-forge argon2-cffi-bindings 21.2.0 py38h497a2fe_1 conda-forge arrow-cpp 7.0.0 py38hdbd6c21_2_cpu conda-forge asciitree 0.3.3 py_2 conda-forge asttokens 2.0.5 pyhd8ed1ab_0 conda-forge async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge attrs 21.4.0 pyhd8ed1ab_0 conda-forge aws-c-cal 0.5.11 h95a6274_0 conda-forge aws-c-common 0.6.2 h7f98852_0 conda-forge aws-c-event-stream 0.2.7 h3541f99_13 conda-forge aws-c-io 0.10.5 hfb6a706_0 conda-forge aws-checksums 0.1.11 ha31a3da_7 conda-forge aws-sdk-cpp 1.8.186 hb4091e7_3 conda-forge babel 2.9.1 pyh44b312d_0 conda-forge backcall 0.2.0 pyh9f0ad1d_0 conda-forge backports 1.0 py_2 conda-forge backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge black 22.1.0 pyhd8ed1ab_0 conda-forge bleach 4.1.0 pyhd8ed1ab_0 conda-forge bokeh 2.4.2 py38h578d9bd_0 conda-forge botocore 1.23.24 pyhd8ed1ab_0 conda-forge brotli 1.0.9 h7f98852_6 conda-forge brotli-bin 1.0.9 h7f98852_6 conda-forge brotlipy 0.7.0 py38h497a2fe_1003 conda-forge bzip2 1.0.8 h7f98852_4 conda-forge c-ares 1.18.1 h7f98852_0 conda-forge ca-certificates 2021.10.8 ha878542_0 conda-forge certifi 2021.10.8 py38h578d9bd_1 conda-forge cffi 1.15.0 py38h3931269_0 conda-forge cftime 1.5.2 py38h6c62de6_0 conda-forge charset-normalizer 2.0.12 pyhd8ed1ab_0 conda-forge click 8.0.4 py38h578d9bd_0 conda-forge cloudpickle 2.0.0 pyhd8ed1ab_0 conda-forge colorama 0.4.4 pyh9f0ad1d_0 conda-forge colorcet 3.0.0 pyhd8ed1ab_0 conda-forge cramjam 2.5.0 py38ha8db356_0 conda-forge cryptography 36.0.1 
py38h3e25421_0 conda-forge curl 7.81.0 h2574ce0_0 conda-forge cycler 0.11.0 pyhd8ed1ab_0 conda-forge cytoolz 0.11.2 py38h497a2fe_1 conda-forge dask 2022.2.0 pyhd8ed1ab_0 conda-forge dask-core 2022.2.0 pyhd8ed1ab_0 conda-forge dataclasses 0.8 pyhc8e2a94_3 conda-forge datashader 0.13.0 pyh6c4a22f_0 conda-forge datashape 0.5.4 py_1 conda-forge debugpy 1.5.1 py38h709712a_0 conda-forge decorator 5.1.1 pyhd8ed1ab_0 conda-forge defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge distributed 2022.2.0 py38h578d9bd_0 conda-forge entrypoints 0.4 pyhd8ed1ab_0 conda-forge executing 0.8.2 pyhd8ed1ab_0 conda-forge fasteners 0.17.3 pyhd8ed1ab_0 conda-forge fastparquet 0.8.0 py38h6c62de6_1 conda-forge flit-core 3.6.0 pyhd8ed1ab_0 conda-forge fonttools 4.29.1 py38h497a2fe_0 conda-forge freetype 2.10.4 h0708190_1 conda-forge frozenlist 1.3.0 py38h497a2fe_0 conda-forge fsspec 2022.1.0 pyhd8ed1ab_0 conda-forge gflags 2.2.2 he1b5a44_1004 conda-forge giflib 5.2.1 h36c2ea0_2 conda-forge glog 0.5.0 h48cff8f_0 conda-forge grpc-cpp 1.43.2 h9e046d8_1 conda-forge hdf4 4.2.15 h10796ff_3 conda-forge hdf5 1.12.1 nompi_h2750804_103 conda-forge heapdict 1.0.1 py_0 conda-forge holoviews 1.14.8 pyhd8ed1ab_0 conda-forge hvplot 0.7.3 pyh6c4a22f_0 conda-forge idna 3.3 pyhd8ed1ab_0 conda-forge importlib-metadata 4.11.1 py38h578d9bd_0 conda-forge importlib_metadata 4.11.1 hd8ed1ab_0 conda-forge importlib_resources 5.4.0 pyhd8ed1ab_0 conda-forge intake 0.6.5 pyhd8ed1ab_0 conda-forge intake-parquet 0.2.3 py_0 conda-forge intake-xarray 0.6.0 pyhd8ed1ab_0 conda-forge iprogress 0.4 py_0 conda-forge ipykernel 6.9.1 py38he5a9106_0 conda-forge ipython 8.0.1 py38h578d9bd_2 conda-forge ipython_genutils 0.2.0 py_1 conda-forge jbig 2.1 h7f98852_2003 conda-forge jedi 0.18.1 py38h578d9bd_0 conda-forge jinja2 3.0.3 pyhd8ed1ab_0 conda-forge jmespath 0.10.0 pyh9f0ad1d_0 conda-forge jpeg 9e h7f98852_0 conda-forge json5 0.9.5 pyh9f0ad1d_0 conda-forge jsonschema 4.4.0 pyhd8ed1ab_0 conda-forge jupyter_client 7.1.2 pyhd8ed1ab_0 
conda-forge jupyter_core 4.9.2 py38h578d9bd_0 conda-forge jupyter_server 1.13.5 pyhd8ed1ab_1 conda-forge jupyterlab 3.2.9 pyhd8ed1ab_0 conda-forge jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge jupyterlab_server 2.10.3 pyhd8ed1ab_0 conda-forge kiwisolver 1.3.2 py38h1fd1430_1 conda-forge krb5 1.19.2 hcc1bbae_3 conda-forge lcms2 2.12 hddcbb42_0 conda-forge ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge lerc 3.0 h9c3ff4c_0 conda-forge libblas 3.9.0 13_linux64_openblas conda-forge libbrotlicommon 1.0.9 h7f98852_6 conda-forge libbrotlidec 1.0.9 h7f98852_6 conda-forge libbrotlienc 1.0.9 h7f98852_6 conda-forge libcblas 3.9.0 13_linux64_openblas conda-forge libcrc32c 1.1.2 h9c3ff4c_0 conda-forge libcurl 7.81.0 h2574ce0_0 conda-forge libdeflate 1.10 h7f98852_0 conda-forge libedit 3.1.20191231 he28a2e2_2 conda-forge libev 4.33 h516909a_1 conda-forge libevent 2.1.10 h9b69904_4 conda-forge libffi 3.4.2 h7f98852_5 conda-forge libgcc-ng 11.2.0 h1d223b6_12 conda-forge libgfortran-ng 11.2.0 h69a702a_12 conda-forge libgfortran5 11.2.0 h5c6108e_12 conda-forge libgomp 11.2.0 h1d223b6_12 conda-forge libgoogle-cloud 1.35.0 h6945097_2 conda-forge liblapack 3.9.0 13_linux64_openblas conda-forge libllvm10 10.0.1 he513fc3_3 conda-forge libnetcdf 4.8.1 nompi_hb3fd0d9_101 conda-forge libnghttp2 1.46.0 h812cca2_0 conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge libpng 1.6.37 h21135ba_2 conda-forge libprotobuf 3.19.4 h780b84a_0 conda-forge libsodium 1.0.18 h36c2ea0_1 conda-forge libssh2 1.10.0 ha56f1ee_2 conda-forge libstdcxx-ng 11.2.0 he4da1e4_12 conda-forge libthrift 0.15.0 he6d91bd_1 conda-forge libtiff 4.3.0 h542a066_3 conda-forge libutf8proc 2.7.0 h7f98852_0 conda-forge libwebp 1.2.2 h3452ae3_0 conda-forge libwebp-base 1.2.2 h7f98852_1 conda-forge libxcb 1.13 h7f98852_1004 conda-forge libzip 1.8.0 h4de3113_1 conda-forge libzlib 1.2.11 h36c2ea0_1013 conda-forge llvmlite 0.36.0 py38h4630a5e_0 conda-forge locket 0.2.0 py_2 conda-forge 
lz4-c 1.9.3 h9c3ff4c_1 conda-forge markdown 3.3.6 pyhd8ed1ab_0 conda-forge markupsafe 2.1.0 py38h0a891b7_0 conda-forge matplotlib-base 3.5.1 py38hf4fb855_0 conda-forge matplotlib-inline 0.1.3 pyhd8ed1ab_0 conda-forge mistune 0.8.4 py38h497a2fe_1005 conda-forge msgpack-python 1.0.3 py38h1fd1430_0 conda-forge multidict 6.0.2 py38h497a2fe_0 conda-forge multipledispatch 0.6.0 py_0 conda-forge munkres 1.1.4 pyh9f0ad1d_0 conda-forge mypy_extensions 0.4.3 py38h578d9bd_4 conda-forge nbclassic 0.3.5 pyhd8ed1ab_0 conda-forge nbclient 0.5.11 pyhd8ed1ab_0 conda-forge nbconvert 6.4.2 py38h578d9bd_0 conda-forge nbformat 5.1.3 pyhd8ed1ab_0 conda-forge ncurses 6.3 h9c3ff4c_0 conda-forge nest-asyncio 1.5.4 pyhd8ed1ab_0 conda-forge netcdf4 1.5.8 nompi_py38h2823cc8_101 conda-forge notebook 6.4.8 pyha770c72_0 conda-forge numba 0.53.1 py38h8b71fd7_1 conda-forge numcodecs 0.9.1 py38h709712a_2 conda-forge numpy 1.22.2 py38h6ae9a64_0 conda-forge openjpeg 2.4.0 hb52868f_1 conda-forge openssl 1.1.1l h7f98852_0 conda-forge orc 1.7.3 h1be678f_0 conda-forge packaging 21.3 pyhd8ed1ab_0 conda-forge pandas 1.4.1 py38h43a58ef_0 conda-forge pandoc 2.17.1.1 ha770c72_0 conda-forge pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge panel 0.12.6 pyhd8ed1ab_0 conda-forge param 1.12.0 pyh6c4a22f_0 conda-forge parquet-cpp 1.5.1 2 conda-forge parso 0.8.3 pyhd8ed1ab_0 conda-forge partd 1.2.0 pyhd8ed1ab_0 conda-forge pathspec 0.9.0 pyhd8ed1ab_0 conda-forge pexpect 4.8.0 pyh9f0ad1d_2 conda-forge pickleshare 0.7.5 py_1003 conda-forge pillow 9.0.1 py38h0ee0e06_2 conda-forge pip 22.0.3 pyhd8ed1ab_0 conda-forge platformdirs 2.5.1 pyhd8ed1ab_0 conda-forge prometheus_client 0.13.1 pyhd8ed1ab_0 conda-forge prompt-toolkit 3.0.27 pyha770c72_0 conda-forge psutil 5.9.0 py38h497a2fe_0 conda-forge pthread-stubs 0.4 h36c2ea0_1001 conda-forge ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge pyarrow 7.0.0 py38he7e5f7d_2_cpu conda-forge pycparser 2.21 pyhd8ed1ab_0 conda-forge pyct 0.4.6 py_0 
conda-forge pyct-core 0.4.6 py_0 conda-forge pygments 2.11.2 pyhd8ed1ab_0 conda-forge pyopenssl 22.0.0 pyhd8ed1ab_0 conda-forge pyparsing 3.0.7 pyhd8ed1ab_0 conda-forge pyrsistent 0.18.1 py38h497a2fe_0 conda-forge pysocks 1.7.1 py38h578d9bd_4 conda-forge python 3.8.12 ha38a3c6_3_cpython conda-forge python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge python_abi 3.8 2_cp38 conda-forge pytz 2021.3 pyhd8ed1ab_0 conda-forge pyviz_comms 2.1.0 pyhd8ed1ab_0 conda-forge pyyaml 6.0 py38h497a2fe_3 conda-forge pyzmq 22.3.0 py38h2035c66_1 conda-forge re2 2022.02.01 h9c3ff4c_0 conda-forge readline 8.1 h46c0cb4_0 conda-forge requests 2.27.1 pyhd8ed1ab_0 conda-forge s2n 1.0.10 h9b69904_0 conda-forge s3fs 2022.1.0 pyhd8ed1ab_0 conda-forge scipy 1.8.0 py38h56a6a73_1 conda-forge send2trash 1.8.0 pyhd8ed1ab_0 conda-forge setuptools 59.8.0 py38h578d9bd_0 conda-forge six 1.16.0 pyh6c4a22f_0 conda-forge snappy 1.1.8 he1b5a44_3 conda-forge sniffio 1.2.0 py38h578d9bd_2 conda-forge sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge sqlite 3.37.0 h9cd32fc_0 conda-forge stack_data 0.2.0 pyhd8ed1ab_0 conda-forge tblib 1.7.0 pyhd8ed1ab_0 conda-forge terminado 0.13.1 py38h578d9bd_0 conda-forge testpath 0.5.0 pyhd8ed1ab_0 conda-forge tk 8.6.12 h27826a3_0 conda-forge tomli 2.0.1 pyhd8ed1ab_0 conda-forge toolz 0.11.2 pyhd8ed1ab_0 conda-forge tornado 6.1 py38h497a2fe_2 conda-forge tqdm 4.62.3 pyhd8ed1ab_0 conda-forge traitlets 5.1.1 pyhd8ed1ab_0 conda-forge typed-ast 1.5.2 py38h497a2fe_0 conda-forge typing-extensions 4.1.1 hd8ed1ab_0 conda-forge typing_extensions 4.1.1 pyha770c72_0 conda-forge unicodedata2 14.0.0 py38h497a2fe_0 conda-forge urllib3 1.26.8 pyhd8ed1ab_1 conda-forge wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge webencodings 0.5.1 py_1 conda-forge websocket-client 1.2.3 pyhd8ed1ab_0 conda-forge wheel 0.37.1 pyhd8ed1ab_0 conda-forge wrapt 1.13.3 py38h497a2fe_1 conda-forge xarray 0.21.1 pyhd8ed1ab_0 conda-forge xorg-libxau 1.0.9 h7f98852_0 conda-forge xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge xz 5.2.5 
h516909a_1 conda-forge yaml 0.2.5 h7f98852_2 conda-forge yarl 1.7.2 py38h497a2fe_1 conda-forge zarr 2.11.0 pyhd8ed1ab_0 conda-forge zeromq 4.3.4 h9c3ff4c_1 conda-forge zict 2.0.0 py_0 conda-forge zipp 3.7.0 pyhd8ed1ab_1 conda-forge zlib 1.2.11 h36c2ea0_1013 conda-forge zstd 1.5.2 ha95c52a_0 conda-forge ```

Error log

``` python
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Input In [3], in
----> 1 flights = airline_flights.to_dask().persist()
      2 print(type(flights))
      3 flights.head()

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:99, in ParquetSource.to_dask(self)
     98 def to_dask(self):
---> 99     self._load_metadata()
    100     return self._df

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake/source/base.py:236, in DataSourceBase._load_metadata(self)
    234 """load metadata only if needed"""
    235 if self._schema is None:
--> 236     self._schema = self._get_schema()
    237     self.dtype = self._schema.dtype
    238     self.shape = self._schema.shape

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:60, in ParquetSource._get_schema(self)
     58 def _get_schema(self):
     59     if self._df is None:
---> 60         self._df = self._to_dask()
     61     dtypes = {k: str(v) for k, v in self._df._meta.dtypes.items()}
     62     self._schema = base.Schema(datashape=None,
     63                                dtype=dtypes,
     64                                shape=(None, len(self._df.columns)),
     65                                npartitions=self._df.npartitions,
     66                                extra_metadata={})

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/intake_parquet/source.py:108, in ParquetSource._to_dask(self)
    106 import dask.dataframe as dd
    107 urlpath = self._get_cache(self._urlpath)[0]
--> 108 self._df = dd.read_parquet(urlpath,
    109                            storage_options=self._storage_options, **self._kwargs)
    110 self._load_metadata()
    111 return self._df

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/core.py:400, in read_parquet(path, columns, filters, categories, index, storage_options, engine, gather_statistics, ignore_metadata_file, metadata_task_size, split_row_groups, chunksize, aggregate_files, **kwargs)
    397     raise ValueError("read_parquet options require gather_statistics=True")
    398 gather_statistics = True
--> 400 read_metadata_result = engine.read_metadata(
    401     fs,
    402     paths,
    403     categories=categories,
    404     index=index,
    405     gather_statistics=gather_statistics,
    406     filters=filters,
    407     split_row_groups=split_row_groups,
    408     chunksize=chunksize,
    409     aggregate_files=aggregate_files,
    410     ignore_metadata_file=ignore_metadata_file,
    411     metadata_task_size=metadata_task_size,
    412     **kwargs,
    413 )
    415 # In the future, we may want to give the engine the
    416 # option to return a dedicated element for `common_kwargs`.
    417 # However, to avoid breaking the API, we just embed this
    418 # data in the first element of `parts` for now.
    419 # The logic below is inteded to handle backward and forward
    420 # compatibility with a user-defined engine.
    421 meta, statistics, parts, index = read_metadata_result[:4]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/fastparquet.py:862, in FastParquetEngine.read_metadata(cls, fs, paths, categories, index, gather_statistics, filters, split_row_groups, chunksize, aggregate_files, ignore_metadata_file, metadata_task_size, **kwargs)
    844 @classmethod
    845 def read_metadata(
    846     cls,
   (...)
    860
    861     # Stage 1: Collect general dataset information
--> 862     dataset_info = cls._collect_dataset_info(
    863         paths,
    864         fs,
    865         categories,
    866         index,
    867         gather_statistics,
    868         filters,
    869         split_row_groups,
    870         chunksize,
    871         aggregate_files,
    872         ignore_metadata_file,
    873         metadata_task_size,
    874         kwargs,
    875     )
    877     # Stage 2: Generate output `meta`
    878     meta = cls._create_dd_meta(dataset_info)

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/dask/dataframe/io/parquet/fastparquet.py:473, in FastParquetEngine._collect_dataset_info(cls, paths, fs, categories, index, gather_statistics, filters, split_row_groups, chunksize, aggregate_files, ignore_metadata_file, metadata_task_size, kwargs)
    469 else:
    470     # Rely on metadata for 0th file.
    471     # Will need to pass a list of paths to read_partition
    472     scheme = get_file_scheme(fns)
--> 473     pf = ParquetFile(
    474         paths[:1], open_with=fs.open, root=base, **dataset_kwargs
    475     )
    476     pf.file_scheme = scheme
    477     pf.cats = paths_to_cats(fns, scheme)

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/api.py:113, in ParquetFile.__init__(self, fn, verify, open_with, root, sep, fs, pandas_nulls)
    111 fs = getattr(open_with, "__self__", None)
    112 if isinstance(fn, (tuple, list)):
--> 113     basepath, fmd = metadata_from_many(fn, verify_schema=verify,
    114                                        open_with=open_with, root=root,
    115                                        fs=fs)
    116     self.fn = join_path(basepath, '_metadata') if basepath \
    117         else '_metadata'
    118     self.fmd = fmd

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/util.py:179, in metadata_from_many(file_list, verify_schema, open_with, root, fs)
    176 elif all(not isinstance(pf, api.ParquetFile) for pf in file_list):
    178     if verify_schema or fs is None or len(file_list) < 3:
--> 179         pfs = [api.ParquetFile(fn, open_with=open_with) for fn in file_list]
    180     else:
    181         # activate new code path here
    182         f0 = file_list[0]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/util.py:179, in (.0)
    176 elif all(not isinstance(pf, api.ParquetFile) for pf in file_list):
    178     if verify_schema or fs is None or len(file_list) < 3:
--> 179         pfs = [api.ParquetFile(fn, open_with=open_with) for fn in file_list]
    180     else:
    181         # activate new code path here
    182         f0 = file_list[0]

File ~/miniconda3/envs/tmp/lib/python3.8/site-packages/fastparquet/api.py:165, in ParquetFile.__init__(self, fn, verify, open_with, root, sep, fs, pandas_nulls)
    163     self.fs = fs
    164 else:
--> 165     raise FileNotFoundError
    166 self.open = open_with
    167 self._statistics = None

FileNotFoundError:
```

I could see that airline_flights.cache_dirs outputs /home/shh/.intake/cache for me. When I deleted that folder, I got the error message ImportError: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html. After installing ipywidgets and removing the .intake folder again, I could get this example to work.

@michaelaye can you try renaming your .intake cache folder and see if you also get a FileNotFoundError?

michaelaye commented 2 years ago

Yes, I do. But I think that by renaming only the cache folder you corrupt intake's data management: the persisted folder is still there, with partial records that need to access the cache folder whenever required. So by renaming only the cache folder, you have created an impossible state for intake. If you had renamed the whole .intake folder, things would have worked; I just confirmed that.
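For anyone who wants to script that, here is a minimal sketch of setting the whole folder aside rather than only the cache subfolder. It uses only the standard library; the function name `set_aside_intake_dir` and the `.bak-<timestamp>` naming are made up for illustration, and the default location `~/.intake` is taken from the paths quoted in this thread, so it may differ on other setups.

```python
import shutil
import time
from pathlib import Path


def set_aside_intake_dir(intake_dir=None):
    """Move the whole intake folder to a timestamped backup next to it.

    Returns the backup path, or None if the folder does not exist.
    The default location (~/.intake) is an assumption based on this thread.
    """
    intake_dir = Path(intake_dir) if intake_dir is not None else Path.home() / ".intake"
    if not intake_dir.exists():
        return None

    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = intake_dir.with_name(f"{intake_dir.name}.bak-{stamp}")
    counter = 1
    # Avoid moving into an existing backup created within the same second.
    while backup.exists():
        backup = intake_dir.with_name(f"{intake_dir.name}.bak-{stamp}-{counter}")
        counter += 1

    # Cache and persisted records move together, so they stay consistent.
    shutil.move(str(intake_dir), str(backup))
    return backup
```

Calling `set_aside_intake_dir()` with no argument would act on `~/.intake`; the next import of the sample data should then trigger a fresh download, and the backup can be deleted once that works.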

hoxbro commented 2 years ago

I agree. I was not very clear in my last post. If you have time, can you try the following steps:

1) Rename or delete the .intake folder and uninstall ipywidgets and IProgress from the tmp environment.
2) Run the notebook. For me this creates the .intake folder but then raises an ImportError and does not download the data.
3) Install ipywidgets.
4) Run the notebook again. This time I get a FileNotFoundError, because the data was not downloaded but all the folders were created.

To get this to work I have to delete the .intake folder and run the notebook again.
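One way to avoid ending up in that half-created state is to check for the progress-bar dependency before touching the sample data at all. The following is only a sketch: the helper name `missing_modules` is hypothetical, and the assumption that `ipywidgets` is the module to check comes from the ImportError message quoted above.

```python
import importlib.util


def missing_modules(names=("ipywidgets",)):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    # Stop here instead of importing the sample data, so no partial
    # .intake folder gets created by an interrupted download.
    print("Install before loading the sample data:", ", ".join(missing))
```

Run this in the notebook kernel (not a separate shell), since the notebook may use a different environment than the terminal.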

michaelaye commented 2 years ago

The mamba command above doesn't install ipywidgets, so I will only deal with iprogress.

I can confirm that your scenario fails in the notebook (it doesn't fail in the IPython console, as that doesn't use the progress bars from the library you uninstalled).

Question: Why would you uninstall iprogress, a notebook-supporting progress library, when you want to work in the notebook?

What you have identified, though, are two bugs worth reporting:

- The conda tqdm package should declare a dependency on iprogress, as it obviously crashes when iprogress is silently removed. That silent removal is only possible because mamba/conda can uninstall it without uninstalling tqdm (the progress bar package).
- The intake system does not clean up the file handles after the download process is interrupted by the missing iprogress package. Definitely worth reporting to prevent others from running into this edge case (although it is only caused by uninstalling iprogress. Again, why would you do that?).
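To illustrate the kind of cleanup meant in the second point, here is a generic sketch of a download that stages into a temporary folder and only publishes the result on success. This is not intake's code or API; `fetch_into_cache` and the `.partial-` prefix are hypothetical names, and the sketch assumes the target folder does not exist yet and sits on the same filesystem as the staging folder.

```python
import os
import shutil
import tempfile
from pathlib import Path


def fetch_into_cache(fetch, target_dir):
    """Run fetch(staging_dir) and publish to target_dir only if it succeeds.

    `fetch` is any callable that writes files into the folder it is given.
    If it raises (for example an ImportError from a missing progress bar
    dependency), the staging folder is removed and nothing is left behind.
    """
    target_dir = Path(target_dir)
    target_dir.parent.mkdir(parents=True, exist_ok=True)
    staging = Path(
        tempfile.mkdtemp(prefix=target_dir.name + ".partial-", dir=target_dir.parent)
    )
    try:
        fetch(staging)
        # Rename within the same parent folder; assumes target_dir is absent.
        os.replace(staging, target_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return target_dir
```

With this pattern, a failed first run leaves no empty cache folder for the second run to trip over, which is the state that produced the FileNotFoundError in this thread.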

This is as much time as I can invest in this. Please create two issues at the respective GitHub repositories.

hoxbro commented 2 years ago

I uninstalled iprogress because it was explicitly installed when creating the environment. iprogress is no longer supported and, from what I can see, it has been replaced by something similar in ipywidgets. I can only get the download to work with ipywidgets and not with iprogress. I don't understand why yours works with iprogress and mine doesn't.

I will file some bug reports later, so hopefully, new users (and me...) will be able to run the example without all these problems.

Thank you for helping me with finding the root of the problem, I really appreciate it!

michaelaye commented 2 years ago

Ah, I didn't even see that deliberate IProgress install in the mamba command. I redid the test without it:

- tqdm fails again (which is a bug, because if it needs something like that to run, it should add it to the conda-forge package dependencies). I then installed only ipywidgets this time, and that seems to cover whatever tqdm needs to run properly.
- Yes, I had to remove ~/.intake again, due to the second bug with intake's file management.

jbednar commented 2 years ago

Just to clarify, it sounds like there are some upstream issues to report, but it's ok that this issue was auto-closed when I merged #693? If so, fine, but if there are remaining issues above with hvplot for us to address, please reopen this issue and summarize what we need to do in hvplot. Thanks!

hoxbro commented 2 years ago

For reference, the problem with intake should be solved by https://github.com/intake/intake/pull/655

MarcSkovMadsen commented 2 years ago

My input regarding package dependencies: I think we should minimize them. Many users who could benefit from hvPlot etc. would not know about intake, parquet, s3fs etc., so they suddenly have to relate to 4 new packages instead of one.

Make the introduction and first half of the tutorial simple and familiar to most users.
