ContinuumIO / anaconda-package-data

Conda package download data
Creative Commons Attribution 4.0 International

Pandas/pyArrow/read_parquet error #47

Closed (phwuil closed this issue 11 months ago)

phwuil commented 11 months ago

As requested by @sophiamyang, I am passing on an issue I originally opened for condastats, since that package depends on the data pipeline in this repo:

Description

Unable to use condastats.cli.overall (internal error on pandas->pyArrow)

    dataconda = condastats.cli.overall([conda_module], monthly=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/condastats/cli.py", line 62, in overall
    df = dd.read_parquet(
         ^^^^^^^^^^^^^^^^
  File "[...]/python3.11/site-packages/dask/backends.py", line 138, in wrapper
    raise type(e)(
ValueError: An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: ArrowStringArray requires a PyArrow (chunked) array of string type

phwuil commented 11 months ago

I found the cause thanks to @nicrie: pandas<2.0.0 is required.
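
For anyone hitting the same error, here is a rough sketch of how to check whether the installed pandas falls under that constraint before calling condastats. The helper names `is_pre_2` and `check_pandas` are made up for illustration and are not part of condastats or pandas. The version parsing is deliberately naive: it only looks at the major number.

```python
from importlib import metadata


def is_pre_2(version: str) -> bool:
    """Return True if a version string such as '1.5.3' has a major number below 2."""
    # Hypothetical helper: only the leading major component is inspected.
    major = version.split(".", 1)[0]
    return int(major) < 2


def check_pandas() -> str:
    """Describe whether the installed pandas satisfies pandas<2.0.0."""
    try:
        installed = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return "pandas is not installed"
    if is_pre_2(installed):
        return f"pandas {installed} satisfies pandas<2.0.0"
    return f"pandas {installed} is too new; try: pip install 'pandas<2.0.0'"


if __name__ == "__main__":
    print(check_pandas())
```

If the check reports a pandas that is too new, pinning it in the environment (for example with `pip install 'pandas<2.0.0'`) is the workaround described above.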