ArgoCanada / argopandas

A pandas.DataFrame-based interface to Argo ocean float data
https://ArgoCanada.github.io/argopandas/

ValueError: Big-endian buffer not supported on little-endian compiler #10

Open gmaze opened 3 years ago

gmaze commented 3 years ago

I just installed argopandas with conda:

conda install --channel=conda-forge argopandas

and my first test fails like this:

import argopandas as argo
argo.prof.head(5).levels[['PRES', 'TEMP']]

returns the following error stack:

Downloading 5 files from 'https://data-argo.ifremer.fr/dac/aoml/13857/profiles'
Reading 5 files                                                       

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<timed eval> in <module>

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/argopandas/index.py in levels(self)
    291         the files in this index.
    292         """
--> 293         return self.levels_()
    294 
    295     def levels_(self, vars: Union[None, str, Iterable[str]]=None) -> pd.DataFrame:

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/argopandas/index.py in levels_(self, vars)
    301             to select all possible variables.
    302         """
--> 303         return self._data_frame_along('levels', vars=vars)
    304 
    305     @property

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/argopandas/index.py in _data_frame_along(self, attr, vars)
     67 
     68         # combine them, adding a `file` index as a level in the multi-index
---> 69         return pd.concat(objs, keys=keys, names=["file"])
     70 
     71     @property

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    305     )
    306 
--> 307     return op.get_result()
    308 
    309 

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/reshape/concat.py in get_result(self)
    530                 mgrs_indexers.append((obj._mgr, indexers))
    531 
--> 532             new_data = concatenate_managers(
    533                 mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy
    534             )

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/internals/concat.py in concatenate_managers(mgrs_indexers, axes, concat_axis, copy)
    224             fastpath = blk.values.dtype == values.dtype
    225         else:
--> 226             values = _concatenate_join_units(join_units, concat_axis, copy=copy)
    227             fastpath = False
    228 

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/internals/concat.py in _concatenate_join_units(join_units, concat_axis, copy)
    488     upcasted_na = _dtype_to_na_value(empty_dtype, has_none_blocks)
    489 
--> 490     to_concat = [
    491         ju.get_reindexed_values(empty_dtype=empty_dtype, upcasted_na=upcasted_na)
    492         for ju in join_units

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/internals/concat.py in <listcomp>(.0)
    489 
    490     to_concat = [
--> 491         ju.get_reindexed_values(empty_dtype=empty_dtype, upcasted_na=upcasted_na)
    492         for ju in join_units
    493     ]

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/internals/concat.py in get_reindexed_values(self, empty_dtype, upcasted_na)
    468         else:
    469             for ax, indexer in self.indexers.items():
--> 470                 values = algos.take_nd(values, indexer, axis=ax)
    471 
    472         return values

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/array_algos/take.py in take_nd(arr, indexer, axis, fill_value, allow_fill)
    106 
    107     arr = np.asarray(arr)
--> 108     return _take_nd_ndarray(arr, indexer, axis, fill_value, allow_fill)
    109 
    110 

~/anaconda/envs/argopy-tests-py38free/lib/python3.8/site-packages/pandas/core/array_algos/take.py in _take_nd_ndarray(arr, indexer, axis, fill_value, allow_fill)
    152         arr.ndim, arr.dtype, out.dtype, axis=axis, mask_info=mask_info
    153     )
--> 154     func(arr, indexer, out, fill_value)
    155 
    156     if flip_order:

pandas/_libs/algos_take_helper.pxi in pandas._libs.algos.take_2d_axis0_float32_float32()

ValueError: Big-endian buffer not supported on little-endian compiler

Here is a recap of my environment:

INSTALLED VERSIONS
------------------
python: 3.8.10 | packaged by conda-forge | (default, May 10 2021, 22:58:09) 
[Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 18.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.19.0
**pandas: 1.3.3**
numpy: 1.21.2
scipy: 1.7.1
fsspec: 2021.10.0
erddapy: 1.1.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.9.0
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.4.3
cartopy: 0.20.0
seaborn: 0.11.2
numbagg: None
gsw: 3.4.0
setuptools: 58.0.4
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 7.27.0

Any idea where this could come from? Thanks, g

paleolimbot commented 3 years ago

Not off the top of my head! I have no idea how a big-endian buffer would get in there. I develop on Mac but not using conda, so perhaps there's some difference in the pandas and/or pyarrow packaging that affects things. I'll look into this more, but in the meantime you could try various combinations of the 'manual' version to see if anything more meaningful pops up.

import argopandas as argo
import pandas as pd

files = argo.prof.head(5)
paths = ['dac/' + f for f in files.file]
levels = [ds.levels for ds in argo.nc(paths)]
pd.concat(levels, keys=files.file, names=['file'])

gmaze commented 3 years ago

Thanks for your quick answer! I've tested the above code, and indeed it worked 👍 So I guess this is coming from the .levels property ...
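
For anyone hitting the same error: a possible workaround, sketched here under assumptions, is to convert each per-file data frame to native byte order before calling pd.concat. The helper name to_native is hypothetical and not part of argopandas, and the thread does not confirm that the frames returned by ds.levels actually carry big-endian columns, so treat this as a diagnostic sketch rather than the fix.

```python
import sys

import numpy as np
import pandas as pd


def to_native(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` whose columns all use native byte order.

    Hypothetical helper, not part of argopandas. Assumes unique column names.
    """
    out = {}
    for name in frame.columns:
        values = frame[name].to_numpy()
        if not values.dtype.isnative:
            # astype() to the same type with "=" (native) byte order swaps
            # the stored bytes while preserving the numeric values
            values = values.astype(values.dtype.newbyteorder("="))
        out[name] = values
    return pd.DataFrame(out, index=frame.index)


# Build a column in the non-native byte order of the current machine
order = ">" if sys.byteorder == "little" else "<"
demo = pd.DataFrame({"PRES": np.array([1.0, 2.5], dtype=order + "f4")})

print(demo["PRES"].dtype.isnative)             # False
print(to_native(demo)["PRES"].dtype.isnative)  # True
```

With the manual version above, this would be applied as levels = [to_native(ds.levels) for ds in argo.nc(paths)] before the pd.concat call. Printing frame.dtypes for each element of levels first would show whether any column is reported with a non-native dtype such as >f4.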