data-engineering-collective / plateau

Flat files, flat land.
MIT License

CI Fixes #152

Closed IzerOnadimQC closed 5 months ago

IzerOnadimQC commented 5 months ago

CI failures are caused by the following package incompatibilities.

  1. https://github.com/dask/dask/issues/11038 Some versions of dask are incompatible with Python 3.11.9. We were accidentally upgrading to this Python version because of a --no-py-pin flag I added that we no longer need. Removing the flag fixes the CI failure, but it is not a true fix. Perhaps we should pin dask to avoid the problematic versions (I don't know which versions are affected, only that the fix went into 2024.4.1 and has been back-ported to 2024.2.1).
  2. pyarrow uses pandas.core.internals.DatetimeTZBlock, which will be removed in the next major pandas release and is already gone from the nightly builds. Newer versions of pyarrow already handle this by checking the pandas API version, but older versions of pyarrow fail. Perhaps this could be fixed by patching the repodata for pyarrow, but I'm uncertain whether this can be done for an optional dependency like pandas. For now, I've fixed this by testing the pyarrow and pandas nightly builds together, but an actual solution might be to drop support for older versions of pyarrow when pandas 3 is released.
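
To see which side of the second problem a given environment is on, a small feature-detection check can help. This is only a sketch: `has_attribute` is a hypothetical helper, not part of plateau, pyarrow, or pandas, and it probes for the attribute directly rather than checking the pandas API version the way newer pyarrow releases are described as doing.

```python
import importlib
import warnings


def has_attribute(module_name: str, attr_name: str) -> bool:
    """Return True if the module can be imported and exposes attr_name.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or its parent package) is not installed.
        return False
    with warnings.catch_warnings():
        # Ignore any warnings raised while looking up the attribute.
        warnings.simplefilter("ignore")
        return hasattr(module, attr_name)


if __name__ == "__main__":
    # False here means the installed pandas no longer ships the class
    # (or pandas is not installed at all).
    print(has_attribute("pandas.core.internals", "DatetimeTZBlock"))
```

Running this in each CI environment would show whether the installed pandas still provides the class that older pyarrow versions import.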

xhochy commented 5 months ago

You can also repodata patch optional dependencies via constraints in conda-forge.
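
As a rough illustration of the idea, a repodata record can carry a `constrains` list, which restricts the version of a package only when that package is installed, without making it a hard dependency. The sketch below is not the actual conda-forge patching code: `add_constraint`, the sample record, its `0.0.0` version, and the `pandas <3` spec are all assumptions chosen for the example.

```python
import copy


def add_constraint(record: dict, spec: str) -> dict:
    """Return a copy of a repodata record with spec added to "constrains".

    Hypothetical helper; the real patches live in conda-forge's own tooling.
    """
    patched = copy.deepcopy(record)
    constrains = patched.setdefault("constrains", [])
    if spec not in constrains:
        # Avoid duplicating a constraint that is already present.
        constrains.append(spec)
    return patched


# Made-up record standing in for an older pyarrow build (placeholder version).
record = {"name": "pyarrow", "version": "0.0.0", "depends": ["python"]}

# Illustrative constraint: only applies if pandas is installed alongside.
patched = add_constraint(record, "pandas <3")
print(patched["constrains"])
```

The original record is left untouched, so the patch can be reviewed as a diff between the two dictionaries.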