conda-incubator / conda-store

Data science environments, for collaboration. ✨
https://conda.store
BSD 3-Clause "New" or "Revised" License

[BUG] - Environment builds locally, fails on conda-store #746

Closed rsignell closed 6 months ago

rsignell commented 8 months ago

Describe the bug

A conda environment that builds locally on Linux fails to build on conda-store.

Expected behavior

Conda environments that build locally on Linux should also build on conda-store.

How to Reproduce the problem?

Building this environment:

channels:
  - conda-forge
dependencies:
  - python=3.11
  - adios-db
  - adlfs
  - aiobotocore
  - birdy
  - black
  - bokeh
  - boto3
  - cdsapi
  - cf_xarray
  - cfgrib
  - cfunits
  - climpred
  - coiled
  - curl
  - dask-geopandas
  - dask
  - dask-gateway==2023.9.0
  - datashader
  - datacube
  - dask-geopandas
  - depfinder
  - distributed
  - earthaccess
  - earthdata
  - erddapy
  - fastparquet
  - flox
  - folium
  - gcsfs
  - gdal
  - gdptools
  - geocube
  - geogif
  - geolinks
  - geopandas
  - geopy
  - geoviews
  - graphviz
  - h5netcdf
  - h5py
  - hologridgen
  - htop
  - hvplot
  - imagecodecs
  - intake-geopandas
  - intake-parquet
  - intake-stac
  - intake-xarray
  - ipykernel
  - ipyleaflet
  - ipywidgets
  - isort
  - jinja2
  - jupyter_bokeh
  - jupyter-panel-proxy
  - jupyterlab_code_formatter
  - jupytext
  - jq
  - leafmap
  - lxml
  - lz4
  - mamba
  - metpy
  - nbgitpuller
  - nbstripout
  - nco
  - netcdf4 == 1.6.0
  - numba
  - numcodecs
  - odc-algo
  - odc-stac
  - openpyxl
  - osmnx
  - owslib
  - pandas == 1.5.3
  - panel
  - pangeo-forge-recipes
  - papermill
  - param
  - pip
  - pint-xarray
  - planetary-computer
  - pyarrow
  - pyepsg
  - pygeohydro
  - pydaymet
  - pynhd
  - py3dep
  - pygeoogc
  - pygeoutils
  - pynco
  - pyogrio
  - pyvista
  - async_retriever
  - pystac
  - pystac-client
  - python-graphviz
  - python-snappy
  - pyyaml
  - python-gist
  - rasterio
  - rechunker
  - requests
  - rich
  - rio-cogeo
  - rioxarray
  - scikit-image
  - s3fs
  - seawater
  - selenium
  - siphon
  - spatialpandas
  - stackstac
  - toolz
  - ujson
  - unzip
  - utide
  - vim
  - wgrib2
  - xagg
  - xarray-spatial
  - xarray_leaflet
  - xbitinfo-python
  - xesmf
  - xoak
  - xrviz
  - xskillscore
  - zip
  - zstandard
  - cdsdashboards-singleuser>=0.5.6
  - xmip
  - intake-esm
  - xarray-datatree
  - h5pyd
  - noaa-coops
  - kbatch
  - xstac
  - zarr
  - jupyter-book
  - ghp-import
  - jsonschema-with-format-nongpl
  - webcolors
  - xarrayutils
  - pip:
      - kerchunk @ git+https://github.com/fsspec/kerchunk@main
      - fsspec @ git+https://github.com/fsspec/filesystem_spec@master
      - opendrift @ git+https://github.com/OpenDrift/opendrift@master
  - ipykernel

fails with the output below.

Output

starting build of conda environment 2024-01-24 20:41:24.640627 UTC
Traceback (most recent call last):
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/puzzle/solver.py", line 233, in _solve
    result = resolve_version(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/mixology/__init__.py", line 7, in resolve_version
    return solver.solve()
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/mixology/version_solver.py", line 83, in solve
    self._propagate(next)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/mixology/version_solver.py", line 123, in _propagate
    root_cause = self._resolve_conflict(incompatibility)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/mixology/version_solver.py", line 321, in _resolve_conflict
    raise SolveFailure(incompatibility)
conda_lock._vendor.poetry.mixology.failure.SolveFailure: Because opendrift (rev master) depends on adios_db (>=1.1) which doesn't match any versions, opendrift is forbidden.
So, because -dummy-package- depends on opendrift (rev master), version solving failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/build.py", line 132, in build_conda_environment
    context = action.action_solve_lockfile(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/action/base.py", line 31, in wrapper
    action_context.result = f(action_context, *args, **kwargs)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/action/generate_lockfile.py", line 23, in action_solve_lockfile
    run_lock(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/conda_lock.py", line 1092, in run_lock
    make_lock_files(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/conda_lock.py", line 394, in make_lock_files
    fresh_lock_content = create_lockfile_from_spec(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/conda_lock.py", line 821, in create_lockfile_from_spec
    deps = _solve_for_arch(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/conda_lock.py", line 752, in _solve_for_arch
    pip_deps = solve_pypi(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/pypi_solver.py", line 351, in solve_pypi
    result = s.solve(use_latest=to_update)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/puzzle/solver.py", line 65, in solve
    packages, depths = self._solve(use_latest=use_latest)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_lock/_vendor/poetry/puzzle/solver.py", line 241, in _solve
    raise SolverProblemError(e)
conda_lock._vendor.poetry.puzzle.exceptions.SolverProblemError: Because opendrift (rev master) depends on adios_db (>=1.1) which doesn't match any versions, opendrift is forbidden.
So, because -dummy-package- depends on opendrift (rev master), version solving failed.

Versions and dependencies used.

conda 23.3.1
Linux (ubuntu 22.04.2 LTS)
Node.js v20.8.1

Anything else?

Note that adios-db is specified in the conda environment, yet somehow conda-store can't find it?

pavithraes commented 8 months ago

Thanks for opening an issue, Rich. :)

So, the issue seems to be in creating the lockfile and not the environment itself.

conda-store uses conda-lock to create the lockfile, and trying to do this locally fails with the same message for me.

To create a lockfile, make sure you have conda-lock installed:

```bash
conda install -c conda-forge conda-lock
```

Then run:

```bash
conda-lock -f
```

Moreover, the traceback suggests this is coming from Poetry. Note that conda-lock uses Poetry to resolve pip dependencies and conflicts. I see that adios-db is not available on PyPI (but is required by opendrift in the pip section), which is probably why Poetry is raising this error.

For instance, the following spec also fails to create a lockfile with the same error:

channels:
  - conda-forge
dependencies:
  - python=3.11
  - ipykernel
  - adios_db
  - pip
  - pip:
      - opendrift @ git+https://github.com/OpenDrift/opendrift@master
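
To illustrate the split between the conda section and the pip section of such a spec, here is a minimal Python sketch. It is not conda-lock's actual code: the helper name `split_dependencies` is hypothetical, and the spec is written as a plain dict rather than parsed from YAML. It only shows which requirements would land on each side if the two lists were solved separately.

```python
def split_dependencies(spec):
    """Separate conda requirements from the nested ``pip:`` requirements.

    Hypothetical helper for illustration only; conda-lock's real
    parsing and solving logic is more involved.
    """
    conda_deps, pip_deps = [], []
    for dep in spec.get("dependencies", []):
        if isinstance(dep, dict):
            # A mapping entry such as {"pip": [...]} holds the pip requirements.
            pip_deps.extend(dep.get("pip", []))
        else:
            conda_deps.append(dep)
    return conda_deps, pip_deps


# The reduced spec from this comment, written as a Python dict.
spec = {
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.11",
        "ipykernel",
        "adios_db",
        "pip",
        {"pip": ["opendrift @ git+https://github.com/OpenDrift/opendrift@master"]},
    ],
}

conda_deps, pip_deps = split_dependencies(spec)
print("conda side:", conda_deps)
print("pip side:  ", pip_deps)
```

In this sketch `adios_db` appears only on the conda side. A pip-side resolver that is not told about the conda packages would have to find `adios_db` on PyPI to satisfy opendrift, which matches the error message above.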

Speaking for conda-store, failing early here is a good thing, IMO. We want to promise reproducibility+reliability, which we can't for this env even for local builds.

I don't understand the solving mechanism well enough to know why conda-lock is not using the adios_db available on conda-forge; maybe @jaimergp can help answer. :)

trallard commented 7 months ago

@jaimergp gentle ping here please 🙏

jaimergp commented 7 months ago

The source of confusion here might be that users expect conda-store to use the regular conda env create -f invocation, while it actually uses conda-lock to generate a lockfile. So, in that regard, I'm inclined to say that this is an issue on conda-lock's side, not conda-store's.

Should we provide a fallback to conda env create if conda-lock fails? That's a separate question.

Another issue to bring up here, for the sake of reproducibility, is that the environment pins to @master, which is a moving target. So maybe it would be better to pin the pip requirements to specific git hashes or tags.
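
The concern about branch pins can be checked mechanically. The following is a rough Python sketch, not part of conda-store or conda-lock: the helper names `git_ref` and `moving_pins`, the `MOVING_REFS` set, and the 40-character commit hash in the example are all made up for illustration. It flags git requirements that point at a commonly moving branch name or at no ref at all. It cannot tell a tag from a branch by name alone, so anything it does not flag still deserves a manual look.

```python
from typing import List, Optional

# Branch names that commonly move over time. This set is an assumption
# chosen for illustration, not an exhaustive or official list.
MOVING_REFS = {"main", "master", "develop", "HEAD"}


def git_ref(requirement: str) -> Optional[str]:
    """Return the ref of a 'name @ git+URL@ref' requirement.

    Returns None for non-git requirements and "" when the git URL
    carries no explicit ref (which means the default branch).
    """
    _, sep, url = requirement.partition("git+")
    if not sep:
        return None
    url = url.split("#", 1)[0].strip()    # drop '#egg=' or '#subdirectory=' fragments
    rest = url.split("://", 1)[-1]        # drop the scheme
    _, _, path = rest.partition("/")      # drop the host (and any 'user@host' part)
    if "@" not in path:
        return ""
    return path.rsplit("@", 1)[1]


def moving_pins(requirements: List[str]) -> List[str]:
    """Return the git requirements that are pinned to a moving target."""
    flagged = []
    for req in requirements:
        ref = git_ref(req)
        if ref is not None and (ref == "" or ref in MOVING_REFS):
            flagged.append(req)
    return flagged


reqs = [
    "kerchunk @ git+https://github.com/fsspec/kerchunk@main",
    "fsspec @ git+https://github.com/fsspec/filesystem_spec@master",
    "opendrift @ git+https://github.com/OpenDrift/opendrift@master",
    # Hypothetical commit hash, used only to show a pinned requirement.
    "example @ git+https://github.com/example/example@0123456789abcdef0123456789abcdef01234567",
]

for req in moving_pins(reqs):
    print("moving target:", req)
```

Run against the three pip requirements from the original environment, this flags all of them, since each one points at `main` or `master`.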

Also, I'm not sure whether conda-lock exposes the contents of the conda packages to pip. It feels like the pip solve is done in isolation, which is a departure from how conda env operates: conda env solves and installs the conda packages first, and then runs pip on top of that environment. That's why pip can find adios-db there but conda-lock cannot. So maybe one workaround is to also include the adios-db component in the pip: section? Not sure about that, but worth a try.

ocefpaf commented 7 months ago

@rsignell I probably made a mistake in my tests where I got this env working with conda-lock at some point. Maybe I changed the env file. However, @jaimergp is correct, in that form conda-lock will invoke poetry and it will error out with:

conda_lock._vendor.poetry.repositories.exceptions.PackageNotFound: Package adios-db (1.1.1) not found.

That is a bug in conda-lock b/c the conda version of that package is installed and available but the poetry wrapper in conda-lock cannot pass that information to it. With that said, poetry is choking b/c adios-db is not available on PyPI, making the package that requires it uninstallable via pip. Here is what we can try:

trallard commented 7 months ago

Hey @ocefpaf 👋🏽

Agree the third option looks like the easiest/least friction path for now