conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/

Some dependencies are renamed if conda-lock is used with a pyproject.toml file. #508

Open os-simopt opened 9 months ago

os-simopt commented 9 months ago

What happened?

Using conda-lock v2.2.1, I encountered the following strange behaviour:

I am using a pyproject.toml file, where my dependencies are defined like this:

[project]
dependencies = [
    "yaml",
    "numpy",
    "pandas",
    "python-dateutil",
    "pytables",
    "xarray",
    "matplotlib",
    "tensorflow",
    "seaborn",
]

[tool.conda-lock]
channels = [
    'conda-forge',
]
platforms = [
    'linux-64'
]

I had never encountered any problems before, but now, with the seaborn package, conda-lock renames seaborn to seaborn-split and then complains that it can't find that package.

Using a dev.yml file, everything works fine.

mfisher87 commented 9 months ago

I dug around in the source and found that this is where the mapping comes from: https://raw.githubusercontent.com/regro/cf-graph-countyfair/master/mappings/pypi/grayskull_pypi_mapping.yaml

seaborn:
  conda_name: seaborn-split
  import_name: seaborn
  mapping_source: regro-bot
  pypi_name: seaborn

Conda package names can't be guaranteed to match pypi package names, so a mapping is needed. I don't know much more detail about this, but it looks like the entry for seaborn is wrong!
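
To illustrate how an entry like this leads to the rename, here is a minimal sketch of a name lookup. It is not conda-lock's actual code: the `pypi_to_conda_name` helper and the `GRAYSKULL_MAPPING` dict are hypothetical, and the entry is copied by hand from the snippet above rather than loaded from the YAML file.

```python
# The entry quoted above, written out as a plain dict keyed by PyPI name.
# Assumption: the real file is YAML; it is hard-coded here so the sketch
# has no third-party dependencies.
GRAYSKULL_MAPPING = {
    "seaborn": {
        "conda_name": "seaborn-split",
        "import_name": "seaborn",
        "mapping_source": "regro-bot",
        "pypi_name": "seaborn",
    },
}


def pypi_to_conda_name(pypi_name, mapping):
    """Hypothetical helper: return the mapped conda name, or the PyPI name unchanged."""
    entry = mapping.get(pypi_name)
    return entry["conda_name"] if entry else pypi_name


# seaborn has an entry, so it is renamed; numpy is absent from this toy
# dict, so it falls through unchanged.
print(pypi_to_conda_name("seaborn", GRAYSKULL_MAPPING))
print(pypi_to_conda_name("numpy", GRAYSKULL_MAPPING))
```

With the entry above, any tool doing a lookup like this would ask the solver for seaborn-split, which matches the error in the report.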

EDIT: Digging a bit deeper...

Here's the README for that data file: https://github.com/regro/cf-graph-countyfair/blob/master/mappings/pypi/README.md

The static mapping file linked from there contains the following mapping for seaborn, which I think is correct because it follows the same pattern as matplotlib-base:

- pypi_name: seaborn
  import_name: seaborn
  conda_name: seaborn-base

I would expect this static mapping to be the primary source of truth, but perhaps it's being overwritten by regro-bot's mapping. In any case there is no seaborn-split feedstock, so I'd open an issue over on cf-scripts: https://github.com/regro/cf-scripts/issues
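
As a rough sketch of the precedence I would expect (not how cf-scripts actually builds the file, which I haven't verified), static entries would simply replace bot-generated ones that share a PyPI name. The `merge_mappings` function is hypothetical, and the two entries are trimmed copies of the snippets above.

```python
def merge_mappings(bot_entries, static_entries):
    """Hypothetical merge: static entries win over bot entries with the same pypi_name."""
    merged = {entry["pypi_name"]: entry for entry in bot_entries}
    for entry in static_entries:
        # A static entry overwrites whatever the bot produced for that name.
        merged[entry["pypi_name"]] = entry
    return merged


# Trimmed copies of the two seaborn entries quoted in this thread.
bot = [{"pypi_name": "seaborn", "import_name": "seaborn", "conda_name": "seaborn-split"}]
static = [{"pypi_name": "seaborn", "import_name": "seaborn", "conda_name": "seaborn-base"}]

merged = merge_mappings(bot, static)
print(merged["seaborn"]["conda_name"])
```

Under that assumption the published mapping would say seaborn-base, so the fact that it says seaborn-split suggests the static file is not being applied this way.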

mfisher87 commented 9 months ago

Looks like this has been reported before, resolved, and then regressed. https://github.com/regro/cf-scripts/issues/1513

maresb commented 9 months ago

Thanks so much @mfisher87 for jumping in on this! You're spot-on. I wonder why this regressed.

maresb commented 9 months ago

For a little more context, the mapping uses some graph-theoretic algorithms to make an intelligent guess at what the best package is. It's really confusing, because the "mapping" is actually a 3-way relation between PyPI name, conda-forge name, and import name.

The mapping is regenerated every hour or so. From the logs we see that there are two candidates for Seaborn. Not sure why the mapping isn't picking this up:

needs seaborn <- provided_by: ['seaborn-split', 'seaborn-base'] : chosen seaborn-split
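
For anyone who wants to scan the logs for similar cases, here is a small sketch that pulls the candidates and the chosen package out of a line like the one above. The line format is inferred from this single example, so the regex is an assumption, and `parse_log_line` is a hypothetical helper rather than part of any existing tool.

```python
import ast
import re

LOG_LINE = (
    "needs seaborn <- provided_by: ['seaborn-split', 'seaborn-base'] "
    ": chosen seaborn-split"
)

# Assumed format: "needs NAME <- provided_by: [LIST] : chosen NAME"
PATTERN = re.compile(
    r"needs (?P<name>\S+) <- provided_by: (?P<candidates>\[.*?\]) : chosen (?P<chosen>\S+)"
)


def parse_log_line(line):
    """Hypothetical helper: return the parsed fields, or None if the line doesn't match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "name": match["name"],
        # The candidate list is printed as a Python list literal.
        "candidates": ast.literal_eval(match["candidates"]),
        "chosen": match["chosen"],
    }


parsed = parse_log_line(LOG_LINE)
print(parsed)
```

Filtering for lines with more than one candidate would show which other packages depend on the same kind of guess.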

After cloning locally and blaming, I can see that the mapping for Seaborn has been broken for a long time (as far back as the Git history goes). My guess is that adding the static mapping entry never fixed the problem.