prefix-dev / pixi

Package management made easy
https://pixi.sh
BSD 3-Clause "New" or "Revised" License
3.33k stars, 187 forks

Install dependencies of pypi package from conda preferentially #532

Open olivier-lacroix opened 11 months ago

olivier-lacroix commented 11 months ago

Problem description

Hello there,

Being able to install packages from pypi is great, as not everything is available on conda-forge. However, more often than not, the dependencies of these packages are.

Currently, if a pypi dependency is added, its dependencies will be pulled from pypi as well, which is similar to what the typical mamba + pip workflow would do.

It would be great if pixi, solving for both conda & pypi, was able to install dependencies of pypi packages preferentially from conda channels.

Cheers,

Olivier

ruben-arts commented 11 months ago

Hey @olivier-lacroix, we would love this feature as well. We're working towards it and we think it is technically feasible, but as of right now we can't provide a roadmap for it.

TTTPOB commented 11 months ago

I want this feature as well, but I think there are some difficulties in mapping pypi packages to conda ones, because they may have different naming conventions. I can't expect anyone to manually maintain such a large number of package name mappings. Could there be some automatic way? It seems there is no metadata in conda packages that refers to the equivalent pypi package.

ruben-arts commented 11 months ago

It might be compute and network intensive, but you could download all conda packages and test which PyPI packages are available in each conda package, as a conda package might contain more than one PyPI package.
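A minimal sketch of that detection step, assuming the file listing of a conda package is already available and that Python distributions inside it were installed with a standard `<name>-<version>.dist-info` directory. The function name `pypi_names_in_conda_package` is hypothetical and this is not pixi's implementation; a real tool would more likely read the `Name` field of each `METADATA` file than rely on directory names.

```python
import re
from pathlib import PurePosixPath

# Matches wheel-style metadata directories such as "cli_helpers-2.3.0.dist-info".
_DIST_INFO = re.compile(r"^(?P<name>[^-]+)-(?P<version>[^-]+)\.dist-info$")


def normalize(name: str) -> str:
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pypi_names_in_conda_package(paths):
    """Return the sorted PyPI distribution names found in a conda package.

    `paths` is an iterable of POSIX-style file paths from the package.
    Only `.dist-info` directories are recognized (an assumption of this sketch).
    """
    names = set()
    for path in paths:
        for part in PurePosixPath(path).parts:
            match = _DIST_INFO.match(part)
            if match:
                names.add(normalize(match.group("name")))
    return sorted(names)
```

Because the result is a list, a conda package that bundles several Python distributions yields several names, which matches the one-to-many case mentioned above.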

We currently have some code that depends on a map that is automatically created by cf-graph-countyfair, from the conda forge logic. This also only works with conda-forge and not the other channels.

@baszalmstra was already thinking of ways of doing this. Maybe he could elaborate some more.

olivier-lacroix commented 11 months ago

@TTTPOB agreed an automagical mapping would be best, but to be honest, I would be happy to have this feature with some manual steps required.

Indeed, in my experience, names are very often the same, and existing mappings (e.g. from grayskull) cover a fair few discrepancies already.

For the cases not covered by the above, having a way to manually indicate the mapping / force the source would be completely acceptable. This is what conda-lock does, in its [tool.conda-lock.dependencies] section, which allows you to:

This overrides conda-lock's general behaviour, in which dependencies are pypi package names tentatively resolved as conda names (after translation via a mapping of known discrepancies).

I have found this approach by exception worked pretty well in practice.
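The "resolve as conda by default, override by exception" idea can be sketched roughly as follows. This is an illustration only, not conda-lock's code: `KNOWN_DISCREPANCIES`, `resolve`, and the `pypi_only` parameter are hypothetical names, and the two table entries are just examples of packages whose PyPI and conda-forge names differ.

```python
import re

# Illustrative table of PyPI -> conda name discrepancies (not exhaustive).
KNOWN_DISCREPANCIES = {
    "torch": "pytorch",
    "tables": "pytables",
}


def normalize(name: str) -> str:
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def resolve(pypi_name, mapping=KNOWN_DISCREPANCIES, pypi_only=frozenset()):
    """Return a (source, name) pair for one dependency.

    Names listed in `pypi_only` (already normalized) are forced to PyPI;
    everything else is tentatively treated as a conda package, after
    translating known name discrepancies.
    """
    key = normalize(pypi_name)
    if key in pypi_only:
        return ("pypi", key)
    return ("conda", mapping.get(key, key))
```

With this shape, only the exceptions need to be written down by hand; names that are identical on both sides pass through unchanged.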

olivier-lacroix commented 7 months ago

Logging my use case for this "mapping to conda packages", from #999

I think this situation is pretty common for data-science and ML use cases

So basically, to make things work, we pull as much from conda as possible, and the rest from pypi (internal packages and packages unavailable in conda-forge). We currently do this with conda-lock, which does that mapping for us (but for a limited set of packages only available in pypi that are excluded from the mapping).

Ideally,

An alternative would be to go full-on conda, but that would require us to:

olivier-lacroix commented 7 months ago

I wouldn't mind taking a stab at this at some point. Keen to hear how hard / involved you think this would be, @ruben-arts, and any pointers/guidance/thoughts you may have on how to modify the dependency resolution / lock file generation process to perform that pypi->conda translation.

ruben-arts commented 7 months ago

We're working on a new map implementation (mapping tool, pixi PR), without a front-end for the user but this is probably something we should also start thinking about. If you have a design proposal for pixi please share it but the actual work would be nicer when we have a better initial map which we can use in conjunction with this feature.

abkfenris commented 7 months ago

Maybe it could be combined with dependency version overrides: #620

[dependencies.overrides]
numpy = "<2" # Constrains version but installed from wherever the parent is from
cli-helpers = { conda = "cli_helpers"}  # Maps to a specific name in Conda channel
xarray = { version = ">2024.0.0", conda = true } # Install from Conda and constrain version

maresb commented 3 months ago

I ended up creating a duplicate of this as https://github.com/prefix-dev/pixi/issues/1646.

Copied here to keep everything in one place:

Original Discord discussion

Currently pixi interprets project.dependencies as tool.pixi.pypi-dependencies. (At least I think so, but I'm not sure of the details. Does it also read from project.dependencies when using pixi.toml?) In order to install these dependencies instead with conda, they currently need to be duplicated to tool.pixi.dependencies.

My request is to add pixi.toml settings that accomplish the following:

  1. Have a setting so that everything in project.dependencies is by default interpreted as a conda-forge dependency instead of a PyPI dependency
  2. Be able to make an exception for individual packages, i.e. everything under project.dependencies should be a conda-forge dependency except for my-special-package-1 and my-special-package-2.
  3. Be able to override the default pypi→conda-forge mapping in case it's wrong
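To make the three requested settings concrete, here is a rough sketch of how a list of `project.dependencies` entries could be split into conda and PyPI tables. All names (`split_dependencies`, `default_to_conda`, `pypi_exceptions`, `name_overrides`) are hypothetical and do not correspond to existing pixi options; the parsing is deliberately simplified and ignores extras and environment markers.

```python
import re

# Simplified requirement parser: a name followed by an optional version specifier.
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def normalize(name: str) -> str:
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def split_dependencies(requirements, *, default_to_conda=True,
                       pypi_exceptions=(), name_overrides=None):
    """Split requirement strings into (conda, pypi) name -> spec dicts.

    1. `default_to_conda`: treat every entry as a conda dependency by default.
    2. `pypi_exceptions`: names that must stay on PyPI.
    3. `name_overrides`: corrections to the PyPI -> conda name mapping.
    """
    overrides = name_overrides or {}
    exceptions = {normalize(n) for n in pypi_exceptions}
    conda, pypi = {}, {}
    for requirement in requirements:
        match = _REQ.match(requirement)
        if match is None:
            raise ValueError(f"cannot parse requirement: {requirement!r}")
        name = normalize(match.group(1))
        spec = match.group(2).strip() or "*"  # "*" stands for "any version"
        if default_to_conda and name not in exceptions:
            conda[overrides.get(name, name)] = spec
        else:
            pypi[name] = spec
    return conda, pypi
```

For example, calling it with `["numpy>=1.26", "torch", "my-special-package-1==0.1"]`, `pypi_exceptions=["my-special-package-1"]` and `name_overrides={"torch": "pytorch"}` would place `numpy` and `pytorch` in the conda table and leave `my-special-package-1` on PyPI.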

Advanced considerations: