conda / conda-lock

Lightweight lockfile for conda environments
https://conda.github.io/conda-lock/
Other
457 stars 102 forks source link

feat: add pypi-to-conda-name overrides to pyproject parsing #549

Open tlambert03 opened 7 months ago

tlambert03 commented 7 months ago

Description

Closes #548 by adding support for a pypi-to-conda-name section in pyproject.toml

[tool.conda-lock.pypi-to-conda-name]
cupy-cuda11x = "cupy"

names present there will override mappings provided by grayskull

Note: in #548, I proposed accepting a string or list of strings (which I do think is a good idea), but the implications of accepting a list of strings breaks the current 1-to-1 mapping behavior between pypi and conda, so I think that's probably best handled in a followup PR. This PR only accepts single string

netlify[bot] commented 7 months ago

Deploy Preview for conda-lock ready!

Name Link
Latest commit 4d10fc9bd26b2ae544afd9c72276db43a2f5229d
Latest deploy log https://app.netlify.com/sites/conda-lock/deploys/65d8c7fd0a41100008f8b56d
Deploy Preview https://deploy-preview-549--conda-lock.netlify.app
Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

tlambert03 commented 7 months ago

ready for a first pass @maresb

tlambert03 commented 7 months ago

i didn't yet add a command line option, but could also do that if you want to tell me the syntax you'd prefer. perhaps something like --pypi-to-conda=cupy-cuda11x:cupy? dunno... could get hairy :)

maresb commented 7 months ago

i didn't yet add a command line option

A CLI option doesn't seem to me like a good interface. If you have pip deps, then usually they are coming from a pyproject.toml, so it makes sense to do all the configuration there, right?

tlambert03 commented 7 months ago

yeah, totally fine with me to leave out the cli option

tlambert03 commented 7 months ago

seem to be some test failures... but I haven't been able to reproduce them locally (macos)

maresb commented 7 months ago

This looks pretty good.

I'm somewhat horrified by the way that lookup.py currently works. I cleaned up a few of the worst things a while back, but didn't go nearly far enough. If you feel similarly, then please have at it! :grin: (Stuff like trying to get rid of LOOKUP_OBJECT, adding an __init__() to _LookupLoader, etc.)

A lot of this feels a bit overly-fancy to me like ChainMap and Mapping. I personally prefer working with dicts, so I'd instead do something like

return {**self.local_mappings, **self.remote_mappings}

I think it's just less mental effort to understand concrete types. I won't request you to change it, but in case you do make further changes, it'd be nice to go towards that concrete/simple direction.

tlambert03 commented 7 months ago

I think it's just less mental effort to understand concrete types. I won't request you to change it, but in case you do make further changes, it'd be nice to go towards that concrete/simple direction.

happy to change it as you prefer! will have a look at other lookup.py cleanup too

maresb commented 7 months ago

Haha, I think at least some of the CI failures may be a timing issue. Look what just dropped:

https://github.com/pandas-dev/pandas/releases/tag/v2.1.3

There's no corresponding conda-forge package yet. Maybe some of the tests need pins.

tlambert03 commented 7 months ago

ha!

tlambert03 commented 7 months ago

made some slight adjustments to the module. got rid of the module-level global and removed a duplicated function in pyproject_toml.py. I could probably go father and remove the _LookupLoader altogether, using lru_cache on a function call rather than cached_property on the class. but didn't want to be overly aggressive in my first PR :joy:

mariusvniekerk commented 7 months ago

Might be worth adding in a test to ensure that this functionality doesn't regress

tlambert03 commented 7 months ago

Might be worth adding in a test to ensure that this functionality doesn't regress

added, thanks!

maresb commented 7 months ago

Thanks @tlambert03 for adding the tests! Unfortunately they don't seem to be passing yet.

tlambert03 commented 7 months ago

thanks. yeah I saw that ... i think there's a fixture/state leakage somewhere (perhaps in the resolver). It works in isolation. need to figure out what other test it's conflicting iwth

maresb commented 7 months ago

Ah, ya, no pressure. It's always such a pain when tests run locally but not in CI. Good luck and thanks for all your work!

tlambert03 commented 7 months ago

figured it out. hopefully should pass now 🤞

maresb commented 7 months ago

At first glance this looks really great! Unfortunately I'm really slammed with work at the moment, so it might be a bit before I can give this a proper look.

tlambert03 commented 7 months ago

totally fine. no urgency on my end.
I might play a bit more with cleaning up lookup.py a bit more or implementing a {pypi_name -> list[conda_name]} interface like the originally proposed schema. but I'll leave this PR in a "mergable" state and likely do those elsewhere. anway, take your time :)

tlambert03 commented 7 months ago

Also, I'd prefer to leave out the MappingEntry class until the next PR when we actually implement it. (I don't like having the unused code laying around.)

not sure I'm following this comment. That class is already there. Are you proposing we remove it and re-add it later?

edit: with the removal of the conda_forge key, MappingEntry isn't really necessary at all. It could be replaced with a single dict mapping pypi-name to conda-name(s). The main question is how you want to handle the reverse lookup. If a pypi name can be converted to multiple conda packages, that makes the mapping non invertible

maresb commented 7 months ago

not sure I'm following this comment. That class is already there. Are you proposing we remove it and re-add it later?

Ah, my bad, I was confused. I thought you added it since it matched with one of your initial proposals. Also this was reinforced by observing that it seemed to be extraneous. But now I remember that it was originally the schema for an item in the mapping.

As a sort of documentation to make this clear, it would be nice to replace

lookup = {canonicalize_name(k): v for k, v in lookup.items()}
for v in lookup.values():
    v["pypi_name"] = canonicalize_name(v["pypi_name"])
return lookup

with something like:

result: Dict[NormalizedName, MappingEntry] = {}
for k, v in lookup.items():
    assert k == v["pypi_name"]
    pypi_name = canonicalize_name(k)
    assert pypi_name == k
    conda_name = v["conda_name"]
    assert pypi_name not in result
    result[pypi_name] = conda_name
return result

The main question is how you want to handle the reverse lookup. If a pypi name can be converted to multiple conda packages, that makes the mapping non invertible

Good question. The whole mapping story is very messy. But currently there is only instance of this happening:

from collections import defaultdict

from conda_lock.lookup import _lookup_loader

pypi_names = defaultdict(set)
for k, v in _lookup_loader.remote_mappings.items():
    pypi_names[v["conda_name"]].add(k)
print({k: v for k, v in pypi_names.items() if len(v) > 1})
# {'pyqt': {'pyqt4', 'pyqt5'}}

My suggestion would be to emit a warning and then take a guess. The user could always define a local mapping to break the tie, if local mappings were given priority. :wink:

anibali commented 4 months ago

Is there anything I can do to help this along? I would really appreciate the ability to disable the default mappings and just manually specify them from pyproject.toml (I find this a lot more convenient than the recent solution of having a separate mapping file provided via a command line argument). I was going to write my own PR, but stumbled across this in my preliminary search.

My initial thought was something along the lines of:

[tool.conda-lock.dependencies]
pytorch-scatter = {pypi_name = "torch-scatter"}

It would also offer a neat way of being explicit about when the PyPI version of a package is preferred. For example, say I want to install opencv-python from PyPI instead of the conda-forge package:

[tool.conda-lock.dependencies]
opencv = {source = "pypi", pypi_name = "opencv-python"}

I really enjoy using conda-lock, and would love to see the PyPI <-> Conda mapping handling improved by removing the need for another external file.

tlambert03 commented 4 months ago

i think the core functionality of adding [tool.conda-lock.pypi-to-conda-name] was all set to go, then we got sidetracked on cleaning up the existing module patterns. I'll resolve conflicts here so it can be merged, but probably don't have time to keep cleaning up the previous patterns.

tlambert03 commented 4 months ago

conflicts resolved. @maresb, looks like the conflicts, which resulted from #588, were actually already fixed in here. So, would it be ok with you to merge this and work on cleaning up the legacy lookup.py in some other PR? i think this is getting caught up in scope creep.

maresb commented 4 months ago

@tlambert03, sorry for letting this slip. Do you understand why the tests are failing though?