Closed BlueskyFR closed 1 year ago
A solution would be to have a post-install hook that runs `light-the-torch`, but the `torch` installed with it would have to be frozen afterwards to ensure it is not replaced by a subsequent `poetry install`.
Duplicate #2145 -- this is in general pretty out of scope for Poetry as things currently exist, but possible with a plugin. Painful, but possible, and we can gradually introduce hooks to make this sort of thing easier.
Keep in mind that this is, in fact, highly specific to PyTorch -- there is no standard convention for distributing wheels built against different ML APIs. PyTorch does it one way, and other ML packages do it in other ways. A standard for describing and reasoning about wheel compatibility is needed for support in Poetry beyond a package-specific plugin, as the `+cu111` et al. convention is just that -- ad hoc (mis)use of local versions that existing tools interact with in sometimes unexpected (to those not familiar with Python packaging) ways.
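To make the point about local versions concrete, here is a minimal standard-library sketch of how a PEP 440 local version label such as `+cu111` separates from the public version. The helper name `split_local` is hypothetical and the code deliberately ignores the rest of the PEP 440 grammar; it is not how Poetry or pip parse versions. What it illustrates is that the label is an opaque string: nothing in the version scheme says it identifies a CUDA build.

```python
from typing import Optional, Tuple


def split_local(version: str) -> Tuple[str, Optional[str]]:
    """Split 'X.Y.Z+label' into (public version, local label or None).

    Hypothetical helper for illustration only; real tools use a full
    PEP 440 parser rather than a string split.
    """
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


# The label after '+' carries no standardized meaning for installers.
for v in ("1.12.1", "1.12.1+cu111", "1.12.1+cpu"):
    print(v, "->", split_local(v))
```

All three strings share the public version `1.12.1`; only the free-form label differs, which is why tools cannot reason about which variant fits a given machine.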
In order for broad support in the ecosystem (including natively in Poetry) to happen, standardization of ML APIs/ABIs is necessary as part of the wheel spec (or a successor).
I totally agree that there is no standard for this, but the problem is still present, so I hope some help could be given on Poetry's side through tools such as hooks and package freezing.
If you're willing to freeze versions there's no problem -- add the correct `pytorch.org` package index and you'll be locked to one API version.
Per #6409 performance leaves something to be desired as currently we emulate pip (+ the new resolver)'s behavior of checking every index exhaustively. However, you can do what you want today.
If you are talking about install-time selection of the proper variant, that is a full duplicate of the issue I linked. The best we'll be able to do until such time that we standardize markers in the ecosystem is adding some sort of hooks for custom markers -- but there's a lot of work necessary on the Plugin API before we can even think about such hooks.
Downloading 50 GB of packages is not an option for me, sadly, so I guess this leaves me with no solution.
It's very unclear what you want, I think. Are you asking for some way to add packages to a Poetry environment using `poetry install` that bypasses the normal resolution process? Until PyTorch indexes implement PEP 658 (and we gain support), we will have to download wheels for every platform as a result of how Python packaging works at a fundamental level.
Basically, if you don't want to solve ahead of time and have a universal `poetry install` that works everywhere, Poetry may not be the right tool for you.
If you're willing to accept that solving ahead of time requires downloading PyTorch wheels, `poetry export` + `ltt` may just work for you -- many projects (including poetry-core and certbot) make use of Poetry for management of a `requirements.txt` list.
@BlueskyFR I would try addressing this issue with the PyTorch team, since they are the ones doing non-standard things. I don't like the idea of Poetry, which is based on widely accepted standards, having to adapt to non-standard ways. The way I see it, they could have a simple wheel on PyPI that would provide a CLI for setting up a proper environment.
@neersighted sorry for being unclear. What I want is the following:
1. `poetry install` on the cloned repo, which doesn't have `torch` in its dependencies
2. `poetry run ltt install torch`, which installs the latest `torch` compatible with my local backend
3. `poetry add X`
Is it possible?
In the same spirit, what if I want to install a custom built PyTorch version?
You're really asking for a feature where you can inject 'fake' packages into Poetry's resolution, so that Poetry considers them satisfied and solved for.
I'd create a new feature request issue for that -- the basic idea is that you would specify something like:
    [tool.poetry]
    dependencies-external = ["pytorch"]
And Poetry would consider `pytorch: *` to be provided and act like it was locked/installed, while not in fact locking/installing it at all, and just trusting you, the end user, to install it correctly (e.g. using `ltt`) so your code can run.
Please note the above design is ad-hoc -- what the final design would look like, and if this would be accepted by the project at all would have to be hammered out on the FR issue you create, and/or on the PR defining the implementation.
> In the same spirit, what if I want to install a custom built PyTorch version?
You can do this today with URL dependencies and markers (but, as markers do not include any facility to discriminate based on ML API, this doesn't solve anything you couldn't do already with the pytorch indexes).
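The limitation on markers can be seen by listing what an environment marker is allowed to look at. The sketch below approximates the PEP 508 marker variables using only the standard library; the function name `marker_environment` is hypothetical, the values are an approximation of what installers compute, and `implementation_version` and `extra` are omitted for brevity.

```python
import os
import platform
import sys


def marker_environment() -> dict:
    """Approximate the PEP 508 marker variables for this interpreter.

    Illustrative only: installers compute these with their own code.
    `implementation_version` and `extra` are left out for brevity.
    """
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "python_full_version": platform.python_version(),
        "implementation_name": sys.implementation.name,
    }


for name, value in marker_environment().items():
    print(f"{name} = {value!r}")
```

Every variable describes the operating system or the interpreter. None of them describes a GPU, driver, or compute API, so a marker can select a wheel by platform but not by ML backend.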
Thanks for the feedback.
I don't think I have the time to write an FR and follow it at the moment, as this is likely to take weeks and I am looking for a quick solution.
So to wrap up, `ltt` is not compatible with Poetry in its current state, right?
Poetry is not designed to interoperate with other tools that manipulate packages in its dependency tree, no. Even if we add the feature I described above, it will always be a best-effort/"it happens to work" sort of thing. That is to say, you're taking a lot into your own hands, and if Poetry's incomplete solution ignoring a package breaks when combined with LTT's, that's on you to solve, since it's not reasonably a problem with either Poetry or LTT.
That's right, but I am just disappointed by the fact that there is no way to manage dependencies in a PyTorch project 😢
Sorry to hear that -- Poetry works fine for users who are able to ensure a consistent ML API situation across all their install targets. For Poetry to 'just work' across APIs and not require compromises like the proposed feature above, this is a topic for the PyPA, discuss.python.org, and a PEP defining ML APIs as a new wheel tag.
Is the issue with the secondary download URL being addressed? That would be the beginning of a solution.
That's purely cosmetic -- it's a consequence of how additional sources are designed, and if you run pip in verbose mode with `--extra-index-url` you will see it does the same thing (not setting `secondary` or `default` duplicates pip's `--extra-index-url`, and `default` duplicates `--index-url`; `secondary` is still unconditionally searched and is a Poetry invention).
https://github.com/python-poetry/poetry/pull/5984#issuecomment-1237245571 is a proposal to solve this by breaking the 'purely pip-like' semantics of non-PyPI sources.
Ok
Why does Poetry download the wheel file twice when specifying `{ url = "XXX" }`?
It is downloaded once during resolution and a second time to "upgrade" it by downloading the exact same file again:
If the wheel wasn't installed by Poetry, it may be missing a PEP 610 marker, aka `direct_url.json`. That is to say, Poetry will only consider it the same torch version and not reinstall it if the marker exists and matches the URL that Poetry was configured with.
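To check whether an installed distribution carries this marker, something like the following sketch can be run inside the project's environment. It uses only the standard library's `importlib.metadata`; the helper name `direct_url_of` is hypothetical, and this only inspects the metadata. It is not the comparison logic Poetry itself uses.

```python
import json
from importlib import metadata
from typing import Optional


def direct_url_of(dist_name: str) -> Optional[dict]:
    """Return the parsed PEP 610 direct_url.json for a distribution.

    Returns None when the distribution is not installed or when it has
    no direct_url.json (for example, it was installed from an index).
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    raw = dist.read_text("direct_url.json")
    return json.loads(raw) if raw else None


info = direct_url_of("torch")
print(info["url"] if info else "no direct_url.json recorded for torch")
```

If the printed URL is absent or differs from the `url` configured in `pyproject.toml`, that is consistent with the reinstall described above.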
This is getting fairly off topic and turning into more of a support discussion (and I think the original issue was more of a question than anything actionable anyway) -- I'm migrating this to Discussions as such.
Poetry version: 1.2.0
Python version: 3.10.6
OS version and name: Arch Linux
[X] I have searched the issues of this repo and believe that this is not a duplicate.
[X] I have consulted the FAQ and blog for any relevant entries or release notes.
Issue
Hi!
I do deep learning and am currently trying to switch to Poetry for better dependency management. I immediately encountered a problem when trying to install PyTorch:
In order to solve the first problem (even if official support from Poetry would be appreciated), I went with the famous light-the-torch, which automatically installs the right PyTorch version depending on the detected backend. The problem with this is that `torch` is not added to `pyproject.toml` afterwards, so if I do a subsequent `poetry install XXX`, the `torch` package has a high chance of being replaced by the `torch` from pip.

Also, specifying the PyTorch URL in the `.toml` is not a solution either, since it depends on the local backend. In a PyTorch project, the common factor is the PyTorch package version, not the backend on which it runs, as the latter just ensures the project can run on a variety of configurations.

This means that Poetry is not currently compatible with PyTorch. I don't think that saying that PyTorch is a special case is a good idea, since this release design just exposes the need to compile a package for each particular software stack, so it is really a general problem which must be solved IMO.
I'll be happy to discuss below how Poetry must adapt to this design!