SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
https://shi-labs.com/natten/

Installation with version management #159

Closed: picagrad closed this issue 1 month ago

picagrad commented 1 month ago

For a variety of reasons, I would love to be able to install natten "automatically" as part of an environment management setup (e.g. using the environment to build docker images, publishing code as part of a package, etc.).

The current installation "best practice" appears to be to pip install from a specific wheel based on hardware/torch version/etc.

Will a standard pip install work (albeit slowly)? Alternatively, can one simply include a requirement such as "natten<=0.14.5" and trust pip to eventually resolve the hardware/pytorch install/cuda dependencies and properly build the relevant version?

Thanks!

alihassanijr commented 1 month ago

Thanks for bringing this up. Three questions:

  1. Is this just to satisfy a dependency, or is natten actually used in the code you eventually intend to run?
  2. If the latter, is it only going to be run on CPU, or do you want CUDA support?
  3. If the latter, is it a legacy application that doesn't require the absolute best speed, or one that runs on a GPU architecture older than Pascal?

Depending on these answers, you might be able to just do a pip install and not have to wait long for compilation, or you might have to wait a while. When it comes to wheels, we unfortunately can't ship the CUDA wheels by default, because there's no mechanism that guarantees pip can figure out the CTK version your environment can support, and even if it could, you might have more than one choice.

Because of that, and our dependency on python, the cuda runtime, and libtorch, we're forced to build an excessive number of wheels, and I realize that it might be a bit of a pain for users to handle. However, given that NATTEN now ships with hundreds of kernel instances, I wouldn't recommend that users build it on their own, because even I have to use a big machine to build each binary in a reasonable amount of time.

As we go forward, we might consider JIT options, but unfortunately I can't predict when we might support them.

I realize there's a good chance I gave way more information than necessary, but I can definitely narrow things down if I know your use case.

picagrad commented 1 month ago

Hi, thanks for your answer.

In essence, what I am doing is preparing a model zoo of sorts for a specific type of application. As part of this model zoo, I am currently in the process of adding at least one model that uses natten internally.

As such, in order to run this model I need to include natten in my environment. This isn't an issue for me in development, since I just manually select the correct version based on my machine (or use a docker container when possible/necessary).

However, the end goal of the model zoo is to publish it to PyPI, to be used as a package in a similar (albeit much less extensive) way to, say, the transformers package from huggingface, or torchvision, etc.

Some of the other requirements of the package also require some custom installation procedures, and my workaround so far has been to include a custom script to help the user perform the custom installation if needed.

One thought I had is that you might be able to help with some sort of "version resolver". For example, on your website there are several tables helping the user select the correct wheel based on their cuda/pytorch/os versions. I was thinking that perhaps you could supply such a script/lookup table as part of the repo, so that a user could run something like:

from natten_setup import version_resolver  # hypothetical helper; or alternatively download a file from the repo
import subprocess
import sys
import torch

SHI_LAB_WHEELS = "https://shi-labs.com/natten/wheels"

torch_ver = torch.__version__
cuda_ver = torch.version.cuda  # CUDA version torch was built with (torch.cuda has no __version__)

wheel = version_resolver(torch_ver=torch_ver, cuda_ver=cuda_ver)  # e.g. 'natten==0.14.6+torch200cu118'
subprocess.check_call([sys.executable, '-m', 'pip', 'install', wheel, '-f', SHI_LAB_WHEELS])

Please don't worry about the small details of the install script; this example is in python and perhaps isn't the best approach, and the final code might just be a bash script of a similar nature. The main idea is that it would select the version based on something provided by you, either in the repo, in the package, or in any other way that is reliable and will continue to be updated.

What do you think?

alihassanijr commented 1 month ago

I see -- thank you for sharing these details.

I agree that the dependency with wheels will likely make your life harder, because it's first a dependency on PyTorch, and then a dependency on NATTEN that itself depends on which PyTorch version is installed.

Honestly, I really like your version resolver idea; it's probably the only way forward that can be nearly invisible to end users and not affect their experience. It's a little hacky, but so is pushing multi-arch binaries over pypi...

I think the only thing you'd need to mind, though, is cases where torch isn't already installed, or where either torch or natten is already installed.

If torch isn't already installed and you import it in your setup script, it won't find torch (even if you do pip install torch my_package, pip will unpack and set up the two separately and only install them into your environment after they're all done, so there is no guarantee of order here).

If torch is installed, then depending on the version, you might need to find certain older versions of NATTEN, because we can't support all previous torch versions for various reasons, and we try to keep each release limited to the most recent torch releases.

There's also the possibility of users already having NATTEN installed, especially an older version, which might be a pain because we changed the API around a little bit in the past year, so some interfaces might not work the same way anymore. That's less likely to be an issue, though, since we still keep a deprecated version of most of the old APIs.

But in general I like the idea, and I think it will definitely help your end users by resolving dependencies behind the scenes. It might require some extra work to handle different versions (i.e. using a dict to map torch versions to compatible NATTEN versions, maps for wheel urls, etc.), but I think it's a reasonable setup.
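
As a rough sketch, such a map could look something like this; the single entry shown is just the torch 2.4.0 / NATTEN 0.17.1 pairing that comes up later in this thread, and a real table would be populated from the wheel index:

# Hypothetical sketch of the data a resolver might keep. The lone entry is
# illustrative; a real table would list every supported torch release.
WHEEL_INDEX_URL = "https://shi-labs.com/natten/wheels/"

# newest NATTEN release known to ship wheels for a given torch release
TORCH_TO_NATTEN = {
    "2.4.0": "0.17.1",
}

def newest_natten_for(torch_version):
    base = torch_version.split("+")[0]  # e.g. "2.4.0+cu124" -> "2.4.0"
    return TORCH_TO_NATTEN.get(base)    # None means no known compatible wheel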

picagrad commented 1 month ago

Hi, I think some of these are issues that I am just going to have to trust the user to handle. My package will have a readme (and potentially also error handling) to help users deal with this.

Essentially, natten will not be in the requirements of my environment, but torch will be, so any user of my package will already have some version of torch installed.

If they already have natten in the same environment, it'll be on them to make sure that they have an appropriate version so everything works. If they don't, then they will (hopefully, assuming I write my code correctly haha) be instructed to install natten and pointed to the helper script provided by my package to do so.

What I was hoping you could help with is providing some sort of data structure that relates the torch and cuda versions to the correct wheel to be installed, and that will be maintained if anything changes. The reason I am not just writing one of those myself is that I suspect you might want to change the wheels/urls/etc. over time, and I would prefer to rely on a source that will (hopefully) be updated with any changes you make.

alihassanijr commented 1 month ago

I see; I think I understand your concern now.

I think it is safe to say that we won't change the wheel URL for the foreseeable future, and if that ever happens we'll likely set up redirects to prevent it from becoming an issue for anyone.

As for the versioning, unless PyTorch changes their system we very likely won't change ours, and if there ever comes a day when someone contributes bindings for some other package, we'll likely use a separate channel/package name for those and again try to maintain the current behavior. NATTEN's been on pypi for about two years now if I remember correctly, and the versioning and the way wheels are delivered have remained unchanged.

As long as users pick the version of NATTEN that matches their version of PyTorch, and the two are compatible, there should be no issues, and the standard "formula" is:

pip3 install natten==${NATTEN_VERSION}+torch${TORCH_VERSION_AND_TAG} -f https://shi-labs.com/natten/wheels/

Or

pip3 install natten==${NATTEN_VERSION}+torch${TORCH_VERSION}${TORCH_BACKEND_TAG} -f https://shi-labs.com/natten/wheels/

For example, for torch 2.4.0 tagged with cu124 (meaning it was compiled with CTK 12.4), installing NATTEN v0.17.1 would look like this:

pip3 install natten==0.17.1+torch240cu124 -f https://shi-labs.com/natten/wheels/

Obviously this will fail to even find the wheels if the NATTEN version is incompatible with the torch version, and doing a simple search of the wheels index will help resolve that.

However, the pip install can still succeed if the user is on a different version/tag of PyTorch, and lead to runtime issues like dlopen failing, etc.
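
As a rough illustration of that formula, a small helper could assemble the requirement string from the running torch install; the "cpu" fallback tag here is an assumption, and the wheel index remains the authority on which tags actually exist:

import torch

def natten_requirement(natten_version):
    # "2.4.0+cu124" -> "240": drop any local suffix, then join the release digits
    torch_tag = "".join(torch.__version__.split("+")[0].split(".")[:3])
    cuda = torch.version.cuda  # e.g. "12.4", or None on CPU-only builds
    backend_tag = "cu" + cuda.replace(".", "") if cuda else "cpu"  # the "cpu" tag is an assumption
    return f"natten=={natten_version}+torch{torch_tag}{backend_tag}"

# On torch 2.4.0 built with CUDA 12.4, natten_requirement("0.17.1") returns
# "natten==0.17.1+torch240cu124", matching the example above.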

picagrad commented 1 month ago

Ok, so to summarise: I can just rely on the wheels index to do a check based on the relevant versions of torch and cuda, and then select the newest version of natten that will work with them. So something like this (assuming I wrote the helper functions somewhere lol):

import subprocess
import sys
import torch
from my_url_utils import get_link_list, newest_appropriate  # my own (hypothetical) helpers

SHI_LAB_WHEELS = "https://shi-labs.com/natten/wheels"

torch_ver = torch.__version__
cuda_ver = torch.version.cuda

links = get_link_list(SHI_LAB_WHEELS)  # some format exposing link.cuda_version, link.torch_version and link.natten_version

# Find the newest natten version that matches the torch and cuda version
final_vers = newest_appropriate(links, cuda_ver, torch_ver)  # a string containing the appropriate version

subprocess.check_call([sys.executable, '-m', 'pip', 'install', final_vers, '-f', SHI_LAB_WHEELS])

There might be some kinks to work out regarding the torch installation tags, but I'm going to put those aside for now and cross that bug when I get to it...
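
For completeness, the hypothetical get_link_list helper imported above could start out as simple as scraping wheel links out of the index page; this sketch assumes a flat, find-links style HTML listing, which may not match how the pages are actually organized:

import re
import urllib.request

def get_link_list(index_url):
    # Fetch the index page and pull out anything linking to a .whl file.
    # If the index nests pages per torch/CTK tag, those links would need
    # to be followed as well.
    with urllib.request.urlopen(index_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return re.findall(r'href="([^"]+\.whl)"', html)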

alihassanijr commented 1 month ago

Yes, exactly. We typically try to support at least the 3 and at most the 5 most recent torch releases (feature releases like 2.3 and 2.4, not patch releases, since wheels should be compatible across patch updates), and we support all python versions and CTK tags that each of those torch releases supports. For example, torch added Python 3.12 support in either 2.2 or 2.3, so we added it then as well, and the same goes for the recent CTK 12.4 wheels.

The only real exception you might have to look out for is probably just the version of torch being too old for the version of NATTEN.
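
That case can also be caught up front rather than letting pip fail with a "no matching distribution" error; for example, using the hypothetical mapping sketched earlier in the thread:

import torch

natten_ver = newest_natten_for(torch.__version__)  # from the earlier sketch; None means no known wheel
if natten_ver is None:
    raise RuntimeError(
        "The installed torch release does not appear to be covered by the published "
        "NATTEN wheels; see https://shi-labs.com/natten/wheels/ for what is available."
    )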