Install Pytorch through pip instead of Conda

jllllll commented 1 year ago

Installation through Conda seems to be fairly unreliable. People have ended up with the CPU-only build of PyTorch far too many times, even when explicitly telling Conda to install the CUDA 11.7 build. There have also been instances where RWKV did not work with the Conda package.

Installing PyTorch through pip will lay some initial groundwork for basic AMD ROCm support in the future. Theoretically, it may also allow for out-of-the-box MPS (Metal) GPU acceleration in PyTorch on supported macOS versions.
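
As a rough sketch of why pip makes this groundwork easier: pip selects the backend through the wheel index URL, so an installer can pick the build explicitly. The helper name and the index suffixes below (cu117, rocm5.4.2, cpu) are assumptions for illustration, not what this PR ships; check https://pytorch.org/get-started/locally/ for the current values.

```python
import sys

# Assumed index suffixes; verify them against the PyTorch "Get Started" page.
WHEEL_INDEXES = {
    "cuda": "https://download.pytorch.org/whl/cu117",
    "rocm": "https://download.pytorch.org/whl/rocm5.4.2",
    "cpu": "https://download.pytorch.org/whl/cpu",
}


def pip_install_command(backend):
    """Return the pip argv that would install PyTorch for the given backend."""
    if backend not in WHEEL_INDEXES:
        raise ValueError(f"unknown backend: {backend!r}")
    return [
        sys.executable, "-m", "pip", "install",
        "torch", "torchvision", "torchaudio",
        "--index-url", WHEEL_INDEXES[backend],
    ]


if __name__ == "__main__":
    # Print only; pass the list to subprocess.run(..., check=True) to install.
    print(" ".join(pip_install_command("cuda")))
```

Because the backend is a single dictionary lookup, adding another one later would only need a new entry rather than a different Conda channel setup.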

I don't think abandoning Conda altogether is an option as it is used to install CUDA Toolkit, Git and Python for systems that don't have those already.

What are your thoughts on this?

Included changes:

- Install PyTorch through pip instead of Conda
- Fix for an issue building exllama in Linux/WSL
- Use 'cuda' Conda package instead of 'cuda-toolkit'
  - This conforms to the official CUDA Toolkit installation instructions.
- Upgrade Miniconda version to 23.3.1
- Commented out an old bitsandbytes fix for Linux that no longer seems to be necessary

oobabooga commented 1 year ago

I agree with these changes. PyTorch's official website sets pip as the default installation option, so I think that's what they want everyone to do: https://pytorch.org/get-started/locally/

> Use 'cuda' Conda package instead of 'cuda-toolkit'

Does that install nvcc?

> I don't think abandoning Conda altogether is an option as it is used to install CUDA Toolkit, Git and Python for systems that don't have those already.

That seems like a good option.

jllllll commented 1 year ago

@oobabooga Yes, it does include nvcc.

cuda is a meta-package that installs cuda-toolkit and cuda-runtime.

I'm pretty sure that cuda-runtime is a dependency of one of the packages in cuda-toolkit, so this should be functionally the same as what we were doing before. I included the change just in case it is fixing something I'm not aware of.

I do know that cuda-toolkit doesn't work when compiling cuda code in GitHub Actions. Only cuda works for that, so it is clearly a more complete installation in some way.
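
A minimal sketch of one way to confirm that the active environment really provides nvcc after installing the cuda package. It assumes the Conda environment is activated so that its bin directory is on PATH; the helper name is made up for illustration.

```python
import shutil
import subprocess


def nvcc_version():
    """Return the output of `nvcc --version`, or None if nvcc is unavailable."""
    path = shutil.which("nvcc")
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # The file exists on PATH but could not be executed.
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(nvcc_version() or "nvcc not found in this environment")
```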

oobabooga commented 1 year ago

Tested on Linux and Windows and everything worked.

danpetry commented 1 year ago

> People have ended up with the CPU-only build of PyTorch far too many times, even when explicitly telling Conda to install the CUDA 11.7 build.

It's possible that the solver installs the CPU version afterwards instead to satisfy compatibility with some other package. Are you able to point me towards any specific cases where this has happened, including what conda commands were used if possible?

jllllll commented 1 year ago

@danpetry I'm searching through the old issues for instances of this happening. There are a lot more cases of this that were reported and fixed in a Discord server, so this isn't everything, just what I could find here. Since this PR was merged, issues like this stopped being reported as far as I can tell.

https://github.com/oobabooga/text-generation-webui/issues/645
https://github.com/oobabooga/text-generation-webui/issues/2739
https://github.com/oobabooga/one-click-installers/issues/80
This was a weird one: https://github.com/oobabooga/text-generation-webui/issues/1969

There was also an issue reported on Discord in which xformers was unable to recognize the Conda PyTorch package and would only function with the pip package. This was only ever reported once, so it may have been an isolated case.

It's worth noting that these issues happen very inconsistently. It happened to me only once in the many times I have reinstalled the Conda packages, and the install worked correctly when I retried. These are the commands that have been used since I started contributing:

python=3.10.9 torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit conda-forge::ninja conda-forge::git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia
python=3.10.9 pytorch[version=2,build=py3.10_cuda11.7*] torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit ninja git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia
pytorch[version=2,build=py3.10_cuda11.7*] torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit ninja git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia

The first command is missing pytorch due to an issue oobabooga was having before I became involved. Note that, even when specifically targeting the CUDA build of the pytorch package (as the second and third commands do), the CPU-only version was still installed occasionally. It didn't seem to make any difference to the frequency of reports.
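
For anyone trying to reproduce this, a small sketch of how to tell which build ended up installed. It relies on torch.version.cuda being None for builds without CUDA and on torch.cuda.is_available(); the helper name and the messages are illustrative only.

```python
def describe_torch_build(cuda_version, cuda_available):
    """Classify an install from torch.version.cuda and torch.cuda.is_available()."""
    if cuda_version is None:
        # Note: ROCm builds also report None here (they set torch.version.hip);
        # this sketch does not distinguish them from CPU-only builds.
        return "CPU-only build (no CUDA compiled in)"
    if not cuda_available:
        return f"CUDA {cuda_version} build, but no usable GPU or driver detected"
    return f"CUDA {cuda_version} build with a working GPU"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(describe_torch_build(torch.version.cuda, torch.cuda.is_available()))
```

Running this right after the Conda step would show whether the solver swapped in the CPU-only package.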

danpetry commented 1 year ago

Ok, thanks for this info. We'll take it away and have a look at it.

atalman commented 1 year ago

@jllllll @danpetry For PyTorch installation with Conda, please use the commands from https://pytorch.org/get-started/locally/ or, for previous versions, from https://pytorch.org/get-started/previous-versions/

This is an example for Linux:

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

It should install the latest pytorch, torchvision and torchaudio with all required CUDA libraries. Please note that cuda-toolkit is not required to install PyTorch; the pytorch-cuda package contains all required CUDA dependencies: https://github.com/pytorch/builder/blob/main/conda/pytorch-cuda/meta.yaml

oobabooga / one-click-installers

Install Pytorch through pip instead of Conda #84