jllllll closed this 1 year ago
I agree with these changes. Pytorch's official website sets pip as the default installation option, so I think that's what they want everyone to do: https://pytorch.org/get-started/locally/
Use 'cuda' Conda package instead of 'cuda-toolkit'
Does that install `nvcc`?
I don't think abandoning Conda altogether is an option as it is used to install CUDA Toolkit, Git and Python for systems that don't have those already.
That seems like a good option.
@oobabooga Yes, it does include `nvcc`. `cuda` is a meta-package that installs `cuda-toolkit` and `cuda-runtime`. I'm pretty sure that `cuda-runtime` is a dependency of one of the packages in `cuda-toolkit`, so this should be functionally the same as what we were doing before. I included the change just in case it is fixing something I'm not aware of.
I do know that `cuda-toolkit` doesn't work when compiling CUDA code in GitHub Actions. Only `cuda` works for that, so it is clearly a more complete installation in some way.
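One way to confirm that an environment actually exposes `nvcc` after installing the `cuda` package is to look it up on `PATH`. The following is a minimal sketch, not part of the installer; the helper names `find_nvcc` and `nvcc_version_text` are hypothetical, and it assumes the Conda environment is already activated so that its `bin` (or `Scripts`) directory is on `PATH`.

```python
import shutil
import subprocess


def find_nvcc():
    """Return the full path to nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


def nvcc_version_text(nvcc_path):
    """Run `nvcc --version` and return whatever it printed to stdout."""
    result = subprocess.run(
        [nvcc_path, "--version"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is reported via empty output, not an exception
    )
    return result.stdout


if __name__ == "__main__":
    path = find_nvcc()
    if path is None:
        print("nvcc not found on PATH")
    else:
        print(f"nvcc found at {path}")
        print(nvcc_version_text(path))
```

Running this inside and outside the activated environment should show whether the compiler comes from Conda or from a system-wide CUDA install.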
Tested on Linux and Windows and everything worked.
People have ended up with the CPU-only build of Pytorch far too many times, even when explicitly telling Conda to install the CUDA 11.7 build.
It's possible that the solver installs the CPU version afterwards instead to satisfy compatibility with some other package. Are you able to point me towards any specific cases where this has happened, including what `conda` commands were used if possible?
@danpetry I'm searching through the old issues for instances of this happening. There are a lot more cases of this that were reported and fixed in a Discord server, so this isn't everything, just what I could find here. Since this PR was merged, issues like this stopped being reported as far as I can tell.
https://github.com/oobabooga/text-generation-webui/issues/645
https://github.com/oobabooga/text-generation-webui/issues/2739
https://github.com/oobabooga/one-click-installers/issues/80
This was a weird one: https://github.com/oobabooga/text-generation-webui/issues/1969
There was also an issue reported on Discord in which xformers was unable to recognize the Conda Pytorch package and would only function with the pip package. This was only ever reported once, so it may have been an isolated case.
It's worth noting that these issues happen very inconsistently. It only ever happened to me once in the many times I have reinstalled the Conda packages, and it worked correctly after retrying a second time. These are the commands that were used in the past since I started contributing:
python=3.10.9 torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit conda-forge::ninja conda-forge::git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia
python=3.10.9 pytorch[version=2,build=py3.10_cuda11.7*] torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit ninja git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia
pytorch[version=2,build=py3.10_cuda11.7*] torchvision torchaudio pytorch-cuda=11.7 cuda-toolkit ninja git -c pytorch -c nvidia/label/cuda-11.7.0 -c nvidia
The first command is missing `pytorch` due to an issue oobabooga was having before I became involved.
Note that, even with specifically targeting the cuda package, the CPU-only version was still installed occasionally. It didn't seem to make any difference to the frequency of reports.
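Since the CPU-only build can be installed silently, a quick check after installation helps catch it. The sketch below is an illustration only: `describe_torch_build` and `is_cpu_only` are hypothetical helper names, and the check relies on `torch.version.cuda` being `None` for builds compiled without CUDA (which also holds for ROCm builds, so treat a positive result as "no CUDA build" rather than strictly "CPU").

```python
import importlib


def describe_torch_build():
    """Collect basic facts about the installed PyTorch build, if any."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {
            "installed": False,
            "version": None,
            "cuda_build": None,
            "cuda_available": False,
        }
    return {
        "installed": True,
        "version": torch.__version__,
        # CUDA version the build was compiled against; None if built without CUDA.
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are present at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


def is_cpu_only(info):
    """True when PyTorch is installed but was not built against CUDA."""
    return bool(info["installed"] and info["cuda_build"] is None)


if __name__ == "__main__":
    info = describe_torch_build()
    print(info)
    if is_cpu_only(info):
        print("PyTorch was installed without CUDA support; reinstall the CUDA build.")
```

Separating the build check (`cuda_build`) from the runtime check (`cuda_available`) distinguishes a wrong package from a driver problem.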
Ok, thanks for this info. We'll take it away and have a look at it.
@jllllll @danpetry To install PyTorch with Conda, please use the commands from https://pytorch.org/get-started/locally/ or, for previous versions, from https://pytorch.org/get-started/previous-versions/
This is an example for Linux:
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
It should install the latest pytorch, torchvision and torchaudio with all required CUDA libraries. Please note that `cuda-toolkit` is not required to install PyTorch; the `pytorch-cuda` package contains all required CUDA dependencies: https://github.com/pytorch/builder/blob/main/conda/pytorch-cuda/meta.yaml
Installation through Conda seems to be fairly unreliable. People have ended up with the CPU-only build of Pytorch far too many times, even when explicitly telling Conda to install the CUDA 11.7 build. There have also been instances where RWKV will not work with the Conda package.
Installing Pytorch through pip will allow for some initial groundwork for basic AMD ROCm support in the future. Theoretically, it may also allow for out-of-the-box MPS (Metal) GPU acceleration in Pytorch on supported MacOS versions.
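As a rough illustration of why pip makes other backends easier to support, the pip command differs only in the wheel index it points at. This is a sketch under assumptions, not the installer's actual code: `pip_install_command` is a hypothetical helper, `cu117` and `cpu` are variant directories on the PyTorch wheel index, and the right variant name for a given ROCm release would have to be looked up on the PyTorch website.

```python
import sys

# Base URL of the PyTorch wheel index; the variant (e.g. "cu117", "cpu") is appended.
PYTORCH_WHEEL_INDEX = "https://download.pytorch.org/whl"


def pip_install_command(variant, packages=("torch", "torchvision", "torchaudio")):
    """Build the argument list for installing PyTorch wheels of one variant.

    Uses the current interpreter's pip so the packages land in the active
    environment (for example, the Conda environment that provides Python).
    """
    index_url = f"{PYTORCH_WHEEL_INDEX}/{variant}"
    return [sys.executable, "-m", "pip", "install", *packages, "--index-url", index_url]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(pip_install_command("cu117")))
```

The resulting list can be passed to `subprocess.run` once the variant has been chosen for the target system.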
I don't think abandoning Conda altogether is an option as it is used to install CUDA Toolkit, Git and Python for systems that don't have those already.
What are your thoughts on this?
Included changes: