void-linux / void-packages

The Void source packages collection
https://voidlinux.org

[Package request] Nvidia CUDA Kit #20077

Open ghost opened 4 years ago

ghost commented 4 years ago

I am looking to install the nvidia CUDA toolkit to do some CUDA C programming. Is that available?

daniel-eys commented 4 years ago

CUDA is currently not packaged.

ghost commented 3 years ago

Okay, so I found out how to install it. I downloaded the .run file from the website (the Ubuntu or Debian one, I think) and ran it like so:

    sudo sh *.run --silent --toolkit --samples --samplespath=/home/david/cuda-test/

The important flag is --silent: without it the installer throws an error about ncurses. I installed the toolkit since I think the driver part is just the regular nvidia driver (could be wrong on this), and I installed the samples with a path so I can look through them.

It will install silently in the background with logs at /var/log/cuda-install somewhere.

It installs CUDA under /usr/local/cuda-<version> (cuda-11.1 for me) and symlinks /usr/local/cuda to that directory.
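A quick sanity check once the installer finishes (a minimal sketch; the cuda-11.1 directory name matches the version above and will differ for other releases):

    ls -l /usr/local/cuda                # should be a symlink to /usr/local/cuda-11.1
    /usr/local/cuda/bin/nvcc --version   # prints the toolkit release if the install worked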

You can then add CUDA's bin directory to your PATH, or invoke nvcc by its full path. With the PATH updated it becomes: nvcc -o hello hello.cu

Without the PATH updated it becomes: sudo /usr/local/cuda/bin/nvcc -o hello hello.cu
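Putting that together, a minimal sketch of the PATH approach (assuming the default /usr/local/cuda symlink; hello.cu is whatever CUDA source you are testing with):

    export PATH=$PATH:/usr/local/cuda/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64   # for programs that link the shared CUDA libraries
    nvcc -o hello hello.cu
    ./hello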

Hope this helps people trying to install CUDA on Void. I am trying to play around with Python machine-learning libraries that need this to run, so there might be some more steps. I know TensorFlow wants very specific versions of CUDA, so I might have to uninstall this one and reinstall another version.

I figured a lot of this out by running the installer with the help flag (sh cuda_11.1.1_455.32.00_linux.run --help). Someone online was nice enough to point out that the installer wants ncurses for its graphical install, and since Nvidia does not provide packages for Void's binary package manager we are forced to use the --silent flag and install via the CLI (not that it matters, I do not really like graphical installs anyway).

If you are having trouble with this, another way of getting CUDA on void is to use the nvidia-cuda docker image (but I would assume you then need some configuring to get your GPU to play nice with it -- not sure on this).
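A hedged sketch of that docker route (assumes the docker and nvidia-container-toolkit packages are installed and configured so containers can see the GPU; the image tag is only an example):

    # run nvidia-smi inside an official CUDA image to confirm the GPU is visible
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi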

ghost commented 3 years ago

Unfortunately, using pyenv or similar tools seems iffy, as they cannot use pip due to an OpenSSL versus LibreSSL issue.

WormChickenWizard commented 3 years ago

> Okay, so I found out how to install it. I downloaded the .run file from the website for Ubuntu (or Debian I think) and ran it like so: sudo sh *.run --silent --toolkit --samples --samplespath=/home/david/cuda-test/

I did a fresh install of the glibc base version, installed the Nvidia driver from the nonfree repo, and after running the cuda ubuntu installer as demonstrated, it errored out for me.

It would be nice if cuda had its own package for simplicity's sake.

github-actions[bot] commented 2 years ago

Issues become stale 90 days after last activity and are closed 14 days after that. If this issue is still relevant bump it or assign it.

WormChickenWizard commented 2 years ago

Bump

soanvig commented 2 years ago

We have the nvidia driver in the repository, and we have hashcat; hashcat, however, requires CUDA. Just "bumping" this, but with some context at least.

jjc224 commented 2 years ago

Literally came here for the same reason as @soanvig. Would be great to have this.

Juggernaut-Coder commented 1 year ago

Is there any possibility of getting CUDA into the repository?

namgo commented 11 months ago

To anyone who stumbles across this or https://github.com/void-linux/void-packages/issues/31984, it might be helpful to look at what slackbuilds does (https://github.com/Ponce/slackbuilds/blob/f7519a395ed941f7ae6fa59051a1bcd770821b1a/development/cudatoolkit/cudatoolkit.SlackBuild), which seems to be to pass --noexec to the .run file and then unpack everything manually (roughly as sketched below).

I've got a Void x86_64-glibc machine that I would very much like to have CUDA on. I'm a little hesitant to try anything that might break it today, but if nobody gets to this in the next few days, I will.
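A sketch of that SlackBuild-style extraction (the flags are the standard makeself options the .run archive accepts; the target directory is just an example):

    # unpack the runfile without executing its embedded installer
    sh cuda_*.run --noexec --keep --target cuda-extract
    ls cuda-extract/    # inspect the unpacked components and merge them by hand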

mcneb10 commented 10 months ago

Also, to anyone installing this: the installer needs ncurses 5. I had to make version 5 symlinks from the version of ncurses I had installed (6). Other than that the driver works flawlessly so far. I had to install libtinfo and make symlinks for that as well.

mcneb10 commented 10 months ago

The installer (extracted from the installer package) dynamically loads libncursesw, libformw, and libtinfo, so one must make version 5 symlinks for those libs.
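A sketch of those compatibility symlinks, assuming the version 6 libraries live in /usr/lib (run as root; check the actual paths and sonames on your system first):

    ln -s /usr/lib/libncursesw.so.6 /usr/lib/libncursesw.so.5
    ln -s /usr/lib/libformw.so.6    /usr/lib/libformw.so.5
    ln -s /usr/lib/libtinfo.so.6    /usr/lib/libtinfo.so.5   # libtinfo comes from the separate package mentioned above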

mcneb10 commented 10 months ago

I also had to add myself to the video group to get OpenGL to work.
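For reference, that group change is just a one-liner (a sketch; run as root and log back in afterwards):

    usermod -aG video youruser   # replace youruser with your own user name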

namgo commented 9 months ago

If anyone's still struggling with this (as I am): when extracted, the cuda installer provides directories that can be merged into an "outdir". Setting the following environment variables at least points nvcc to the right place:

    NVCC=/opt/cuda-extract/outdir/bin/nvcc \
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cuda-extract/outdir/lib64/ \
    INCLUDES="-I /opt/cuda-extract/outdir/include/ " \
    PATH=$PATH:/opt/cuda-extract/outdir/bin/ \
    NVVMIR_LIBRARY_DIR=/opt/cuda-extract/outdir/nvvm/libdevice/ \
    CUDA_PATH=/opt/cuda-extract/outdir/targets/x86_64-linux/ \
    (command)

This isn't directly relevant to packaging, as I don't think we can legally redistribute extracted cuda binaries.

Edit: maybe we can?? https://archlinux.org/packages/extra/x86_64/cuda/

edit: I managed to get a python package to compile its cuda-required code successfully using the env vars above!

mcneb10 commented 9 months ago

Perhaps I could write and package a script that automatically downloads the driver and prompts the user to accept the license agreement.

namgo commented 9 months ago

@mcneb10 I'm asking on irc today, but given how archlinux packages cuda, it MIGHT be possible to distribute the binaries/libraries/headers as their own package.

One nice thing about the CUDA bins/libs/headers is that they're really just that. It appears that the driver that void-linux ships with is cuda-compatible. (note: this is not a well-tested statement and I'm questioning myself - see next comment).

I'll update when I hear back from others on how to proceed.

The EULA is confusing to me; on the other hand, we're legally able to distribute the SDK, but only if it's used by our "application". This makes me think I should contact one of the Arch Linux packagers to see what they do.

Meanwhile, I'll spend some time and figure out the least ugly hack to re-package CUDA under void's templating. Even if it ends up being a licensing nightmare, it'd be good to document how to do it.

namgo commented 9 months ago

This comment is less important than the former, but:

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id5 is an important reference for matching driver versions to cuda versions if we're bundling them separately. The Nvidia driver is at 535.129(.03?) as of Dec 5, 2023, which corresponds to CUDA 12.2. I had downloaded 12.3 and, running hashcat as a generally available quick test, was encountering:

    cuLinkAddData(): the provided PTX was compiled with an unsupported toolchain.

I think this is important for matching CUDA versions to drivers if we package it (the toolkit archive: https://developer.nvidia.com/cuda-toolkit-archive).
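A quick way to check which CUDA version the installed driver supports before picking a toolkit from the archive (a sketch; needs the proprietary driver's nvidia-smi tool):

    nvidia-smi --query-gpu=driver_version --format=csv,noheader   # e.g. 535.129.03
    nvidia-smi | head -n 4   # the banner also reports the maximum supported CUDA version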

We can also always package a nvidia-cuda-download-helper script if the cuda licensing becomes an issue.

Here's an example of how nvidia-cuda-download-helper might look: https://gist.github.com/namgo/9aeeee2025b341779be85e596822893f - I've tested the result with hashcat briefly but haven't tried building pytorch with it yet. If CUDA licensing is really an issue, a restricted package might be necessary then? Still waiting to hear back, but the above script seems to be a start.

atisharma commented 7 months ago

I would LOVE to have cuda packaged, but at the moment the cuda SDK requires gcc <= 12, and gcc 13 seems to be what's in the repos. How would that be handled?

namgo commented 6 months ago

@atisharma I didn't consider that, good point!

I think that a temporary solution for those desperate would be to install a non-system-wide gcc12 and a non-system-wide cuda, making use of environment variables?

So:
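A hedged sketch of that workaround (the install locations below are hypothetical; nvcc's -ccbin option is what selects the host compiler):

    # non-system-wide toolkit and compiler, wired together per invocation
    export PATH=$HOME/opt/cuda/bin:$PATH
    export LD_LIBRARY_PATH=$HOME/opt/cuda/lib64:$LD_LIBRARY_PATH
    nvcc -ccbin $HOME/opt/gcc12/bin/g++ -o hello hello.cu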

But that's not something that one can/should package, so I'm pretty stuck still.

jlhamilton777 commented 5 months ago

It looks like GCC 13.2 is supported now: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/#host-compiler-support-policy

atisharma commented 1 month ago

I found the easiest way to proceed is to use docker CUDA images and nvidia-container-toolkit.
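For anyone following that route, a sketch of the usual setup steps (assumes the docker and nvidia-container-toolkit packages are installed; nvidia-ctk ships with the toolkit and the commands need root):

    # register the nvidia runtime with docker, then restart the docker service (runit)
    nvidia-ctk runtime configure --runtime=docker
    sv restart docker

After that, containers started with --gpus all can see the GPU.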

davidSimek commented 1 week ago

If anyone wants to use it to run Stable Diffusion WebUI, I made this: Stable Diffusion Webui docker setup. If you need it for anything else, you can change the Dockerfile and make it do anything you need. It is inspired by the last comment from @atisharma.