knkski closed this issue 1 year ago.
Yeah, bitsandbytes is a hard library to package. If I understand correctly, it compiles for a specific "compute capability", like most CUDA software. Here are the logs from the much superior llama.cpp:

```
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6
```

My hunch is that we would probably need to compile bitsandbytes for the "compute capability" of the 1080 Ti, which is probably not 8.6 like mine.
@knkski as another suggestion, here's my `nvidia-smi` output; try matching the CUDA version and driver version and see what the runtime behavior is:
```
Sun Oct 22 14:32:32 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   44C    P8    23W / 350W |     21MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2168      G   ...xorg-server-1.20.14/bin/X       10MiB |
|    0   N/A  N/A      2277      G   ...hell-43.3/bin/gnome-shell        8MiB |
+-----------------------------------------------------------------------------+
```
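If it helps with the "match the versions" suggestion, here is a small illustrative Python helper (not part of any project mentioned here) that pulls the driver and CUDA versions out of an `nvidia-smi` header line so two machines' outputs can be compared programmatically:

```python
import re

# Illustrative only: extract driver and CUDA versions from the
# nvidia-smi header line for easy comparison between machines.
HEADER = "| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |"

def parse_versions(header: str) -> dict:
    driver = re.search(r"Driver Version:\s*([\d.]+)", header).group(1)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", header).group(1)
    return {"driver": driver, "cuda": cuda}

print(parse_versions(HEADER))  # {'driver': '525.89.02', 'cuda': '12.0'}
```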
The GTX 10 series is CC 6.1. Below CC 7.5, bitsandbytes switches to the "nocublaslt" variant of one of its native libraries, which we don't currently include. Modifying the preBuild for bitsandbytes should allow us to compile both variants of the library. The Makefile contains a bunch of different targets; we probably want to match this to our CUDA version.

https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/bitsandbytes/default.nix#L69-L74
https://github.com/TimDettmers/bitsandbytes/blob/18e827d666fa2b70a12d539ccedc17aa51b2c97c/Makefile#L86
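As a rough sketch of that selection rule (this is not bitsandbytes' actual loader code, and the exact file-naming pattern is an assumption based on its build outputs):

```python
# Illustrative sketch, NOT bitsandbytes' real code. Assumption: native
# builds are named libbitsandbytes_cuda<ver>.so and
# libbitsandbytes_cuda<ver>_nocublaslt.so; below CC 7.5 the CUBLASLt
# kernels are unavailable, so the nocublaslt build must be used.

def bnb_library_name(cuda_version: str, compute_capability: tuple) -> str:
    """Pick the native library variant for a given GPU."""
    suffix = "" if compute_capability >= (7, 5) else "_nocublaslt"
    return f"libbitsandbytes_cuda{cuda_version}{suffix}.so"

# GTX 1080 Ti is CC 6.1, RTX 3090 is CC 8.6:
print(bnb_library_name("118", (6, 1)))  # libbitsandbytes_cuda118_nocublaslt.so
print(bnb_library_name("118", (8, 6)))  # libbitsandbytes_cuda118.so
```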
@MatthewCroughan @max-privatevoid Thanks for the quick responses! I was able to get it going with a workaround following @max-privatevoid's suggestion. I copied the upstream bitsandbytes default.nix file to a new packages/bitsandbytes/default.nix file in this repo, then applied this diff:

```diff
-''make CUDA_VERSION=${cudaVersion} cuda${cudaMajorVersion}x''
+''make CUDA_VERSION=${cudaVersion} cuda${cudaMajorVersion}x_nomatmul''
```
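For context, the `_nomatmul` suffix in the substitution above selects a different Makefile target. A hypothetical Python sketch of how the target name is assembled from the CUDA major version (the `cuda<major>x` naming follows the nixpkgs expression linked earlier; the helper itself is illustrative):

```python
# Illustrative helper, not part of nixpkgs or bitsandbytes.
# nixpkgs runs `make CUDA_VERSION=<ver> cuda<major>x` by default; for
# cards below CC 7.5 the `_nomatmul` target is needed instead.

def bnb_make_target(cuda_major_version: str, needs_nomatmul: bool) -> str:
    target = f"cuda{cuda_major_version}x"
    return target + ("_nomatmul" if needs_nomatmul else "")

print(bnb_make_target("11", True))   # cuda11x_nomatmul
print(bnb_make_target("11", False))  # cuda11x
```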
It compiled and is running successfully :tada:
I don't have enough experience here to polish that up into a proper PR, but I can at least confirm that it works.
Also, if anyone hits issues with the above, I previously got it working (albeit quite slowly) by simply commenting out the bitsandbytes dependency. Everything ran fine, except that the models obviously couldn't be quantized.
Closing this issue since there's a workaround and the fix probably needs to happen in upstream nixpkgs, but feel free to reopen if that's incorrect.
Reopened for https://github.com/nixified-ai/flake/pull/57
First off, thanks for the great project.
I'm trying to run the text-generation-webui project and hitting a CUDA error. This is on an older card, a 1080 Ti, so possibly I just need to upgrade it? Here's what I'm trying to do and the error:
Here's the output from `nvidia-smi`:

If I run the bitsandbytes commands myself, the package is built and installs just fine, so I'm not sure why it's failing as part of the larger build.