yxy123 closed this issue 1 year ago
It's hard to tell what's causing the error without all the details, but it looks like you either don't have CuPy installed properly or don't have the necessary CUDA drivers.
Could you describe your setup?
Are you running this on WSL? Because I've seen this error before and I was able to fix it by installing the proper drivers.
Hi, I'm running on Linux, and the CuPy installation fails.
Conda version: conda 4.10.1
[base]# lspci | grep VGA
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH
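For readers checking their own machine, the `lspci` output above can be scanned for an NVIDIA device. This is a minimal sketch, not part of OpenChatKit: the function name `has_nvidia_gpu` is hypothetical, and it assumes NVIDIA cards appear in `lspci` under either the "VGA compatible controller" or "3D controller" class, which is why it runs plain `lspci` rather than filtering on VGA only.

```python
import subprocess

# PCI device classes under which GPUs are usually listed (an assumption of this sketch).
GPU_CLASSES = ("VGA compatible controller", "3D controller")


def has_nvidia_gpu(lspci_output: str) -> bool:
    """Return True if any GPU-class line in the lspci output mentions NVIDIA."""
    for line in lspci_output.splitlines():
        is_gpu_class = any(cls in line for cls in GPU_CLASSES)
        if is_gpu_class and "nvidia" in line.lower():
            return True
    return False


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["lspci"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("Could not run lspci on this machine.")
    else:
        if has_nvidia_gpu(result.stdout):
            print("An NVIDIA GPU was found.")
        else:
            print("No NVIDIA GPU was found; CUDA will not be usable.")
```

Applied to the line posted above, the check reports no NVIDIA GPU, since the only listed device is a Matrox controller.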
Did you follow the instructions at https://github.com/togethercomputer/OpenChatKit#requirements for installing CuPy and the other dependencies before trying to train the model?
You mentioned CPU support in the first comment; CPU-only can work for inference but not for training. The command you're running is for training the model, is that what you're trying to do?
Yes, I'm trying to train on CPU only. You mean CPU-only doesn't support training, right?
No, OCK does not support fine-tuning on CPUs. Adding support seems unnecessary too, because it would take a very, very long time to fine-tune a model on just a CPU.
You need GPUs to fine-tune the model. These are the requirements (source):
Model | Inference GPU memory | Fine-tuning GPU memory |
---|---|---|
GPT-NeoXT-Chat-Base-20B | 42 GB | 640 GB |
GPT-NeoXT-Chat-Base-20B-int8 | 21 GB | N/A |
Pythia-Chat-Base-7B-v0.16 | 18 GB | 256 GB |
Pythia-Chat-Base-7B-v0.16-int8 | 9 GB | N/A |
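
As a rough illustration of how to read the table, the sketch below encodes the figures and compares them with the total GPU memory available. The names `REQUIREMENTS_GB` and `fits` are hypothetical, and treating "N/A" as "not supported" is an assumption of this sketch rather than something the table states.

```python
from typing import Dict, Optional

# Memory requirements in GB, copied from the table above.
# None stands for "N/A" (assumed here to mean the task is not supported).
REQUIREMENTS_GB: Dict[str, Dict[str, Optional[int]]] = {
    "GPT-NeoXT-Chat-Base-20B": {"inference": 42, "fine-tuning": 640},
    "GPT-NeoXT-Chat-Base-20B-int8": {"inference": 21, "fine-tuning": None},
    "Pythia-Chat-Base-7B-v0.16": {"inference": 18, "fine-tuning": 256},
    "Pythia-Chat-Base-7B-v0.16-int8": {"inference": 9, "fine-tuning": None},
}


def fits(model: str, task: str, available_gb: float) -> bool:
    """Return True if the total available GPU memory meets the listed requirement."""
    required = REQUIREMENTS_GB[model][task]
    return required is not None and available_gb >= required


if __name__ == "__main__":
    # A CPU-only machine has 0 GB of GPU memory, so nothing in the table fits.
    print(fits("Pythia-Chat-Base-7B-v0.16", "inference", 0))
    # Eight GPUs with 80 GB each give 640 GB in total.
    print(fits("GPT-NeoXT-Chat-Base-20B", "fine-tuning", 8 * 80))
```

The comparison uses total memory summed across GPUs; how a given model is actually split across devices is outside the scope of this sketch.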
Got it, thanks for your support.
Running "bash training/finetune_GPT-NeoXT-Chat-Base-20B.sh" in OpenChatKit-cpu_support still seems to require CUDA. The detailed error log is below:
ImportError:
Failed to import CuPy.
If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.
On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm. On Windows, try setting CUDA_PATH environment variable.
Check the Installation Guide for details: https://docs.cupy.dev/en/latest/install.html
Original error: ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
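
The last line of the log says the dynamic loader could not find `libcuda.so.1`, the library that ships with the NVIDIA driver. A minimal way to reproduce that check without CuPy is to try loading the library directly with `ctypes`. The helper name `cuda_driver_available` is hypothetical, and a successful load only shows the library is present, not that a usable GPU is attached.

```python
import ctypes


def cuda_driver_available(lib_name: str = "libcuda.so.1") -> bool:
    """Return True if the given shared library can be loaded by the dynamic loader."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be opened,
        # the same condition CuPy reports in the log above.
        return False


if __name__ == "__main__":
    if cuda_driver_available():
        print("libcuda.so.1 loaded; an NVIDIA driver appears to be installed.")
    else:
        print("libcuda.so.1 not found; install the NVIDIA driver or check LD_LIBRARY_PATH.")
```

On a machine without an NVIDIA GPU and driver, such as the one described in this thread, the check is expected to fail, and setting LD_LIBRARY_PATH will not help because the library is not installed at all.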