instructlab / instructlab

InstructLab Command-Line Interface. Use this to chat with a model and execute the InstructLab workflow to train a model using custom taxonomy data.
https://instructlab.ai
Apache License 2.0

InstructLab Running Only on CPU, Not Utilizing GPU (Windows 11) #1313

Closed HZapperz closed 1 month ago

HZapperz commented 5 months ago

Discussed in https://github.com/instructlab/instructlab/discussions/1312

Originally posted by **HZapperz** June 8, 2024

Hi everyone, I'm trying to get InstructLab to utilize my GPU, but so far it's only using the CPU. Here are the details of my setup and the steps I've taken:

#### System Configuration:

- **OS:** Windows 11
- **CPU:** Intel i7-13700HX (16 cores, 24 logical processors)
- **GPU:** NVIDIA GeForce RTX 4060 Laptop GPU (8 GB RAM)
- **CUDA Version:** 12.5
- **cuDNN Version:** 9.2

#### Steps Taken:

1. **Installed CUDA and cuDNN:**
   - Downloaded and installed CUDA 12.5.
   - Downloaded and installed cuDNN 9.2, and copied the `bin`, `include`, and `lib` directories to the corresponding CUDA directories.
2. **Verified CUDA Installation:**
   - Ran `nvcc --version` to check the CUDA installation.
   - Added the CUDA paths to the system's environment variables.
3. **Installed PyTorch with CUDA Support:**
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
   ```
4. **Verified GPU Access in PyTorch:**
   - Created and ran a `check_cuda.py` script to verify that PyTorch can access the GPU:
   ```python
   import torch

   print("CUDA available:", torch.cuda.is_available())
   print("CUDA version:", torch.version.cuda)
   print("Torch version:", torch.__version__)
   print("GPU count:", torch.cuda.device_count())
   print("GPU name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "None")
   ```
   - The script confirmed that CUDA is available and the GPU is accessible.
5. **Updated InstructLab Configuration:**
   - Modified `config.yaml` to set `gpu_layers` for GPU utilization:
   ```yaml
   serve:
     gpu_layers: 20
     host_port: 127.0.0.1:8000
     max_ctx_size: 4096
     model_path: models/merlinite-7b-lab-Q4_K_M.gguf
   ```
6. **Served the Model:**
   - Activated the virtual environment and ran `ilab serve`:
   ```bash
   cd C:\Users\hamza\Desktop\MeowVC\instructlab
   .\venv\Scripts\Activate
   ilab serve
   ```
7. **Monitored GPU Utilization:**
   - Ran `nvidia-smi -l 5` to monitor GPU utilization.
Despite these efforts, the model only uses the CPU; GPU utilization remains at 0% according to `nvidia-smi`.

#### Additional Information:

- **NVIDIA Driver Version:** 555.85
- **PyTorch Version:** 2.3.1+cu121

I would greatly appreciate any guidance or suggestions to resolve this issue. Has anyone successfully run InstructLab with GPU utilization on a similar setup? Are there any specific configurations or steps I might be missing? Thank you in advance for your help!
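As a supplement to step 7, GPU utilization can also be polled programmatically instead of watching `nvidia-smi -l 5` by hand. This is a minimal sketch (not part of InstructLab) that shells out to `nvidia-smi` with its standard query flags and returns `None` when no NVIDIA driver is present:

```python
import subprocess

def gpu_utilization():
    """Return a list of per-GPU utilization percentages, or None if nvidia-smi is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        # One integer per GPU; a CPU-only inference run will show 0 here
        return [int(line) for line in out.stdout.split()]
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None

if __name__ == "__main__":
    print("GPU utilization:", gpu_utilization())
```

Running this in a loop while `ilab serve` answers a chat request makes it easy to see whether the GPU is ever touched.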
tiran commented 5 months ago

`ilab serve` uses llama-cpp-python for inference. You have to build and install llama-cpp-python with CUDA support.

sumanair commented 5 months ago

@HZapperz perhaps this will help?

```bash
pip cache remove llama_cpp_python
pip install --force-reinstall llama_cpp_python==0.2.75 -C cmake.args="-DLLAMA_CUBLAS=on"
pip install instructlab -C cmake.args="-DLLAMA_CUBLAS=on"
```

This is covered in https://developer.ibm.com/tutorials/awb-installing-instructlab-on-a-gaming-pc and also mentioned in https://developer.ibm.com/tutorials/awb-synth-train-contribute-instructlab-submission/
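After the reinstall, one way to confirm that the wheel was actually built with CUDA support is to ask llama-cpp-python directly. Recent versions of the low-level `llama_cpp` module bind llama.cpp's `llama_supports_gpu_offload()`; this hedged sketch (my own helper, not an InstructLab API) degrades gracefully when the package or the function is missing:

```python
def check_gpu_offload():
    """Report whether the installed llama-cpp-python build can offload layers to a GPU."""
    try:
        import llama_cpp
    except ImportError:
        return "llama-cpp-python is not installed"
    try:
        supported = llama_cpp.llama_supports_gpu_offload()
    except AttributeError:
        # Older releases do not expose this binding
        return "this llama-cpp-python version does not expose llama_supports_gpu_offload()"
    if supported:
        return "GPU offload supported"
    return "CPU-only build: reinstall with the CUDA cmake args"

if __name__ == "__main__":
    print(check_gpu_offload())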

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.

github-actions[bot] commented 1 month ago

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant!