N1h1lv5 opened this issue 1 year ago
If you use the `nvidia-smi` command, what is your VRAM usage?
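If it's easier to check from inside Python, here is a minimal sketch using PyTorch (which localGPT already depends on); `torch.cuda.mem_get_info` reports free and total device memory in bytes:

```python
import torch

# Minimal sketch: report VRAM usage for the first CUDA device.
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)  # (free, total) in bytes
    print(f"VRAM used: {(total - free) / 1024**2:.0f} MiB / {total / 1024**2:.0f} MiB")
else:
    print("No CUDA device visible to PyTorch.")
```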
Hi! While running it:
```
Wed Sep  6 22:39:05 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.13                 Driver Version: 537.13       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti   WDDM  | 00000000:01:00.0  On |                  N/A |
| 30%   42C    P2              46W / 160W |   7870MiB /  8188MiB |     98%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2884    C+G   ...72.0_x648wekyb3d8bbwe\GameBar.exe        N/A    |
|    0   N/A  N/A      3512    C+G   ...8wekyb3d8bbwe\WindowsTerminal.exe        N/A    |
|    0   N/A  N/A      7636    C+G   ...siveControlPanel\SystemSettings.exe      N/A    |
|    0   N/A  N/A      8560    C     ...\anaconda3\envs\localGPT\python.exe      N/A    |
|    0   N/A  N/A      9952    C+G   C:\Windows\explorer.exe                     N/A    |
|    0   N/A  N/A     10800    C+G   ...2txyewy\StartMenuExperienceHost.exe      N/A    |
|    0   N/A  N/A     11008    C+G   ...les (x86)\Battle.net\Battle.net.exe      N/A    |
|    0   N/A  N/A     12016    C+G   ...les\Microsoft OneDrive\OneDrive.exe      N/A    |
|    0   N/A  N/A     12176    C+G   ...t.LockApp_cw5n1h2txyewy\LockApp.exe      N/A    |
|    0   N/A  N/A     13348    C+G   ...GeForce Experience\NVIDIA Share.exe      N/A    |
|    0   N/A  N/A     13636    C+G   ...air\Corsair iCUE5 Software\iCUE.exe      N/A    |
|    0   N/A  N/A     14600    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A    |
|    0   N/A  N/A     15100    C+G   ...inaries\Win64\EpicGamesLauncher.exe      N/A    |
|    0   N/A  N/A     15384    C+G   C:\Program Files\NZXT CAM\NZXT CAM.exe      N/A    |
|    0   N/A  N/A     15672    C+G   ...ne\Binaries\Win64\EpicWebHelper.exe      N/A    |
|    0   N/A  N/A     16296    C+G   C:\Program Files\NZXT CAM\NZXT CAM.exe      N/A    |
|    0   N/A  N/A     18364    C+G   ...Programs\Microsoft VS Code\Code.exe      N/A    |
|    0   N/A  N/A     19188    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A    |
|    0   N/A  N/A     19720    C+G   ...nt.CBS_cw5n1h2txyewy\SearchHost.exe      N/A    |
|    0   N/A  N/A     20388    C+G   ...41.0_x64__zpdnekdrzrea0\Spotify.exe      N/A    |
|    0   N/A  N/A     23468    C+G   ...oogle\Chrome\Application\chrome.exe      N/A    |
+---------------------------------------------------------------------------------------+
```
I'm pretty sure something is interfering with this card, since other computers at my work run it well with more or less the same specs... They also give the CUDA message I posted above, but the model is still okay; I can see in Task Manager that the card is being used while generating text.
Can an oobabooga installation have an effect on this?
So I managed to fix it: first I reinstalled oobabooga with CUDA support (I don't know if it influenced localGPT), then completely reinstalled localGPT and its environment.
EDIT: I read somewhere that there is a memory-allocation problem with the newer NVIDIA drivers. I am currently on 537.13 but have to use 532.03 for it to work. The post I read said the 531 drivers were safe to use, but the oldest driver available for my 4060 Ti is 532.03, because the card was released after 531.
I'm running Docker on Windows to use a GPTQ model. The response is slow even though it is using a 12GB GPU. What could be the reason, and how can I handle it? Google Colab uses a 12GB GPU and it is fast. Model: Llama 2 7B Chat GPTQ.
Hi, have you managed to run this on Google Colab? If possible, can you please share the runtime details and the notebook? I am trying to run it in Colab on a T4 GPU with 12GB of CPU RAM and 15GB of GPU RAM, but it keeps crashing after I enter the prompt, with the following error:
```
Enter a query: how to elect american president
ggml_allocr_alloc: not enough space in the buffer (needed 143278592, largest block available 17334272)
GGML_ASSERT: ggml-alloc.c:139: !"not enough space in the buffer"
```
```
!pip install --upgrade tensorrt
!git clone https://github.com/PromtEngineer/localGPT.git
%cd localGPT
!pip install -r requirements.txt
!python ingest.py --device_type cuda
!python run_localGPT.py --device_type cuda
```
Thanks, but that doesn't work anymore on a T4 GPU. I tried upgrading to a better GPU on Colab Pro, but to no avail. 👎
In the `constants.py` file, change `MODEL_ID` to `TheBloke/Llama-2-7b-Chat-GPTQ` and `MODEL_BASENAME` to `model.safetensors`.
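For reference, the relevant lines in `constants.py` would look something like this (a minimal excerpt; the rest of the file depends on your localGPT version):

```python
# constants.py (excerpt): point localGPT at a quantized GPTQ model on the Hugging Face Hub.
# MODEL_ID is the Hub repo id; MODEL_BASENAME is the quantized weights file inside that repo.
MODEL_ID = "TheBloke/Llama-2-7b-Chat-GPTQ"
MODEL_BASENAME = "model.safetensors"
```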
So I ditched my RTX 4060 Ti and moved to an RTX 4070 (8GB vs. 12GB of VRAM).
I don't get any answer from this model, it just hangs: `MODEL_ID = "TheBloke/Llama-2-13B-GPTQ"`, `MODEL_BASENAME = "model.safetensors"`.
And this model:

```
MODEL_ID = "TheBloke/vicuna-7B-v1.5-GPTQ"
MODEL_BASENAME = "model.safetensors"
```

just gives a blank answer... Does anyone know what is happening?
So I can confirm the models stopped working only because I'm now using run_localGPT_v2.py; when I go back to run_localGPT.py, they work again. Something for you, @PromtEngineer? Thanks for the effort.
@N1h1lv5 I hope with this new update, the issue is solved. Can you please confirm?
The new `run_localGPT.py` is working, but some models still give empty answers, as you know.
I tried this Dockerfile with CUDA 11.7 and am observing an error:
@PromtEngineer, any suggestion would be highly appreciated. Thanks in advance.
I don't know if anyone has tried it, but if you use GPTQ, there was a warning that says to remove the temperature parameter. So I tried removing it, and everything works great.
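For anyone wondering what this looks like in code, here is a minimal sketch of the idea with a Hugging Face `pipeline` (using `gpt2` as a small stand-in model; localGPT loads the GPTQ model at the equivalent point in its own code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Small stand-in model; substitute the loaded GPTQ model/tokenizer in localGPT.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=64,
    # temperature=0.2,  # passing temperature with greedy decoding (do_sample=False)
    #                   # triggers the warning; dropping it (or setting do_sample=True)
    #                   # makes it go away
)
print(pipe("Hello")[0]["generated_text"])
```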
Hi all! The model is working great! I am trying to use my 8GB 4060 Ti with `MODEL_ID = "TheBloke/vicuna-7B-v1.5-GPTQ"` and `MODEL_BASENAME = "model.safetensors"`.
I changed the GPU today, the previous one was old.
But it takes a few minutes to get a result... However, I notice I'm now getting these messages while running the model:
Can someone tell me what is going on?