Closed anfedoro closed 2 months ago
It's probably loading the model from disk. If you don't like this behavior, you can try --no-mmap.
Thanks, it works. But even so, why does it load faster to VRAM (~10 sec) than when it first maps to RAM (200 sec)? It reads from the same disk.
Technically, the guide says that loading with --no-mmap can be slow, but in my case it is the opposite.
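To compare the two loading strategies discussed here outside of llama.cpp, a minimal Python sketch follows. It is only an illustration, not how llama.cpp itself loads models: `read_with_mmap` maps a file and touches one byte per page, so every page is faulted in on demand, while `read_with_buffer` reads the file sequentially in large chunks, which is roughly what --no-mmap implies. The function names, the 1 MiB chunk size, and the temporary test file are all assumptions made for the example. On a file that is already in the OS page cache both paths will be fast, so a meaningful comparison needs a cold cache and a file on the same disk as the model.

```python
import mmap
import os
import tempfile
import time


def read_with_mmap(path, page_size=mmap.PAGESIZE):
    """Map the file read-only and touch one byte per page.

    Each touch of a page that is not yet resident triggers a page fault,
    so the file is pulled in piece by piece. Returns the sum of the touched bytes.
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        total = 0
        for offset in range(0, len(m), page_size):
            total += m[offset]
        return total


def read_with_buffer(path, chunk_size=1 << 20):
    """Read the file sequentially in large chunks. Returns the number of bytes read."""
    n = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            n += len(chunk)
    return n


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical stand-in for a model file; replace with a real path to test a cold read.
    fd, path = tempfile.mkstemp()
    os.write(fd, b"\x01" * (64 * mmap.PAGESIZE))
    os.close(fd)
    try:
        for fn in (read_with_mmap, read_with_buffer):
            _, seconds = timed(fn, path)
            print(f"{fn.__name__}: {seconds:.6f} s")
    finally:
        os.remove(path)
```

If the page-by-page path turns out to be much slower than the sequential one on a given setup, that would be consistent with the behaviour reported above, though it does not by itself explain why.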
This issue was closed because it has been inactive for 14 days since being marked as stale.
What happened?
When ./main -m model.gguf -ngl 33 starts, it sees the GPU and prints all the output about the model, but then STOPS at some point for about 2-3 min. This is from the log. I marked the line; you can see that the timestamp difference is about 200 sec:
[1717529683] llm_load_tensors: ggml ctx size = 0.30 MiB
----------------------------------- Here is a delay of about 2-3 min ------------------
[1717529875] llm_load_tensors: offloading 32 repeating layers to GPU
Then it quickly loads the model to VRAM, prints the rest of the information, and then either stays interactive or generates output, in accordance with the other CLI parameters.
What could be the issue? During the delay it actually does nothing: neither CPU nor GPU load increases.
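The length of the stall can be computed directly from the two bracketed values in the log excerpt above. The sketch below assumes they are Unix timestamps in seconds (the log does not state this, but it matches the reported delay); `stamp_gaps` is a hypothetical helper written for this example, not part of llama.cpp.

```python
import re

# The two log lines quoted in the report, unchanged.
LOG = """\
[1717529683] llm_load_tensors: ggml ctx size = 0.30 MiB
[1717529875] llm_load_tensors: offloading 32 repeating layers to GPU
"""


def stamp_gaps(log_text):
    """Return (gap_seconds, message) for each pair of consecutive timestamped lines.

    Assumes every relevant line starts with "[<integer seconds>]".
    """
    stamps = [
        (int(m.group(1)), m.group(2))
        for m in re.finditer(r"^\[(\d+)\]\s*(.*)$", log_text, re.M)
    ]
    return [(t2 - t1, msg2) for (t1, _), (t2, msg2) in zip(stamps, stamps[1:])]


for gap, msg in stamp_gaps(LOG):
    # For the excerpt above the gap is 192 seconds.
    print(f"{gap} s before: {msg}")
```

Running the same helper over a full log would show whether this is the only long pause during loading.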
nvidia-smi
Tue Jun  4 22:44:58 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 552.22       CUDA Version: 12.4     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A2000               On  | 00000000:01:00.0 Off |                  Off |
| 30%   43C    P5              14W /  70W |    808MiB /  6138MiB |     10%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
Name and Version
version: 3083 (adc9ff38) built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Other? (Please let us know in description)
Relevant log output