Open 0wwafa opened 3 days ago
model available at: https://huggingface.co/ZeroWw/Llama-3-8B-Instruct-Gradient-1048k-GGUF
You are loading the model with its default context setting of 1048576. This takes up A LOT of RAM (~128GB just for the context, plus the model itself) and will take a long time. You are also running it on the CPU only...
The logs indicate no errors, and you manually stopped the model with Ctrl+C, so it was probably still loading.
It takes over a minute to load the model with just 300k context for me.
Try it with a lower context setting, or wait longer.
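For reference, here is a rough sketch of where the ~128GB figure comes from. It assumes the usual Llama-3-8B shape (32 layers, 8 KV heads, head dimension 128) and an f16 KV cache; those numbers and the helper name `kv_cache_bytes` are my assumptions for illustration, not something read from the logs, and the real allocation will include some extra overhead.

```python
# Back-of-the-envelope KV cache size estimate (a sketch, not llama.cpp's exact accounting).
# Assumed architecture values for Llama-3-8B: 32 layers, 8 KV heads (GQA), head_dim 128.
# Assumed cache element type: f16 (2 bytes per value).

def kv_cache_bytes(n_ctx, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Estimate the bytes needed to hold keys and values for n_ctx tokens."""
    # Each token stores one K and one V vector (hence the factor 2)
    # of n_kv_heads * head_dim elements in every layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return n_ctx * per_token

GIB = 1024 ** 3

if __name__ == "__main__":
    for n_ctx in (8192, 300_000, 1_048_576):
        print(f"{n_ctx:>9} tokens -> {kv_cache_bytes(n_ctx) / GIB:.1f} GiB")
```

Under these assumptions the cache costs 128 KiB per token, so the full 1048576-token context comes to about 128 GiB before the model weights are counted, while 8192 tokens needs only about 1 GiB. The context size can be lowered with llama.cpp's `-c` / `--ctx-size` option.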
What happened?
I expected the model to work.
Name and Version
version: 3222 (48e6b92c) built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output