Open juud79 opened 1 month ago
You need to point to the file (xxxx.gguf), not the directory containing the file.
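For example, the command from the report would become something like the following (xxxx.gguf is a placeholder; substitute the actual filename inside that directory):

```shell
# Point --model at the .gguf file itself, not its parent directory.
# "xxxx.gguf" is a placeholder for the real filename.
python -m aphrodite.endpoints.openai.api_server \
    --model /root/.cache/huggingface/hub/gguf/xxxx.gguf \
    --quantization gguf \
    --gpu-memory-utilization 0.35 \
    --max-model-len 4096 \
    --port 8000
```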
My model consists of 2 GGUF files. How can I do that?
Sharded GGUF (a model split across multiple files) is not currently supported. #420 adds support, but we need to fix something else related to GGUFs first.
Experimental support for multiple GGUF files has been added to the dev branch; please test whether it works as described in the documentation.
Your current environment
I executed the command below:
python -m aphrodite.endpoints.openai.api_server --model /root/.cache/huggingface/hub/gguf/ --quantization gguf --gpu-memory-utilization 0.35 --max-model-len 4096 --port 8000
It fails with this error: OSError: /root/.cache/huggingface/hub/gguf/ does not appear to have a file named config.json. Checkout 'https://huggingface.co//root/.cache/huggingface/hub/gguf//tree/None' for available files.
Why does it ask for config.json? As far as I know, the GGUF format doesn't have a config.json...