PygmalionAI / aphrodite-engine

PygmalionAI's large-scale inference engine
https://pygmalion.chat
GNU Affero General Public License v3.0

[Bug]: gguf loading failed. config.json? #417

Open juud79 opened 1 month ago

juud79 commented 1 month ago

Your current environment

I executed the following command:

python -m aphrodite.endpoints.openai.api_server --model /root/.cache/huggingface/hub/gguf/ --quantization gguf --gpu-memory-utilization 0.35 --max-model-len 4096 --port 8000


🐛 Describe the bug


It fails with the following error:

OSError: /root/.cache/huggingface/hub/gguf/ does not appear to have a file named config.json. Checkout 'https://huggingface.co//root/.cache/huggingface/hub/gguf//tree/None' for available files.

Why does it require a config.json? As far as I know, the GGUF format doesn't have a config.json...

sgsdxzy commented 1 month ago

You need to point to the file (xxxx.gguf), not the directory containing the file.
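For example, keeping all the other flags from the report and assuming the quantized weights are in a single file (the filename model.gguf here is just a placeholder), the invocation would become:

python -m aphrodite.endpoints.openai.api_server --model /root/.cache/huggingface/hub/gguf/model.gguf --quantization gguf --gpu-memory-utilization 0.35 --max-model-len 4096 --port 8000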

juud79 commented 1 month ago

You need to point to the file (xxxx.gguf), not the directory containing the file.

The model consists of 2 GGUF files... how can I do that?

sgsdxzy commented 3 weeks ago

Sharded GGUF (a model split across multiple files) is not currently supported. #420 adds support, but we need to fix something else related to GGUFs first.
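A possible interim workaround (an assumption here, not something confirmed in this thread) is to merge the shards into a single file with llama.cpp's gguf-split tool and point --model at the merged file; the filenames below are placeholders:

./gguf-split --merge model-00001-of-00002.gguf model-merged.gguf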

sgsdxzy commented 2 weeks ago

Experimental support for multiple GGUF files has been added to the dev branch; please test whether it works as described in the documentation.
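As a sketch of what that test might look like, assuming (not confirmed here, check the documentation) that the dev branch expects --model to point at the first shard of the split model:

python -m aphrodite.endpoints.openai.api_server --model /root/.cache/huggingface/hub/gguf/model-00001-of-00002.gguf --quantization gguf --gpu-memory-utilization 0.35 --max-model-len 4096 --port 8000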