huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

How to use a model checkpoint from a local folder? #2302

Open zk19971101 opened 1 month ago

zk19971101 commented 1 month ago

System Info

Image: ghcr.io/huggingface/text-generation-inference:2.0.4
Platform: Windows 10
Docker version: 27.0.3
LLM model: lllyasviel/omost-llama-3-8b-4bits
CUDA: 12.3
GPU: NVIDIA RTX A6000

Information

Tasks

Reproduction

C:\Users\Administrator>docker run --gpus all -p 8080:80 -v ./data:/data ghcr.io/huggingface/text-generation-inference:2.0.4 --model-id "F:\Omost-main\checkpoints\models--lllyasviel--omost-llama-3-8b-4bits" --max-total-tokens 9216 --cuda-memory-fraction 0.8
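A common pattern for serving a local checkpoint with TGI is to mount the directory that *contains* the `models--org--name` cache folder as the container's `/data` (the image's default Hugging Face cache location) and then pass the plain repo id, rather than passing the cache folder name itself as `--model-id`. A minimal sketch for the setup above, assuming `F:\Omost-main\checkpoints` follows the standard `huggingface_hub` cache layout (`models--lllyasviel--omost-llama-3-8b-4bits\snapshots\...`):

```shell
:: Mount the folder that CONTAINS models--lllyasviel--omost-llama-3-8b-4bits
:: as /data inside the container, then pass the plain repo id; TGI should
:: resolve the weights from the mounted cache instead of downloading them.
:: (^ is the cmd.exe line-continuation character.)
docker run --gpus all -p 8080:80 ^
    -v F:\Omost-main\checkpoints:/data ^
    ghcr.io/huggingface/text-generation-inference:2.0.4 ^
    --model-id lllyasviel/omost-llama-3-8b-4bits ^
    --max-total-tokens 9216 --cuda-memory-fraction 0.8
```

If the snapshot does not follow the hub cache layout (i.e. it is a flat directory of `config.json` and weight files), mounting that directory and passing its in-container path, e.g. `--model-id /data/my-model`, is the usual alternative.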

Expected behavior

Even though I set `--model-id` to the local checkpoint path, Docker raises an error. [Attached: WeChat Work screenshot of the error, 2024-07-25]

danieldk commented 1 month ago

Did you try to remove the double dashes in the model name models--lllyasviel--omost-llama-3-8b-4bits as suggested in the error?

github-actions[bot] commented 2 weeks ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.