mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many other model architectures. Features: text, audio, video and image generation, voice cloning, distributed inference
https://localai.io
MIT License

Can't download model file from huggingface using link in model.yaml file #3100

Closed: JackBekket closed this issue 1 month ago

JackBekket commented 1 month ago

LocalAI version: gpu-aio-cuda-12 (latest)

Environment, CPU architecture, OS, and Version: Ubuntu 22.04

Describe the bug: Can't download models from huggingface

To Reproduce: add a model .yaml file:

name: qwen14b   # understands English, Russian, Chinese
context_size: 32000  # max is 32k context, but I ran out of memory on my machine when using it in full
f16: false # true for GPU acceleration
cuda: false # true for GPU acceleration
gpu_layers: 0 # 0 is CPU only
parameters:
  model: huggingface://Qwen/Qwen1.5-14B-Chat-GGUF/blob/main/qwen1_5-14b-chat-q5_0.gguf
stopwords:
- "HUMAN:"
cutstrings:
- "<|im_end|>"
template:

  chat: &template |
    Below is an instruction that describes a task. Write a response that appropriately completes the request.
    Instruction: {{.Input}}
    Response:
  # Modify the prompt template here ^^^ as per your requirements
  completion: *template

Expected behavior: Download models from huggingface correctly

Logs

3:36PM INF Downloading "https://huggingface.co/Qwen/Qwen1.5-14B-Chat-GGUF/resolve/main/blob"
3:36PM ERR error downloading models error="failed to download url \"https://huggingface.co/Qwen/Qwen1.5-14B-Chat-GGUF/resolve/main/blob\", invalid status code 404"

Additional context: the actual direct link to the model file is https://huggingface.co/Qwen/Qwen1.5-14B-Chat-GGUF/resolve/main/qwen1_5-14b-chat-q5_0.gguf

I've also tried pasting the direct link to the model into the config, but that also fails.

I've also tried download_files: [link], but that fails as well because it can't unmarshal it.

JackBekket commented 1 month ago

nvm, if I put the huggingface link like this: huggingface://Qwen/Qwen1.5-14B-Chat-GGUF/qwen1_5-14b-chat-q5_0.gguf (i.e. without the blob/main segment), then it downloads correctly.
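For reference, the working URI appears to follow the pattern huggingface://<owner>/<repo>/<filename>, which the log above shows being translated into a https://huggingface.co/<owner>/<repo>/resolve/main/<filename> download URL. Below is a small standalone check, using the huggingface_hub Python library rather than LocalAI's own downloader, that the repo/filename pair from the working URI actually resolves on the Hub; the script itself is just an illustration, not part of this repo.

# Not LocalAI's downloader -- a quick standalone check with the huggingface_hub
# library that the repo/filename pair from the working URI resolves on the Hub.
from huggingface_hub import hf_hub_download

repo_id = "Qwen/Qwen1.5-14B-Chat-GGUF"
filename = "qwen1_5-14b-chat-q5_0.gguf"

# Direct download URL, matching the link from "Additional context" above.
print(f"https://huggingface.co/{repo_id}/resolve/main/{filename}")

# Fetches the file into the local Hugging Face cache and returns its path.
print(hf_hub_download(repo_id=repo_id, filename=filename))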

mcd01 commented 1 week ago

Apologies for reviving this issue, but I am facing a similar problem and was hoping that you or someone else could provide additional insights.

I am working through the docs on GPT and trying to get it working with the transformers backend. Specifically, I am testing the following snippet from the docs:

name: transformers
backend: transformers
parameters:
    model: "facebook/opt-125m"
type: AutoModelForCausalLM
quantization: bnb_4bit

I also made sure to enable the respective backend and I can see that it is prepared on startup:

EXTRA_BACKENDS: backend/python/transformers
Preparing backend: backend/python/transformers
make: Entering directory '/build/backend/python/transformers'
bash install.sh
Initializing libbackend for transformers
virtualenv activated
activated virtualenv has been ensured
starting requirements install for /build/backend/python/transformers/requirements.txt
Audited 4 packages in 73ms
finished requirements install for /build/backend/python/transformers/requirements.txt
make: Leaving directory '/build/backend/python/transformers'

However, when trying to interact with the model, e.g., via the completions API endpoint, I am getting the following response:

{
  "error": {
    "code": 500,
    "message": "could not load model (no success): Unexpected err=OSError(\"Can't load the model for 'facebook/opt-125m'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/opt-125m' is the correct path to a directory containing a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.\"), type(err)=<class 'OSError'>",
    "type": ""
  }
}

This is somewhat irritating, as I am following the docs precisely. I also tried specifying the model path in a different way, as suggested in this issue, like huggingface://facebook/opt-125m/tf_model.h5, which successfully downloads that single file but then complains about an invalid model identifier:

{
  "error": {
    "code": 500,
    "message": "could not load model (no success): Unexpected err=OSError(\"tf_model.h5 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\\nIf this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`\"), type(err)=<class 'OSError'>",
    "type": ""
  }
}
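For what it's worth, that second OSError is the standard message Hugging Face transformers' from_pretrained() raises when given something that is neither a Hub repo id nor a local model directory, so a single file such as tf_model.h5 will always be rejected. Below is a minimal standalone sketch of the call that a config with type: AutoModelForCausalLM and quantization: bnb_4bit presumably boils down to; this is an assumption for debugging purposes, not LocalAI's actual backend code.

# A sketch only: assumes the transformers backend ultimately calls
# AutoModelForCausalLM.from_pretrained() with the configured model id and a
# bitsandbytes 4-bit config. If this runs standalone inside the container,
# the model id itself is fine and the problem is on the loading path
# (e.g. a local directory shadowing "facebook/opt-125m", as the first
# OSError suggests).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-125m"
quant_config = BitsAndBytesConfig(load_in_4bit=True)  # rough equivalent of bnb_4bit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # requires the accelerate package; an assumption, not from the config above
)
print(model.config.model_type)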

LocalAI version: v2.20.1
Container image: localai/localai:latest-gpu-nvidia-cuda-12

Any help is appreciated! I am quite sure it is just a small thing on my end, but I have already tried quite a few combinations and none of them worked.