hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.

Llama 2 models not working - how to pass auth token? #232

Status: Open · arsaboo opened this issue 11 months ago

arsaboo commented 11 months ago

I am trying to run the Llama 2 models. Here are the command and the logs:

sudo docker run -p 80:80 -e MODEL=meta-llama/Llama-2-7b-hf hyperonym/basaran:0.19.0
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.8/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 1195, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py", line 1541, in get_hf_file_metadata
    hf_raise_for_status(r)
  File "/usr/local/lib/python3.8/dist-packages/huggingface_hub/utils/_errors.py", line 291, in hf_raise_for_status
    raise RepositoryNotFoundError(message, response) from e
huggingface_hub.utils._errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-64ba9aef-0847ce6e5dbe16fd46aae799)

Repository Not Found for url: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/app/basaran/__main__.py", line 41, in <module>
    stream_model = load_model(
  File "/app/basaran/model.py", line 319, in load_model
    tokenizer = AutoTokenizer.from_pretrained(name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/tokenization_auto.py", line 643, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/tokenization_auto.py", line 487, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/hub.py", line 433, in cached_file
    raise EnvironmentError(
OSError: meta-llama/Llama-2-7b-hf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.

I have been granted access to those models by both Meta and HF, and I am logged in using huggingface-cli:

$ /home/arsaboo/.local/bin/huggingface-cli whoami
arsaboo
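Note that logging in with huggingface-cli on the host only writes a token to the host filesystem; the Docker container cannot see it. A minimal sketch of passing the token into the container explicitly, assuming the huggingface_hub version inside the image honors the HUGGING_FACE_HUB_TOKEN environment variable (hf_xxxxxxxx is a placeholder for your actual access token):

# pass the Hugging Face token as an environment variable so the
# container can download the gated meta-llama repo
sudo docker run -p 80:80 \
  -e MODEL=meta-llama/Llama-2-7b-hf \
  -e HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxx \
  hyperonym/basaran:0.19.0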
KastanDay commented 9 months ago

This one works for me. Make sure you fill out the "Model Request form" on Facebook's Llama page. You MUST be approved by Facebook before they let you use this model.

MODEL=meta-llama/Llama-2-7b-chat-hf
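An alternative sketch that combines this model with the host's existing credentials is to mount the Hugging Face cache into the container. This assumes the image keeps its cache under /root/.cache/huggingface and that huggingface-cli login wrote the token to ~/.cache/huggingface/token on the host; both paths depend on the huggingface_hub version:

# reuse the host's Hugging Face cache (and token, if stored there)
# inside the container instead of passing the token explicitly
sudo docker run -p 80:80 \
  -e MODEL=meta-llama/Llama-2-7b-chat-hf \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  hyperonym/basaran:0.19.0

Mounting the cache also lets the container reuse model weights already downloaded on the host instead of fetching them again.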