How to authenticate to huggingface.co, from the run_localGPT.py script, using Docker?

$ docker run -it --mount src="$HOME/.cache",target=/root/.cache,type=bind  localgpt

==========
== CUDA ==
==========

CUDA Version 11.7.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

WARNING: The NVIDIA Driver was not detected.  GPU functionality will not be available.
   Use the NVIDIA Container Toolkit to start this container with GPU support; see
   https://docs.nvidia.com/datacenter/cloud-native/ .

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
CUDA extension not installed.
CUDA extension not installed.
2024-05-16 23:04:49,632 - INFO - run_localGPT.py:244 - Running on: cuda
2024-05-16 23:04:49,632 - INFO - run_localGPT.py:245 - Display Source Documents set to: False
2024-05-16 23:04:49,632 - INFO - run_localGPT.py:246 - Use history set to: False
2024-05-16 23:04:49,874 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length  512
2024-05-16 23:04:51,217 - INFO - run_localGPT.py:132 - Loaded embeddings from hkunlp/instructor-large
Here is the prompt used: input_variables=['context', 'question'] output_parser=None partial_variables={} template='<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.\nRead the given context before answering questions and think step by step. If you can not answer a user question based on \nthe provided context, inform the user. Do not use any other information for answering user. Provide a detailed answer to the question.<|eot_id|><|start_header_id|>user<|end_header_id|>\n            Context: {context}\n            User: {question}<|start_header_id|>assistant<|end_header_id|>' template_format='f-string' validate_template=True
2024-05-16 23:04:51,262 - INFO - run_localGPT.py:60 - Loading Model: meta-llama/Meta-Llama-3-8B-Instruct, on: cuda
2024-05-16 23:04:51,263 - INFO - run_localGPT.py:61 - This action can take a few minutes!
2024-05-16 23:04:51,263 - INFO - load_models.py:153 - Using AutoModelForCausalLM for full models
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1823, in _raise_on_head_call_error
    raise head_call_error
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 396, in _request_wrapper
    hf_raise_for_status(response)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 321, in hf_raise_for_status
    raise GatedRepoError(message, response) from e
huggingface_hub.utils._errors.GatedRepoError: 401 Client Error. (Request ID: Root=1-66469113-262d9507724414dd5cfb44fe;6fba667e-d2e6-408d-b6c7-9b2467267c52)

Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "//run_localGPT.py", line 285, in <module>
    main()
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "//run_localGPT.py", line 252, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type=model_type)
  File "//run_localGPT.py", line 142, in retrieval_qa_pipline
    llm = load_model(device_type, model_id=MODEL_ID, model_basename=MODEL_BASENAME, LOGGING=logging)
  File "//run_localGPT.py", line 74, in load_model
    model, tokenizer = load_full_model(model_id, model_basename, device_type, LOGGING)
  File "/load_models.py", line 154, in load_full_model
    tokenizer = AutoTokenizer.from_pretrained(model_id, cache_dir="./models/")
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 819, in from_pretrained
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 928, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 631, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 686, in _get_config_dict
    resolved_config_file = cached_file(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 416, in cached_file
    raise EnvironmentError(
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct.
401 Client Error. (Request ID: Root=1-66469113-262d9507724414dd5cfb44fe;6fba667e-d2e6-408d-b6c7-9b2467267c52)

Cannot access gated repo for url https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/resolve/main/config.json.
Access to model meta-llama/Meta-Llama-3-8B-Instruct is restricted. You must be authenticated to access it.
PromtEngineer / localGPT

How to authenticate to huggingface.co, from the run_localGPT.py script, using Docker? #797