ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
https://arxiv.org/abs/2409.06666
Apache License 2.0

Getting this model error when launching a model worker #10

hp2413 opened this issue 2 months ago

hp2413 commented 2 months ago

(llama-omni) Ubuntu@0008-dsm-prxmx30009:~/TestTwo/LLaMA-Omni$ python -m omni_speech.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000/ --port 40000 --worker http://localhost:40000/ --model-path Llama-3.1-8B-Omni --model-name Llama-3.1-8B-Omni --s2s
2024-09-14 00:03:16 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=40000, worker_address='http://localhost:40000/', controller_address='http://localhost:10000/', model_path='Llama-3.1-8B-Omni', model_base=None, model_name='Llama-3.1-8B-Omni', device='cuda', limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=False, use_flash_attn=False, input_type='mel', mel_size=128, s2s=True, is_lora=False)
2024-09-14 00:03:16 | ERROR | stderr | Traceback (most recent call last):
2024-09-14 00:03:16 | ERROR | stderr |   File "/home/Ubuntu/.conda/envs/llama-omni/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
2024-09-14 00:03:16 | ERROR | stderr |     response.raise_for_status()
2024-09-14 00:03:16 | ERROR | stderr |   File "/home/Ubuntu/.conda/envs/llama-omni/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
2024-09-14 00:03:16 | ERROR | stderr |     raise HTTPError(http_error_msg, response=self)
2024-09-14 00:03:16 | ERROR | stderr | requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/Llama-3.1-8B-Omni/resolve/main/tokenizer_config.json
2024-09-14 00:03:16 | ERROR | stderr |
2024-09-14 00:03:16 | ERROR | stderr | The above exception was the direct cause of the following exception:
2024-09-14 00:03:16 | ERROR | stderr |

@Poeroz, can you please help me with this issue? Thanks.

hp2413 commented 2 months ago

When I tried this:

curl -H "Authorization: Bearer hf_Key" -O https://huggingface.co/Llama-3.1-8B-Omni/resolve/main/tokenizer_config.json

I got this reply: "Repository not found".

Also, I am not able to understand what exactly we have to do in the Quick Start step:

1) Download the Llama-3.1-8B-Omni model from 🤗 Hugging Face.

When I try to access the page to download the model, I just land on the current main page. Can you please explain these steps? Thanks.

Poeroz commented 2 months ago

It seems that the issue is due to an incorrect path. You can download the model locally to the Llama-3.1-8B-Omni path in advance and then load the model. If you prefer to download the model automatically, you may need to change the model name from Llama-3.1-8B-Omni to ICTNLP/Llama-3.1-8B-Omni.
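
For reference, a minimal sketch of that first option using the huggingface_hub library (the local_dir value here is only an assumption; it just has to match whatever you pass to --model-path):

from huggingface_hub import snapshot_download

# Fetch the full ICTNLP/Llama-3.1-8B-Omni checkpoint into ./Llama-3.1-8B-Omni,
# so that --model-path Llama-3.1-8B-Omni resolves to a local directory instead
# of being treated as a (nonexistent) top-level Hub repo id.
snapshot_download(
    repo_id="ICTNLP/Llama-3.1-8B-Omni",
    local_dir="Llama-3.1-8B-Omni",
)

Since the launch command in the log above runs from the LLaMA-Omni checkout with a relative --model-path, the directory would sit directly under that checkout (e.g. ~/TestTwo/LLaMA-Omni/Llama-3.1-8B-Omni).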

hp2413 commented 2 months ago

Thank you for the reply, I am able to download the valid files now. But where do we place the Llama-3.1-8B-Omni model locally? At what path? @Poeroz

ayeganov commented 2 months ago

I have the same questions, but in addition, the model can no longer be downloaded with the transformers library. I used the following code to get the model:

from transformers import AutoModel

model_name = "ICTNLP/Llama-3.1-8B-Omni"
model = AutoModel.from_pretrained(model_name)

And it resulted in this error:

Traceback (most recent call last):
  File "/home/ayeganov/anaconda3/envs/llama-omni/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 989, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/home/ayeganov/anaconda3/envs/llama-omni/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 691, in __getitem__
    raise KeyError(key)
KeyError: 'omni_speech2s_llama'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/omni_speech_test/./test.py", line 4, in <module>
    model = AutoModel.from_pretrained(model_name)
  File "/home/ayeganov/anaconda3/envs/llama-omni/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 524, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/home/ayeganov/anaconda3/envs/llama-omni/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 991, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `omni_speech2s_llama` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

bekinsmingo commented 2 months ago

Since this model is not currently merged into transformers, it must be loaded using the repository's own model class. Loading was successful in the following manner:


from omni_speech.model.language_model.omni_speech_llama import OmniSpeechLlamaForCausalLM

model_name = 'ICTNLP/Llama-3.1-8B-Omni'
model = OmniSpeechLlamaForCausalLM.from_pretrained(model_name)
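
One caveat, offered as an assumption rather than something verified: the KeyError in the traceback above shows the checkpoint's model_type is omni_speech2s_llama, so the speech-to-speech variant of this class may be the intended loader for the --s2s setup. A sketch, with the module and class names inferred from the layout of the class above:

# Hypothetical counterpart for the speech-to-speech (omni_speech2s_llama)
# checkpoint; verify the module and class names against your checkout.
from omni_speech.model.language_model.omni_speech2s_llama import OmniSpeech2SLlamaForCausalLM

model = OmniSpeech2SLlamaForCausalLM.from_pretrained('ICTNLP/Llama-3.1-8B-Omni')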