facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

fairseq2.assets.metadata_provider.AssetNotFoundError: An asset with the name '/Models/seamlessM4T_v2_large.pt' cannot be found. #364

Closed dawit3228 closed 8 months ago

dawit3228 commented 9 months ago

I have downloaded both the seamlessM4T_v2_large model and vocoder_v2, and now I want to load them from the downloaded path instead of having them downloaded and saved to the cache. I would really appreciate your help.

Note: Yes, I have tried letting it download to the cache, and that works. The only problem is that when I run the code below, giving it the directory where seamlessM4T_v2_large and vocoder_v2 are stored, I get this error.

Error:

(real_api) david@davemoment:~/Documents/Projects/real1/src$ python STT.py
Traceback (most recent call last):
  File "/home/david/Documents/Projects/real1/src/STT.py", line 22, in <module>
    translator = Translator(
  File "/home/david/Documents/Projects/real_api/lib/python3.10/site-packages/seamless_communication/inference/translator.py", line 93, in __init__
    model_name_or_card = asset_store.retrieve_card(model_name_or_card)
  File "/home/david/Documents/Projects/real_api/lib/python3.10/site-packages/fairseq2/assets/store.py", line 60, in retrieve_card
    return self._do_retrieve_card(name, envs)
  File "/home/david/Documents/Projects/real_api/lib/python3.10/site-packages/fairseq2/assets/store.py", line 77, in _do_retrieve_card
    metadata = self._get_metadata(name)
  File "/home/david/Documents/Projects/real_api/lib/python3.10/site-packages/fairseq2/assets/store.py", line 123, in _get_metadata
    raise AssetNotFoundError(f"An asset with the name '{name}' cannot be found.")
fairseq2.assets.metadata_provider.AssetNotFoundError: An asset with the name '/Models/seamlessM4T_v2_large.pt' cannot be found.

Here is the Python code:

import io
import json
import matplotlib as mpl
import matplotlib.pyplot as plt
import mmap
import numpy
import soundfile
import torchaudio
import torch

from collections import defaultdict
from IPython.display import Audio, display
from pathlib import Path
from pydub import AudioSegment

from seamless_communication.inference import Translator
from seamless_communication.streaming.dataloaders.s2tt import SileroVADSilenceRemover

model_path = "/Models/seamlessM4T_v2_large.pt"
vocoder_path = "/Models/vocoder_v2.pt"

translator = Translator(
    model_path,
    vocoder_path,
    device=torch.device("cuda:0"),
    dtype=torch.float16,
)

tgt_langs = ("eng", )

def translate_audio(input_file, tgt_langs):
    translation_results = {}
    for tgt_lang in tgt_langs:
        text_output, _ = translator.predict(
            input=input_file,
            task_str="s2tt",
            tgt_lang=tgt_lang,
        )
        translation_results[tgt_lang] = text_output[0]
    return translation_results
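
The traceback shows that `Translator` passes its first argument to `asset_store.retrieve_card`, which looks the value up as an asset name, so the filesystem path `/Models/seamlessM4T_v2_large.pt` is treated as a name and not found. A minimal sketch of one possible direction follows. It is not confirmed by this thread, and it assumes that the asset names match the checkpoint file names without the extension and that the asset card's `checkpoint` field accepts a `file://` URI. Both helper functions are hypothetical and use only the standard library.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def card_name_from_checkpoint(path: str) -> str:
    """Hypothetical helper: drop the directory and extension from a
    checkpoint path, leaving a string that looks like an asset name."""
    return PurePosixPath(path).stem


def checkpoint_uri(path: str) -> str:
    """Hypothetical helper: build a file:// URI for an absolute POSIX path."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    return "file://" + quote(path)


model_path = "/Models/seamlessM4T_v2_large.pt"
vocoder_path = "/Models/vocoder_v2.pt"

# Names to pass to Translator instead of paths (assumption: these match
# the registered asset cards).
model_name = card_name_from_checkpoint(model_path)
vocoder_name = card_name_from_checkpoint(vocoder_path)

# URI that could be written into the card's `checkpoint` field
# (assumption: the card format accepts file:// URIs).
model_uri = checkpoint_uri(model_path)

print(model_name, vocoder_name, model_uri)
```

With these values, the `Translator(...)` call would receive `model_name` and `vocoder_name` as its first two arguments, and the local files would be referenced from the asset cards. Whether the installed fairseq2 version resolves the cards this way should be checked against its documentation.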