Hello, I hope you are doing well. I just want to use your framework and fine-tuned model for simple Arabic audio-to-text conversion.
I've downloaded MGB2_ASR.pt and asr_spm.model, but I'm not sure how to use them. Will the code below work? And what should 'path-to-folder-with-checkpoints' be?
import torch
from artst.tasks.artst import ArTSTTask
from artst.models.artst import ArTSTTransformerModel
# Load the checkpoint
checkpoint = torch.load('MGB2_ASR.pt')
checkpoint['cfg']['task'].t5_task = 's2t'
checkpoint['cfg']['task'].data = 'path-to-folder-with-checkpoints'
task = ArTSTTask.setup_task(checkpoint['cfg']['task'])
model = ArTSTTransformerModel.build_model(checkpoint['cfg']['model'], task)
model.load_state_dict(checkpoint['model'])
import librosa
audio_input, sample_rate = librosa.load("path_to_your_audio_file.wav", sr=16000)
from transformers import Wav2Vec2Processor
processor = Wav2Vec2Processor.from_pretrained("path_to_your_local_processor")
input_values = processor(audio_input, return_tensors="pt", sampling_rate=16000).input_values