Open mirix opened 11 months ago
The script below fails with an assertion error while decoding the transcript:
  File "/home/emoman/Downloads/nemo/lib/python3.8/site-packages/speechbox/restore.py", line 79, in __call__
    assert (
AssertionError: Decoding of
from speechbox import PunctuationRestorer
import librosa
import whisper
device = 'cuda'
model_size = 'large-v2'
file_path = 'wav_file.wav'
modelw = whisper.load_model(model_size, device=device)
modelw.to(device)
### Transcription ###
result = modelw.transcribe(file_path, beam_size=5, word_timestamps=True)
### Sentence splitting ###
word_list = []
for segment in result['segments']:
    for word in segment['words']:
        word_list.append(word['word'])
full_text = ''.join([str(i) for i in word_list])
### Punctuation ###
audio_data, sample_rate = librosa.load(file_path)
restorer = PunctuationRestorer.from_pretrained('openai/whisper-large-v2')
restorer.to(device)
restored_text, log_probs = restorer(audio_data, full_text, sampling_rate=sample_rate, num_beams=5)
print('Restored text:\n', restored_text)
Hello,
Is it possible to use the punctuation restoration function on a pre-existing transcript and a wav audio file?
If so, how?
Best,
Ed
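
One thing that may be worth checking, offered as an assumption rather than a confirmed cause of the assertion: Whisper's word tokens already carry punctuation and a leading space, while a punctuation restorer is presumably meant to receive plain, unpunctuated text. The minimal sketch below uses only the Python standard library to turn such tokens into a plain transcript. `join_words` and `strip_punctuation` are hypothetical helper names, not part of speechbox or whisper, and the sample tokens are made up.

```python
import re
import string


def join_words(words):
    """Concatenate word tokens that each carry their own leading space."""
    return "".join(str(w) for w in words).strip()


def strip_punctuation(text):
    """Replace punctuation with spaces (keeping apostrophes) and collapse whitespace."""
    # Keep apostrophes so contractions such as "It's" survive.
    to_replace = string.punctuation.replace("'", "")
    table = str.maketrans({ch: " " for ch in to_replace})
    return re.sub(r"\s+", " ", text.translate(table)).strip()


# Made-up tokens in the shape of Whisper's word['word'] values.
words = [" Hello,", " world.", " It's", " fine!"]
plain_text = strip_punctuation(join_words(words))
print(plain_text)
```

If the assumption holds, `plain_text` would be passed to the restorer in place of `full_text`. Note that this simple version also splits numbers such as "3.5" into "3 5", so it may need adjusting for real transcripts.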