gabrilator closed this issue 1 year ago
Hey @gabrilator! Thanks for opening this issue, awesome to have you here!
That's a great question! Since we use the Hugging Face ASR pipeline as our backend, we can simply pass the path to our audio file as the audio input:
import torch
from speechbox import ASRDiarizationPipeline

# Use the first GPU if one is available, otherwise fall back to the CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"
pipeline = ASRDiarizationPipeline.from_pretrained("openai/whisper-tiny", device=device)

# Path to your own audio file
path_to_audio = "path/to/audio/file"  # fill me!
out = pipeline(path_to_audio)
print(out)
See the ASR pipeline docs for more details 🤗
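If you have several files, the same idea extends to a loop. Here is a minimal sketch, assuming the pipeline accepts a local path string as in the snippet above; `transcribe_files` is a hypothetical helper for illustration, not part of speechbox:

```python
from pathlib import Path


def transcribe_files(pipeline, paths):
    """Run `pipeline` on each audio file and collect the outputs by file name.

    `pipeline` is any callable that takes a path string, for example the
    ASRDiarizationPipeline instance created earlier (assumption: it accepts
    local file paths, as described in this thread).
    """
    results = {}
    for path in map(Path, paths):
        # Fail early with a clear message instead of a decoding error later
        if not path.is_file():
            raise FileNotFoundError(f"Audio file not found: {path}")
        results[path.name] = pipeline(str(path))
    return results
```

You would then call something like `transcribe_files(pipeline, ["audio.wav", "audio.mp3"])` and get back a dictionary keyed by file name. Note that two files with the same name in different folders would overwrite each other in this sketch.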
Thanks Sanchit! Keep up the amazing work, I'm mind-blown by you guys!!
Hey! First of all, thanks for all the amazing work.
I am trying to get the diarization to work with custom audio samples (e.g. audio.mp3 or audio.wav files), and I would like to know how to load them before calling the pipeline.
In particular, I'd like to substitute this sample with my own files:
concatenated_librispeech = load_dataset("sanchit-gandhi/concatenated_librispeech", split="train", streaming=True)
sample = next(iter(concatenated_librispeech))
Sorry about my ignorance, I'm much more used to NodeJS and am finding it challenging to follow everything!