Open nexuslux opened 4 months ago
This token is deactivated. You can use your own token.
I have changed the HF token to my own in the /cli/transcribe.py file...
And used the example code: python -m pyannote_whisper.cli.transcribe data/afjiv.wav --model tiny --diarization True
It still doesn't work. Am I missing something?
import whisper
from pyannote.audio import Pipeline
from pyannote_whisper.utils import diarize_text
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="hf_xxxxx -replace with my own")
model = whisper.load_model("tiny.en")
asr_result = model.transcribe("data/afjiv.wav")
diarization_result = pipeline("data/afjiv.wav")
final_result = diarize_text(asr_result, diarization_result)
for seg, spk, sent in final_result:
    line = f'{seg.start:.2f} {seg.end:.2f} {spk} {sent}'
    print(line)
The code in the README also doesn't work.
@nexuslux Have you accepted the access conditions on the Hugging Face repositories? You'll need to agree to the terms for each of the repositories pyannote uses. That would be pyannote/segmentation and pyannote/speaker-diarization.
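For anyone hitting the same thing, here is a rough sketch of how the token and access check could be made explicit, so a missing token or unaccepted terms fails loudly instead of silently skipping diarization. The helper names `read_hf_token` and `load_diarization_pipeline` and the `HF_TOKEN` variable name are made up for this example and are not part of pyannote_whisper. The `None` check assumes that your pyannote.audio version returns `None` from `Pipeline.from_pretrained` when the gated repo cannot be downloaded; other versions may raise instead.

```python
import os


def read_hf_token(env=os.environ, var="HF_TOKEN"):
    """Return the token stored in `var` (hypothetical name), or raise if unusable."""
    token = env.get(var, "").strip()
    # Reject an empty value or the "hf_xxxxx" placeholder from the example code.
    if not token or "xxxxx" in token:
        raise ValueError(f"Set {var} to your own Hugging Face access token")
    return token


def load_diarization_pipeline(token):
    """Load the diarization pipeline and fail loudly if it could not be loaded."""
    # Imported lazily so the token check above works without pyannote.audio.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization", use_auth_token=token
    )
    # Assumption: a failed download of a gated repo yields None in some versions.
    if pipeline is None:
        raise RuntimeError(
            "Pipeline not loaded: accept the terms for pyannote/segmentation "
            "and pyannote/speaker-diarization with the account that owns the token"
        )
    return pipeline


# Usage (needs pyannote.audio installed and network access):
# pipeline = load_diarization_pipeline(read_hf_token())
# diarization_result = pipeline("data/afjiv.wav")
```

Reading the token from the environment also avoids committing a personal token into /cli/transcribe.py.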
It seems that it now only performs the transcription and no longer the diarization. See below, based on the shared example file (for which the repo is still using yinruiqing's HF token, as pointed out by Jordi in another thread). So scary~