Open srijanb97 opened 4 years ago
The text alignment tool works very well and is quite accurate. However, I have audio files with more than one speaker. Is there any way to detect which speaker utters each word?
That's not really what gentle is made for, but there are other tools you can use.
@srijanb97 look into "diarization". NeMo seems to be pretty good, but there are others (e.g., pyannote)
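Once you have diarization output, one way to combine it with the alignment is to give each word the speaker whose turn overlaps it the most. Below is a minimal sketch of that idea, not something gentle does itself. It assumes the aligned words are dicts with `start` and `end` times in seconds (words that could not be aligned simply lack those keys), and that the diarization tool's output has already been converted to `(start, end, speaker)` tuples. The function names and the sample data are made up for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Label each aligned word with the speaker turn it overlaps most.

    words: list of dicts, each with "word" and (if aligned) "start"/"end".
    turns: list of (start, end, speaker) tuples from a diarization tool.
    Words without timings, or overlapping no turn, get speaker None.
    """
    labeled = []
    for w in words:
        speaker = None
        if "start" in w and "end" in w:
            best_overlap = 0.0
            for t_start, t_end, t_speaker in turns:
                o = overlap(w["start"], w["end"], t_start, t_end)
                if o > best_overlap:
                    best_overlap = o
                    speaker = t_speaker
        labeled.append({**w, "speaker": speaker})
    return labeled


# Hypothetical sample data, for illustration only.
words = [
    {"word": "hello", "start": 0.2, "end": 0.6},
    {"word": "there", "start": 0.7, "end": 1.1},
    {"word": "hi", "start": 1.4, "end": 1.6},
    {"word": "um"},  # not aligned, so no timings
]
turns = [
    (0.0, 1.2, "SPEAKER_00"),
    (1.2, 2.0, "SPEAKER_01"),
]

labeled = assign_speakers(words, turns)
for w in labeled:
    print(w["word"], w["speaker"])
```

Maximum overlap is a simple choice; words that straddle a speaker change or fall in overlapping speech may still be mislabeled, so spot-check the boundaries.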