Open gangagyatso4364 opened 1 month ago
The function used to get the time span from an STT file name:
def get_time_span(filename):
    # Strip the known audio extensions before parsing.
    filename = filename.replace(".wav", "")
    filename = filename.replace(".WAV", "")
    filename = filename.replace(".mp3", "")
    filename = filename.replace(".MP3", "")
    try:
        if "_to_" in filename:
            # "<prefix>_<start>_to_<end>" form: the difference is divided by 1000,
            # so the values are presumably milliseconds.
            start, end = filename.split("_to_")
            start = start.split("_")[-1]
            end = end.split("_")[0]
            end = float(end)
            start = float(start)
            return (end - start) / 1000
        else:
            # "<prefix>_<start>-<end>" form: the difference is returned unscaled.
            start, end = filename.split("-")
            start = start.split("_")[-1]
            end = end.split("_")[0]
            end = float(end)
            start = float(start)
            return abs(end - start)
    except Exception as err:
        print(f"filename is:'{filename}'. Could not parse to get time span.")
        return 0
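For the alignment work it may help to recover the start and end offsets themselves rather than only the duration. The following is a minimal sketch, not part of the existing code: `get_time_bounds` and `strip_audio_extension` are hypothetical helpers, the example file names are made up, and it assumes (as `get_time_span` suggests) that the `_to_` form is in milliseconds and the `-` form is in seconds.

```python
AUDIO_EXTENSIONS = (".wav", ".mp3")


def strip_audio_extension(filename):
    """Remove a trailing .wav/.mp3 extension, ignoring case."""
    lower = filename.lower()
    for ext in AUDIO_EXTENSIONS:
        if lower.endswith(ext):
            return filename[: -len(ext)]
    return filename


def get_time_bounds(filename):
    """Return (start_s, end_s) in seconds, or None if the name cannot be parsed."""
    stem = strip_audio_extension(filename)
    try:
        if "_to_" in stem:
            # Assumption: "<prefix>_<start>_to_<end>" with values in milliseconds.
            left, right = stem.split("_to_", 1)
            scale = 1000.0
        else:
            # Assumption: "<prefix>_<start>-<end>" with values in seconds.
            left, right = stem.rsplit("-", 1)
            scale = 1.0
        start = float(left.split("_")[-1]) / scale
        end = float(right.split("_")[0]) / scale
    except ValueError:
        # Raised both when a number is malformed and when no "-" is present.
        return None
    return start, end
```

Returning `None` instead of `0` on failure keeps unparseable files distinguishable from zero-length segments.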
Description:
We need to create conversations from NS audio using speaker diarisation and the timestamps of the existing STT data. Use the existing speaker diarisation model from pyannote.audio. The expected output is a JSON file.
Implementation:
The JSON should contain the conversation_id, the participants, and the dialogue.
Subtasks:
Transcription Alignment:
Data Structuring:
JSON Output Generation:
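One possible shape for these three subtasks, as a rough sketch only. It assumes the diarisation output has already been converted to plain `(start, end, speaker)` tuples in seconds and that each STT segment is a dict with `start`, `end`, and `text`. The function names, field names other than `conversation_id`, `participants`, and `dialogue`, and the sample data are all hypothetical. Each STT segment is assigned to the speaker turn it overlaps most.

```python
import json


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def build_conversation(conversation_id, stt_segments, speaker_turns):
    """Align STT segments to speaker turns and return a JSON-ready dict."""
    dialogue = []
    for seg in sorted(stt_segments, key=lambda s: s["start"]):
        best_speaker, best_overlap = "unknown", 0.0
        for t_start, t_end, speaker in speaker_turns:
            ov = overlap(seg["start"], seg["end"], t_start, t_end)
            if ov > best_overlap:
                best_speaker, best_overlap = speaker, ov
        dialogue.append(
            {
                "speaker": best_speaker,
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"],
            }
        )
    participants = sorted({turn["speaker"] for turn in dialogue})
    return {
        "conversation_id": conversation_id,
        "participants": participants,
        "dialogue": dialogue,
    }


# Made-up sample data, for illustration only.
stt_segments = [
    {"start": 0.0, "end": 2.0, "text": "hello"},
    {"start": 2.0, "end": 5.0, "text": "hi there"},
]
speaker_turns = [(0.0, 2.2, "SPEAKER_00"), (2.2, 5.0, "SPEAKER_01")]

conversation = build_conversation("conv_0001", stt_segments, speaker_turns)
print(json.dumps(conversation, ensure_ascii=False, indent=2))
```

`ensure_ascii=False` keeps non-ASCII transcript text readable in the output file. Segments that overlap no speaker turn are labelled `unknown` rather than dropped, so they can be reviewed later.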