Closed vladgrand2 closed 5 months ago
I need help with this as well. It would be good if the result could be written to a JSON file rather than a txt file.
Can you post an example JSON so I can replicate the scheme?
test (whisperx_output).json test(after script).json
I wrote some temporary code that converts the SRT files produced by diarize.py into whisperx-style JSON, which I need for further processing in my work.
import json
import sys


def convert_time_to_seconds(time):
    # Convert an SRT timestamp "HH:MM:SS,mmm" to seconds as a float.
    h, m, s = time.split(':')
    s, ms = s.split(',')
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def convert_srt_to_json(srt_file):
    segments = []
    speaker = ''
    words = [{}]  # placeholder: the SRT carries no word-level timings
    with open(srt_file, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    for i in range(len(lines)):
        line = lines[i].strip()
        # A cue starts with a numeric index followed by a timing line.
        if line.isdigit() and i + 2 < len(lines) and ' --> ' in lines[i + 1]:
            start, end = lines[i + 1].strip().split(' --> ')
            text = lines[i + 2].strip()
            # Check whether the text contains "Speaker 1:" or "Speaker 0:".
            # If neither is present, the previous speaker is kept.
            if 'Speaker 1:' in text:
                speaker = 'SPEAKER_01'
                text = text.replace('Speaker 1:', '').strip()
            elif 'Speaker 0:' in text:
                speaker = 'SPEAKER_00'
                text = text.replace('Speaker 0:', '').strip()
            segments.append({
                'start': convert_time_to_seconds(start),
                'end': convert_time_to_seconds(end),
                'text': text,
                'words': words,
                'speaker': speaker
            })
    json_data = {
        'segments': segments
    }
    json_file = srt_file.replace('.srt', '.json')
    with open(json_file, 'w', encoding='utf-8') as file:
        json.dump(json_data, file, indent=4, ensure_ascii=False)
    print(f"Successfully converted SRT to JSON: {json_file}")


if __name__ == '__main__':
    srt_file = sys.argv[1]
    convert_srt_to_json(srt_file)
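The script above only handles two speakers and reads only the first text line of each cue. A possible generalization is sketched below. It assumes cues are separated by blank lines and that labels look like "Speaker N:" at the start of the cue text, as in the script above. The names `parse_srt`, `to_seconds` and `srt_to_json_text` are made up for this example and are not part of diarize.py or whisperx. The `words` field is left out because the SRT has no word-level timings.

```python
import json
import re

# Assumed label format: "Speaker N:" at the start of the cue text.
SPEAKER_RE = re.compile(r'^Speaker (\d+):\s*')
TIME_RE = re.compile(r'(\d+):(\d+):(\d+),(\d+)')


def to_seconds(stamp):
    """Convert an SRT timestamp "HH:MM:SS,mmm" to seconds."""
    h, m, s, ms = (int(x) for x in TIME_RE.match(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(content):
    """Parse SRT text into a dict with a 'segments' list."""
    segments = []
    speaker = ''  # carried over when a cue has no speaker label
    # Cues are separated by one or more blank lines.
    for block in re.split(r'\n\s*\n', content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or ' --> ' not in lines[1]:
            continue  # skip anything that is not a complete cue
        start, end = lines[1].split(' --> ')
        # Join multi-line cue text into a single string.
        text = ' '.join(part.strip() for part in lines[2:])
        match = SPEAKER_RE.match(text)
        if match:
            speaker = f'SPEAKER_{int(match.group(1)):02d}'
            text = text[match.end():]
        segments.append({
            'start': to_seconds(start),
            'end': to_seconds(end),
            'text': text,
            'speaker': speaker,
        })
    return {'segments': segments}


def srt_to_json_text(content):
    """Serialize the parsed segments as indented JSON text."""
    return json.dumps(parse_srt(content), indent=4, ensure_ascii=False)
```

Calling `srt_to_json_text(open('test.srt', encoding='utf-8').read())` would return the JSON as a string, which can then be written to a file next to the SRT.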
I also recommend adding this to diarize.py:
parser.add_argument(
    "--language",
    dest="language",
    default=None,
    help="language spoken in the audio, specify None to perform language detection",
)
This is needed for correct transcription. By default whisperx tries to detect the language, and in many cases that ruins the transcription.
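To show how the option would behave, here is a small standalone sketch using only argparse. `build_parser` is a hypothetical helper for this example. How `args.language` is then handed to the transcription call depends on the internals of diarize.py and is not shown.

```python
import argparse


def build_parser():
    # Hypothetical helper: a parser containing only the proposed option.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--language",
        dest="language",
        default=None,
        help="language spoken in the audio, specify None to perform language detection",
    )
    return parser


# Without the flag the value stays None, so language detection would still run.
detected = build_parser().parse_args([])
# With the flag the given language code is available as args.language.
forced = build_parser().parse_args(["--language", "ru"])
print(detected.language, forced.language)
```

With `default=None`, existing command lines keep working unchanged, and the language is only forced when the flag is passed.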
I also noticed that on Windows diarize.py does not work with large-v3, while on Linux everything is fine. The original whisperx works fine with large-v3 on Windows after updating.
Is it possible to add a parameter to diarization.py to get a JSON output file with speakers, like in whisperx?