Since whisper-ts noticeably improves the precision of timestamp alignment, you might want to consider replacing the current whisper with it. I tested it, and it only requires changing a few lines of code:
import stable_whisper as whisper
[...]
# old:
# result = model.transcribe(audio_save_path, word_timestamps=True)
# new: transcribe first, then refine the word timestamps with a separate alignment pass
result = model.transcribe_minimal(audio_save_path, word_timestamps=True)
result = model.align(audio_save_path, result, language=result.language).to_dict()
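Because of the final .to_dict() call, the rest of the code should be able to keep treating result as a plain dictionary. As a rough sketch of what that downstream code might look like, assuming the dictionary follows the usual whisper layout ("segments" containing "words" with "word", "start" and "end" keys), the helper below walks the word timestamps. The name iter_words and the sample data are made up for illustration and are not part of either library:

```python
from typing import Any, Dict, Iterator, Tuple


def iter_words(result: Dict[str, Any]) -> Iterator[Tuple[str, float, float]]:
    """Yield (word, start, end) for every word in a whisper-style result dict.

    Assumes the layout result["segments"][i]["words"][j] with the keys
    "word", "start" and "end"; segments without word entries are skipped.
    """
    for segment in result.get("segments", []):
        for word in segment.get("words", []):
            yield word["word"].strip(), float(word["start"]), float(word["end"])


# Hand-written sample standing in for a real transcription result.
sample = {
    "text": " Hello world",
    "language": "en",
    "segments": [
        {
            "start": 0.0,
            "end": 0.9,
            "text": " Hello world",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.4},
                {"word": " world", "start": 0.5, "end": 0.9},
            ],
        }
    ],
}

for text, start, end in iter_words(sample):
    print(f"{start:6.2f} -> {end:6.2f}  {text}")
```

If the existing code already reads word timestamps this way, it should not need any further changes after the swap, but that is worth checking against a real result.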