Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and free translation APIs. Interface made using Tkinter. Code written fully in Python.
MIT License

fix translate with timestamps #20

Closed ryanhe312 closed 1 year ago

ryanhe312 commented 1 year ago

Very nice project! It helps me a lot when making subtitles. However, there is a minor problem when translating subtitles into non-English languages.

Problem:

Translating an SRT file together with its indexes and timestamps causes two problems.

1. The cue indexes get translated, which breaks the SRT format

e.g. when translating to Chinese, the indexes come back with "个" appended or as "四", which spoils the format

1个
00:00:00,000 --> 00:00:02,000
……

2个
00:00:06,400 --> 00:00:08,400
已经是时候了...

3个
00:00:10,080 --> 00:00:13,760
做早餐的时候发现晚了。。。

四
00:00:16,000 --> 00:00:18,000
我还没醒...

2. Redundant API requests are made for the index and timestamp lines

Fix:

Convert the SRT to plain text before translation, and save the timestamps so they can be restored afterwards.
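
The idea can be sketched roughly as follows. This is only an illustration, not the code in this PR: it assumes a well-formed SRT (index line, timing line, then text), and the names `split_srt`, `build_srt`, `translate_srt` and the `translate` callable are hypothetical stand-ins for whatever translation backend is used.

```python
import re
from typing import Callable, List, Tuple

# Matches an SRT timing line such as "00:00:06,400 --> 00:00:08,400".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def split_srt(srt: str) -> Tuple[List[str], List[str]]:
    """Return (timing_lines, texts), one entry per cue (hypothetical helper)."""
    timings: List[str] = []
    texts: List[str] = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        # Assumed layout: index line, timing line, then the subtitle text.
        if len(lines) >= 2 and TIMING.match(lines[1]):
            timings.append(lines[1].strip())
            texts.append("\n".join(lines[2:]))
    return timings, texts


def build_srt(timings: List[str], texts: List[str]) -> str:
    """Rebuild an SRT string, regenerating the indexes as 1, 2, 3, ..."""
    cues = [
        f"{i}\n{timing}\n{text}"
        for i, (timing, text) in enumerate(zip(timings, texts), start=1)
    ]
    return "\n\n".join(cues) + "\n"


def translate_srt(srt: str, translate: Callable[[str], str]) -> str:
    """Translate only the subtitle text; indexes and timestamps never reach the API."""
    timings, texts = split_srt(srt)
    translated = [translate(text) for text in texts]
    return build_srt(timings, translated)
```

Because the indexes are regenerated and the timing lines are copied back untouched, the translator cannot alter them, and only one request per cue text is needed.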

Dadangdut33 commented 1 year ago

Nice, thanks for the help!