yazinsai / srt-ai

Translate SRT files to any language, using AI magic ✨
https://translatesrt.com/

persistent sync problem #21

Open GGMaia opened 7 months ago

GGMaia commented 7 months ago

Even after the update to GPT-4, the subtitle synchronization error is still there, much as before (at least when I translate into my language, Portuguese). ChatGPT seems to "eat" some lines while keeping the timestamps, which pushes every following line in the file badly out of sync. I don't know whether this is fixable, since the mistake is made by ChatGPT, not your code.

The original text: image

the translated one: image

yazinsai commented 7 months ago

hey @NeroQuill, thanks for the detailed report. I'm going to be adding a validation step to ensure output segments from GPT-4 always match the number of input segments.
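For readers following along, a check like the one described above could be sketched roughly as follows. This is not the code from the repository: `Segment`, `parse_srt`, and `validate_segment_count` are hypothetical names, and the parser assumes a well-formed SRT file whose blocks are separated by blank lines.

```python
import re
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One subtitle block (hypothetical helper type, not from srt-ai)."""
    index: int
    timestamp: str
    text: str


def parse_srt(content: str) -> List[Segment]:
    """Split SRT text into segments, assuming blank-line-separated blocks."""
    # Normalize line endings and drop a leading byte-order mark if present.
    normalized = content.replace("\r\n", "\n").lstrip("\ufeff").strip()
    segments = []
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 2:
            continue  # skip empty or malformed blocks
        segments.append(
            Segment(int(lines[0].strip()), lines[1].strip(), "\n".join(lines[2:]))
        )
    return segments


def validate_segment_count(source: List[Segment], translated: List[Segment]) -> None:
    """Raise ValueError if the translation has a different number of segments."""
    if len(source) != len(translated):
        raise ValueError(
            f"expected {len(source)} segments, got {len(translated)}"
        )
```

A count check only signals that something was dropped or merged in a chunk; it does not say which line, so the caller would still need to retry or inspect that chunk.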

GGMaia commented 7 months ago

Thanks for your response. I tried translating 2 different files, and both show the synchronization error at a similar timestamp. Here's a side-by-side comparison of the exact timestamp at which ChatGPT "eats" a line:

image

I'd like to know whether this error only happens to me, or whether other people see it too, in Portuguese or in other languages. I ask because a couple of things in my workflow might be interfering with ChatGPT's ability to translate without eating lines: I take an original .ASS file and convert it to .SRT, and then I delete around the first 500 lines of the subtitle, since they are the episode's opening (I'm translating One Pace into Portuguese; it is a condensed edit of the One Piece anime that only has English subtitles). Other than that, I don't change anything else in the file, which appears to keep the structure of a valid .SRT file.

Thank you in advance for your work, which is brilliant and very important for the subtitles niche in the world. If you can make this work completely, it will be a perfect job.

yamandesign commented 6 months ago

I have the same issue. It always skips some lines, so the sequence numbers and the sentences no longer match. How can we fix it?

yazinsai commented 6 months ago

I'm working on a fix, will keep this post updated

yazinsai commented 6 months ago

Streamed my attempt here: https://www.youtube.com/live/ScnHkYKvtRE

Made some progress by converting the response to JSON, but it still occasionally skips/merges some lines! 🫤
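One way the JSON idea could be hardened, sketched here purely as an illustration and not as what the stream or the repository actually does: send only an `id` and the text to the model, keep the timestamps locally, and merge the reply back by `id`. The dictionary keys (`id`, `timestamp`, `text`) and the function names are assumptions made for this example, and the model call itself is left out.

```python
import json
from typing import Dict, List, Tuple


def build_payload(segments: List[Dict]) -> str:
    """Serialize only id and text, so timestamps never pass through the model."""
    return json.dumps(
        [{"id": s["id"], "text": s["text"]} for s in segments],
        ensure_ascii=False,
    )


def merge_translation(
    segments: List[Dict], response_json: str
) -> Tuple[List[Dict], List[int]]:
    """Attach translated text to the original segments by id.

    Returns the merged segments and the ids missing from the response.
    """
    translated = {item["id"]: item["text"] for item in json.loads(response_json)}
    merged, missing = [], []
    for s in segments:
        if s["id"] in translated:
            merged.append({**s, "text": translated[s["id"]]})
        else:
            # Keep the untranslated text so later lines stay aligned.
            missing.append(s["id"])
            merged.append(dict(s))
    return merged, missing
```

With this shape, a skipped line shows up as a missing `id` that can be retried on its own, instead of shifting every later line. A merged line (two sentences returned under one `id`) would still need a separate check.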

yamandesign commented 6 months ago

Is there any update on this issue?

Sptzzz commented 5 months ago

Having the same issue. Seems to not be fixed so it's not reliable for day to day use atm :(