gkovacsp / gpt_srt_translator


It seems GPT will combine some lines together #1

Open B1gM8c opened 1 year ago

B1gM8c commented 1 year ago

Hello, I used the script you provided. Is there a way to optimize the prompt? The content returned by GPT tends to merge closely related lines of the SRT file into one, so the translated output ends up with fewer lines than the original.

Here is an example where the run reported "Missing 5 lines":

Original:
[00:04:56,300 --> 00:04:57,300] Is it worse?
[00:04:57,300 --> 00:05:01,660] I have a few idea on how I can test this, and I think I'm gonna try by using the example
[00:05:01,660 --> 00:05:03,740] provided on the website.
[00:05:03,740 --> 00:05:08,620] Because the first example they provided is to explain the plot of Cinderella in a sentence
[00:05:08,620 --> 00:05:14,060] where each word has to begin with the next letter in the alphabet from A to Z, without
[00:05:14,060 --> 00:05:15,860] repeating any letters.
[00:05:15,860 --> 00:05:21,740] And if you remember correctly, from the live demo, we saw that GPT-3.5 has a lot of trouble
[00:05:21,740 --> 00:05:24,740] with this very simple example.
[00:05:24,740 --> 00:05:27,100] So I'm hoping that GPT-4 gets better result.
[00:05:27,100 --> 00:05:30,540] So I'm just gonna select the example, CTRL-C to copy it.

Translated:
[00:04:56,300 --> 00:04:57,300] 更糟了吗?
[00:04:57,300 --> 00:05:01,660] 我有几个测试想法,我想我会尝试使用网站上提供的例子。
[00:05:01,660 --> 00:05:03,740] 因为他们提供的第一个例子是用一句话解释灰姑娘的情节,
[00:05:03,740 --> 00:05:08,620] 每个单词必须从A到Z按顺序开始,不重复任何字母。
[00:05:08,620 --> 00:05:14,060] 如果你还记得的话,从现场演示中,我们看到GPT-3.5在这个非常简单的例子中遇到了很多困难。
[00:05:14,060 --> 00:05:15,860] 所以我希望GPT-4能得到更好的结果。
[00:05:15,860 --> 00:05:21,740] 所以我只是选择例子,按CTRL-C复制它。

video     :  29%|███████████▍                            | 8/28 [03:57<09:22, 28.13s/it]Missing 5 lines
gkovacsp commented 1 year ago

You are right, it does this every now and then and I could not get around it. This is the reason I send in only 10 lines per conversion, so the shift caused by the missing line fixes itself quickly. Otherwise 40-50 lines could be translated in one go.
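Roughly, the batching looks like this (simplified sketch, not the exact code from the script; `translate_batch` stands in for the actual API call):

    BATCH_SIZE = 10  # small batches so a merged line only shifts a few subtitles

    def translate_in_batches(subtitles, translate_batch):
        # `subtitles` is a list of (timestamp, text) pairs parsed from the SRT file;
        # `translate_batch` translates one slice and returns the translated pairs.
        translated = []
        for start in range(0, len(subtitles), BATCH_SIZE):
            translated.extend(translate_batch(subtitles[start:start + BATCH_SIZE]))
        return translated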

In my experience it happens only once out of 10 rounds, so it is not that disturbing. Maybe it merges more lines in Chinese.

I've tried to phrase the prompt many different ways without any success.

Let me know if you can figure out how to do it.

gkovacsp commented 1 year ago

Apart from the shift, how is the quality of the translation?

B1gM8c commented 1 year ago

> Apart from the shift, how is the quality of the translation?

I think translating sentence by sentence performs better than some other machine translation tools, since it has some context to work with. However, if GPT could read the entire SRT content before translating, the results would be even better. Currently, I feel the prompts could be optimized.

gkovacsp commented 1 year ago

It does read the context of the lines sent together; unfortunately the number of lines sent together must be optimized to avoid shifts. More lines means more context, but also more shifts...

I've played with the Norwegian translation and there were only 3 shifts during a 40-minute episode. Hungarian results in 5-10 shifts.

B1gM8c commented 1 year ago

I found another project that is similar to this:

https://github.com/gnehs/subtitle-translator-electron/blob/main/src/components/translator.tsx

It provides a pretty good prompt:

        {
          role: "system",
          content: `You are a program responsible for translating subtitles. Your task is to output the specified target language based on the input text. Please do not create the following subtitles on your own. Please do not output any text other than the translation. You will receive the subtitles as array that needs to be translated, as well as the previous translation results and next subtitle. If you need to merge the subtitles with the following line, simply repeat the translation. Please transliterate the person's name into the local language. Target language: ${targetLanguage}\n\n${additionalNotes}`
        },

However, I don't think the prompt alone is what produces the good result.

The translation method they use: simply put, it translates one subtitle line at a time and sends the previous 4 lines along with it. This ensures that only one subtitle is returned at a time, while the result remains accurate, without the worry of ChatGPT taking it upon itself to merge the subtitles for you.
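If I read their code right, the approach boils down to something like this (my rough sketch using the current OpenAI Python client, not their actual code; the model name and prompt wording are just placeholders):

    from openai import OpenAI

    client = OpenAI()
    CONTEXT_LINES = 4  # previously translated lines sent along as context

    def translate_line(line, previous_translations, target_language):
        # One subtitle line per request, plus the last few translations as
        # context, so the model cannot merge lines within one response.
        context = "\n".join(previous_translations[-CONTEXT_LINES:])
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[
                {"role": "system",
                 "content": f"You translate subtitles into {target_language}. "
                            "Output only the translation of the last line, nothing else."},
                {"role": "user",
                 "content": f"Previous translations:\n{context}\n\nTranslate this line:\n{line}"},
            ],
        )
        return response.choices[0].message.content.strip()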

gkovacsp commented 1 year ago

I'm thinking about analyzing the length of the returned lines and then adding an empty block if two lines were merged. Based on my visual checks it might be possible; the script could probably even prompt the user to decide what should happen. The prompt matters a lot, so I might try the stricter wording of the referenced project.
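A minimal sketch of that re-alignment idea (assuming the batch call returns one translated text per original line; the merge-point heuristic is only a guess):

    def realign(original_lines, translated_lines):
        # If fewer lines came back than were sent, pad with empty blocks so the
        # timestamps stay aligned. The merge point is guessed as the translated
        # line that is unusually long compared to its source line.
        missing = len(original_lines) - len(translated_lines)
        if missing <= 0:
            return translated_lines
        if not translated_lines:
            return [""] * missing
        ratios = [len(t) / max(len(o), 1)
                  for o, t in zip(original_lines, translated_lines)]
        merge_at = max(range(len(ratios)), key=ratios.__getitem__)
        return (translated_lines[:merge_at + 1]
                + [""] * missing
                + translated_lines[merge_at + 1:])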

This might make a little difference: "If you need to merge the subtitles with the following line, simply repeat the translation"

gkovacsp commented 1 year ago

I've tested the modified prompt and it does have some effect: it fixes the missing line when a clear new sentence starts in the text - it skips one of the timestamps and re-aligns with the original. It might allow sending in longer slices.

gkovacsp commented 1 year ago

Check the latest version. I tested it with Norwegian --> English and English --> Hungarian; it only misses a few lines and fixes the shift within the next 2-3 subtitles. Best result so far.

It also finds the end of sentences before sending them in, so there is no half sentence at the beginning or end of a slice when translating.
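The sentence-boundary cut is roughly this (simplified sketch, not the exact code from the script):

    SENTENCE_END = (".", "!", "?")

    def slice_end(texts, start, max_lines):
        # Return the index just past the last subtitle in the window whose text
        # ends a sentence, so no half sentence crosses the slice boundary.
        # Falls back to the hard limit if no sentence ends inside the window.
        limit = min(start + max_lines, len(texts))
        for i in range(limit - 1, start - 1, -1):
            if texts[i].rstrip().endswith(SENTENCE_END):
                return i + 1
        return limit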

B1gM8c commented 1 year ago

Maybe there's still some small problem and it jumps out of the code:

video     :   0%|                                        | 0/249 [00:00<?, ?it/s]Missing 2 line(s)
video     :   6%|██▏                                     | 14/249 [00:18<05:04,  1.30s/it]Missing 1 line(s)
video     :  15%|██████                                  | 38/249 [00:50<04:41,  1.33s/it]Missing 1 line(s)
video     :  20%|████████                                | 50/249 [01:04<04:14,  1.28s/it]Missing 1 line(s)
video     :  35%|█████████████▉                          | 87/249 [01:47<03:12,  1.19s/it]Unsuccesful OpenAI operation. Error: HTTP code 502 from API (<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>
)
Error during translation, wating for 10 sec to overcome rate limitation...
Trying again...
Missing 1 line(s)
video     :  40%|███████████████▉                        | 99/249 [07:34<24:38,  9.86s/it]Missing 4 line(s)
video     :  45%|█████████████████▊                      | 111/249 [07:48<16:29,  7.17s/it]Missing 1 line(s)
video     :  50%|███████████████████▉                    | 124/249 [08:03<10:53,  5.23s/it]Missing 2 line(s)
video     :  55%|██████████████████████                  | 137/249 [08:18<07:22,  3.95s/it]Missing 3 line(s)
video     :  80%|███████████████████████████████▊        | 198/249 [09:26<01:21,  1.61s/it]Missing 1 line(s)
video     : |                                        | 251/? [10:22<00:00,  2.48s/it]

I was wondering if the API key was being rate limited.

B1gM8c commented 1 year ago

Maybe I should try it later.

gkovacsp commented 1 year ago

If you happen to chat with ChatGPT while running this, you might end up with a conflict, because you are only allowed one session at a time.

gkovacsp commented 1 year ago

I've improved logging and messages.

There are always a few lines merged during translation, but it straightens itself out after 2-3 additional subtitles. I think this is the best it can do; it needs a file compare at the end, and those spots can easily be fixed manually.
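The file compare could be something as simple as this (my sketch, assuming both files have been parsed into (timestamp, text) pairs):

    def report_gaps(original, translated):
        # Prints the places where the translation came back empty (a padded
        # block) or is missing entirely, so they can be fixed by hand.
        for i, ((ts, src), (_, dst)) in enumerate(zip(original, translated), start=1):
            if not dst.strip():
                print(f"#{i} {ts}: empty translation, source was: {src!r}")
        if len(translated) < len(original):
            print(f"{len(original) - len(translated)} subtitle(s) missing at the end")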

The quality of the text is pretty good when translating to Hungarian. The mistranslations usually come from missing context: the translation is literally correct, but looking at the video there could be other meanings as well.

ddkwing commented 1 year ago

I have encountered the issue as well... I thought the shift was caused by the sub-sentences... I have an idea but have not tried it yet: if we use LangChain to split the task into two subtasks, one that merges sentences and another in charge of translating, maybe that could solve the issue?
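Sketched with the plain OpenAI client instead of LangChain, the two-step idea would look something like this (prompts and function names are only placeholders, I haven't tested it):

    from openai import OpenAI

    client = OpenAI()

    def _chat(system, user):
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return response.choices[0].message.content

    def merge_then_translate(lines, target_language):
        # Step 1: let the model join the subtitle fragments into full sentences.
        merged = _chat(
            "Join these subtitle fragments into complete sentences, one sentence per line.",
            "\n".join(lines),
        )
        # Step 2: translate the merged sentences, keeping one line per line.
        return _chat(
            f"Translate each line into {target_language}. Keep one output line per input line.",
            merged,
        )

The open question is how to map the merged sentences back onto the original timestamps.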

gkovacsp commented 1 year ago

Might be worth a try; let me know if it works and we can tinker with the script.