Syncing impossible due to merging of lines across timestamps

bwvanlith commented 7 months ago

I've tried several different srt files. No errors, but lines from different timestamps get merged. Due to this I find it's impossible to correctly sync the subtitles to the movie.

The resulting srt file has fewer lines (~300 on 1500 lines), so it can't be synced correctly.
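One quick way to confirm that cues are being merged is to count the timing lines in the input and output files. The following is a minimal sketch, not code from srt-ai: `countCues` and the sample strings are hypothetical, and the regex assumes standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines.

```typescript
// Hypothetical helper (not part of srt-ai): counts cues by their timing lines.
// Assumes standard SRT timing lines such as "00:00:01,000 --> 00:00:02,000".
const TIMING_LINE =
  /^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}/;

function countCues(srt: string): number {
  return srt
    .split(/\r?\n/)
    .filter((line) => TIMING_LINE.test(line.trim())).length;
}

// Illustrative data only: two source cues, merged into one in the output.
const originalSrt = [
  "1",
  "00:00:01,000 --> 00:00:02,000",
  "Hello",
  "",
  "2",
  "00:00:02,500 --> 00:00:04,000",
  "World",
  "",
].join("\n");

const translatedSrt = [
  "1",
  "00:00:01,000 --> 00:00:02,000",
  "Hallo wereld",
  "",
].join("\n");

console.log(countCues(originalSrt), countCues(translatedSrt)); // 2 1
```

If the two counts differ, the translated file has lost or gained cues and its timestamps no longer line up with the original.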

The first job doesn't return an error, but the next one gives: Error: failed to pipe response

[cause]: TypeError: Cannot read properties of undefined (reading 'timestamp')
    at Object.start (webpack-internal:///(rsc)/./app/api/route.ts:70:59)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

GGMaia commented 7 months ago

Same problem. The text is translated correctly, but the order is wrong, and sometimes I get this error and the file is not downloaded:

Error: failed to pipe response
    at pipeToNodeResponse (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\pipe-readable.js:111:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async sendResponse (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\send-response.js:40:13)
    at async doRender (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:1360:25)
    at async cacheEntry.responseCache.get.routeKind (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:1552:28)
    at async DevServer.renderToResponseWithComponentsImpl (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:1460:28)
    at async DevServer.renderPageComponent (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:1843:24)
    at async DevServer.renderToResponseImpl (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:1881:32)
    at async DevServer.pipeImpl (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:909:25)
    at async NextNodeServer.handleCatchallRenderRequest (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\next-server.js:266:17)
    at async DevServer.handleRequestImpl (C:\Users\six_w\Documents\srt-ai-main\node_modules\next\dist\server\base-server.js:805:17) {
  [cause]: TypeError: Cannot read properties of undefined (reading 'timestamp')
      at Object.start (webpack-internal:///(rsc)/./app/api/route.ts:70:59)
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
}

yazinsai commented 7 months ago

Thanks for highlighting this issue!

I dug into it, and it appears to be an issue with gpt-3.5-turbo-1106. It would sporadically merge two segments into one, or split a single segment into two. I was able to resolve this by using gpt-4, but at wayyyy slower translation times.

It doesn't fail predictably - the same .srt file would fail at different points in different runs, so I'm almost certain this is a GPT issue and not parsing/etc.
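Since the model can return more or fewer segments than it was given, one possible safeguard is to check the count before re-attaching translated text to the original timestamps. This is only a sketch under assumptions: the `Segment` shape, `pairTranslations`, and the sample data are hypothetical and are not taken from the srt-ai source.

```typescript
// Hypothetical types and helper (not the actual srt-ai code).
interface Segment {
  id: number;
  timestamp: string;
  text: string;
}

type PairResult =
  | { ok: true; segments: Segment[] }
  | { ok: false; expected: number; received: number };

// Re-attach translated text to the original segments, but only when the
// model returned exactly one translation per source segment.
function pairTranslations(source: Segment[], translated: string[]): PairResult {
  if (translated.length !== source.length) {
    // The model merged or split segments: report it instead of guessing.
    return { ok: false, expected: source.length, received: translated.length };
  }
  const segments = source.map((seg, i) => ({
    ...seg,
    text: (translated[i] ?? seg.text).trim(),
  }));
  return { ok: true, segments };
}

// Illustrative data only.
const batch: Segment[] = [
  { id: 1, timestamp: "00:00:01,000 --> 00:00:02,000", text: "Hello" },
  { id: 2, timestamp: "00:00:02,500 --> 00:00:04,000", text: "World" },
];

const merged = pairTranslations(batch, ["Hallo wereld"]);
if (!merged.ok) {
  console.log(`expected ${merged.expected} segments, got ${merged.received}`);
}
```

On a mismatch the caller could retry the batch or fall back to translating one segment per request. Either way, the original timestamps are never indexed with a position the model invented, which may be the kind of lookup behind the `reading 'timestamp'` error above.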

I'll update the code to use gpt-4-turbo-0125, the latest model as of today, in order to fix this issue - but just know that it'll be way slower than 3.5.

bwvanlith commented 7 months ago

Thanks for your time and hard work! Really cool :)

yazinsai / srt-ai

Syncing impossible due to merging of lines across timestamps #20