3b1b / captions

transcripts and captions for 3blue1brown videos

Voice dub length issue #212

Closed FaboBence closed 7 months ago

FaboBence commented 7 months ago

I took a look at the newly created translate.3blue1brown.com page to check my previous translations. The site highlights a sentence in red when "its estimated to be too long to fit within the time constraint".

Hungarian words are usually longer than English words, so translating is already hard enough given the length requirements, but I've found many cases where my translated Hungarian sentences have fewer characters than the corresponding English ones and still get highlighted in red. This puts the completion rate at only 25%, even though more than 90% of the translations are not significantly longer.

Given the current constraints, I don't see a way to reduce the length to sub-English sizes and still sound natural and precise enough. I suggest loosening the time constraints to +10% or so, as the video contains a lot of spare time between sentences which could be used up by the longer narration.

3b1b commented 7 months ago

Good to know, I can look at that.

The time estimates were made based on measuring the average number of seconds per character for each language running our text-to-speech tool, and then a little buffer was given to allow for some flexibility. It sounds like at least for Hungarian, it's either not as flexible as it should be, or getting the dubbing to fit into the allotted time will be a genuine challenge. We can run some tests to see for sure.

For now, I'll artificially tweak things to make it less aggressive for Hungarian.
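The estimate described above (an average number of seconds per character for each language, plus a small buffer) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `fits_time_slot`, the per-language rates, and the 10% buffer are hypothetical, not the values or code used by the actual site.

```python
# Hypothetical per-language speaking rates in seconds per character.
# These numbers are invented for illustration, not measured values.
SECONDS_PER_CHAR = {"english": 0.060, "hungarian": 0.065}


def fits_time_slot(text: str, language: str, slot_seconds: float,
                   buffer: float = 0.10) -> bool:
    """Return True if the estimated speech duration fits the slot plus buffer."""
    estimated = len(text) * SECONDS_PER_CHAR[language]
    return estimated <= slot_seconds * (1.0 + buffer)
```

Under these assumed rates, a 90-character Hungarian translation is flagged for a 5-second slot even if the English original had 100 characters, because the check compares against the slot length and a per-language average, not against the English character count.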

FaboBence commented 7 months ago

Thank you, it made things better for the most part, but it's still not perfect. I'm still finding cases where translations with fewer characters are highlighted, as in the example shown in the attached image. I think the timestamps cause the problem, because when I play the video for the current box (blue play button), the page automatically stops before the end of the English sentence.

image

3b1b commented 7 months ago

It's not clear to me that comparing with the English character count is the right measure since the number of characters for each second of spoken audio will differ from one language to the next. To take an extreme example, if we were translating to Chinese, then a translation could easily be way too long even if the character count was lower.

As to the timestamps, I believe they are correct, but I have noticed an occasional bug in the way they interact with the player, causing it to stop short of the specified end timestamp.

Still, your point is well taken that we should do some careful tests on Hungarian to be sure that the time estimates are reasonable.

FaboBence commented 7 months ago

Yes, I understand. The original talking speed varies, but the website uses a static average char/second check, so sometimes +10% Hungarian characters are OK and sometimes only -10%. But then do we really have to tweak it until it fits, or will the AI be able to speed up? And will you only use translations that have reached a 100% completion rating, or can we somehow signal to you that we think it's ready despite not reaching this level? I'm just curious about the whole AI voiceover workflow.

3b1b commented 7 months ago

We'll do some tests to check. We can certainly speed up the audio automatically; the worry is whether it will sound unnaturally rushed when we do.

The full workflow is still being worked out, but for now, I'll note your comments here and have folks try out the AI voice-over on these videos, and send it along to you.
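On the speed-up question above, one way to reason about "unnaturally rushed" is to compute the playback-rate factor a dubbed clip would need in order to fit its slot. The sketch below is an assumption-laden illustration: the function names and the 1.15 tolerance are hypothetical and not part of any described workflow.

```python
def required_speedup(audio_seconds: float, slot_seconds: float) -> float:
    """Playback-rate factor needed for the audio to fit the slot (1.0 = no change)."""
    if audio_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    # Audio that already fits is left at normal speed rather than slowed down.
    return max(1.0, audio_seconds / slot_seconds)


def sounds_rushed(audio_seconds: float, slot_seconds: float,
                  max_factor: float = 1.15) -> bool:
    """True if fitting the slot needs more speed-up than the assumed tolerance."""
    return required_speedup(audio_seconds, slot_seconds) > max_factor
```

For example, 6 seconds of generated audio in a 5-second slot would need a factor of about 1.2, which exceeds the assumed tolerance; whether such a factor actually sounds rushed is exactly what the tests mentioned above would have to establish.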

FaboBence commented 7 months ago

Thanks! Happy to help. Like all your videos, it's for a noble cause.