BaGRoS closed this 10 months ago
I'm currently working on the transcription editing feature, but we'll roll it out in multiple steps:
Unfortunately, the translation function can't work the way you proposed out of the box, because Whisper translates directly from the audio data (speech-to-text), not from the transcribed text. I'm not aware of any text-to-text model that translates as many languages as well as Whisper does, so I'd keep the translation process as it is for now. One route to consider is fine-tuning or training Whisper with your own text corrections, but I'm not sure how feasible that is for the average user.
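To illustrate why the input has to be audio, here is a minimal sketch, assuming the openai-whisper Python package. The helper name `translate_audio` and the file name in the comment are made up for illustration; only `load_model` and `transcribe(..., task="translate")` come from the package itself.

```python
def translate_audio(model, audio_path):
    """Return (start, end, text) tuples of translated text for an audio file.

    `model` is assumed to be a loaded Whisper model, for example:
        import whisper
        model = whisper.load_model("base")
        translate_audio(model, "clip.wav")  # hypothetical file name
    """
    # task="translate" asks Whisper to decode translated text straight from
    # the audio. There is no parameter for passing in edited text instead.
    result = model.transcribe(audio_path, task="translate")
    return [
        (seg["start"], seg["end"], seg["text"].strip())
        for seg in result["segments"]
    ]
```

Note that the only input is a path to audio, which is why edits made to the transcribed text never reach the translation step.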
Isn't that saying exactly what I was saying before?
Yes, and text-to-speech should be easier, even with Google's voice or Microsoft's (Windows). Translation from those generated files should be perfect: pure voice, no noise...
Sure! But I don't think there's anything to gain from doing that, since:
Re. 1. Sometimes this way can be quicker.
Re. 2. Yes, but I'm sure that if it can make mistakes in the source language, it will 100% also make mistakes in the translation, so corrections take twice the time.
Re. 3. Could be; Narrator in Windows is an option, but more investigation is needed.
Re. 4. For now, agreed.
Another issue is the fact that Whisper splits the phrase segments based on its internal algorithm, so translating phrases while respecting the timings of the segments would be difficult.
For example:
Elena. how nice to see you. what a wonderful surprise to meet you here. you're looking
wonderful. thank you. you're looking well too. this is my good friend Dr. Heywood Floyd.
As you can see, "you're looking wonderful." was split between two segments, which means that we'd either need to detect the split and merge the phrase before translation, or risk inaccurate translations. And if we do manage to merge the phrase, how would the segment look in the translation?
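A rough sketch of the "detect the split and merge" idea, assuming segments are dicts with `start`, `end`, and `text` keys (the shape Whisper returns). The punctuation heuristic and the function name `merge_split_segments` are my own assumptions, not something the app does today.

```python
SENTENCE_END = (".", "?", "!")

def merge_split_segments(segments):
    """Merge a segment into the previous one when the previous text
    does not end with sentence-final punctuation."""
    merged = []
    for seg in segments:
        text = seg["text"].strip()
        if merged and not merged[-1]["text"].endswith(SENTENCE_END):
            # The previous segment stops mid-phrase: append this text
            # and extend the end time to cover both segments.
            prev = merged[-1]
            prev["text"] = prev["text"] + " " + text
            prev["end"] = seg["end"]
        else:
            merged.append(
                {"start": seg["start"], "end": seg["end"], "text": text}
            )
    return merged
```

With the example above, the first segment ends in "you're looking", so the second one is folded into it. The merged segment then spans both original time ranges, so the original per-segment timings are lost, which is exactly the open question about how the result should look in the translation.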
Of course. I'm thinking of translating the Polish subtitles into English only after editing in the original language is complete, that is, once everything has been put together into decent Polish subtitles.
Hi
Once speech has been transcribed into text, it should be possible to edit this text directly in the window where it is displayed. There should then be two buttons at the bottom of the window: SAVE and TRANSLATE. Pressing the first saves the edited subtitles; pressing the second has Whisper translate them and opens the result in the window for further editing, again with a SAVE button. Saving could also be automatic, but that would cause unnecessary disk writes, particularly on SSDs.
BaGRoS
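As a sketch of what the SAVE step could look like, assuming edited segments are dicts with `start`, `end` (in seconds), and `text` keys and the output is a standard SRT file. All function names here are hypothetical. The write is skipped when nothing changed, which would ease the concern about automatic saving and disk writes.

```python
from pathlib import Path

def format_timestamp(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render segments as SRT text: index, time range, text, blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = (
            f"{format_timestamp(seg['start'])} --> "
            f"{format_timestamp(seg['end'])}"
        )
        blocks.append(f"{index}\n{time_range}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)

def save_if_changed(path, srt_text):
    """Write the file only if its content differs; return True if written."""
    path = Path(path)
    if path.exists() and path.read_text(encoding="utf-8") == srt_text:
        return False  # nothing changed, so skip the disk write
    path.write_text(srt_text, encoding="utf-8")
    return True
```

An autosave timer could call `save_if_changed` as often as it likes; the file is only rewritten after an actual edit.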