mantas-done / subtitles

Subtitle/caption converter
https://gotranscript.com/subtitle-converter
MIT License
142 stars, 49 forks

Problem with changing <v speaker> to speaker: #61

Open vnali opened 1 year ago

vnali commented 1 year ago

Hi. First of all, thank you for this very useful class. I am trying to pass a VTT string containing <v speaker> to this class, add a new line to it, and get VTT format back. The problem is that vttConverter changes <v speaker> to speaker:. In this case it should not, because I want VTT format again.

I also have a related suggestion: instead of rewriting the speaker and adding it to the body in the fileContentToInternalFormat function, maybe we should keep the parsed speaker for each line under a new key in $internal_format, like this:

    (
        [start] => 137.44
        [end] => 140.375
        [lines] => Array
            (
                [0] => Senator, we're making
                [1] => our final approach
            )
        [speakers] => Array
            (
                [0] => speaker1
                [1] => speaker2 // or maybe no speaker
            )
    )

and decide later, in the internalFormatToFileContent function, how the speaker values should appear in the body lines, based on the destination format. So in vttConverter we prepend it as <v speaker1> to line[0], and in srtConverter we prepend it as speaker1:

Another good reason for this is that some newer formats, such as HTML or JSON subtitles, have a separate tag for the speaker value and do not include the speaker in the body lines at all.
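
A minimal sketch of the idea above, assuming the proposed speakers key exists in each internal-format block. The function renderLinesWithSpeakers is hypothetical and not part of the library, and the exact output shapes (a bare <v speaker> prefix for VTT, "speaker: " with a trailing space for other formats) are assumptions made for illustration:

```php
<?php
// Hypothetical helper, not part of the library: renders the lines of one
// internal-format block for a destination format, using the proposed
// 'speakers' key (same indexes as 'lines').
function renderLinesWithSpeakers(array $block, string $format): array
{
    $rendered = [];
    foreach ($block['lines'] as $i => $line) {
        $speaker = $block['speakers'][$i] ?? null; // null or '' means no speaker
        if ($speaker === null || $speaker === '') {
            $rendered[] = $line;
        } elseif ($format === 'vtt') {
            // Assumption: VTT output prepends a voice tag to the line.
            $rendered[] = '<v ' . $speaker . '>' . $line;
        } else {
            // Assumption: other formats use a plain "speaker: " prefix.
            $rendered[] = $speaker . ': ' . $line;
        }
    }
    return $rendered;
}

$block = [
    'start' => 137.44,
    'end' => 140.375,
    'lines' => ["Senator, we're making", 'our final approach'],
    'speakers' => ['speaker1', null],
];

$vttLines = renderLinesWithSpeakers($block, 'vtt');
$srtLines = renderLinesWithSpeakers($block, 'srt');
```

Keeping the speaker out of the stored text means each converter can decide on its own representation, and formats with a dedicated speaker field could read the speakers array directly.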

mantas-done commented 1 year ago

Hi, thank you for the suggestions. At this point, I am trying to just support conversion between different formats without styles. And with limited time it would be too much work to add speaker support to all the converters and to properly test them.

vnali commented 1 year ago

Thank you @mantas-done. As a workaround for the issue, I used getInternalFormat() and setInternalFormat() like this to convert v: to <v></v> again:

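        $vtt = Subtitles::loadFromString($vttString, 'vtt');
        $internalFormat = $vtt->getInternalFormat();
        // start from a copy so that $newInternalFormat is defined
        $newInternalFormat = $internalFormat;
        // loop through lines and convert `v:` to `<v></v>` again
        foreach ($internalFormat as $index => $format) {
            foreach ($format['lines'] as $key => $line) {
                // .... (write the converted line to $newInternalFormat[$index]['lines'][$key])
            }
        }
        $vtt->setInternalFormat($newInternalFormat);
        $vtt->add(....);
        $newVtt = $vtt->content('vtt');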
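
The loop body is elided in the comment above. One possible shape for it, as a sketch only: the helpers restoreVoiceTag and restoreVoiceTags are hypothetical, and the regular expression assumes that a speaker prefix is everything before the first ": " on a line. Any ordinary line that happens to contain ": " (for example "Note: be careful") would also be rewritten, so this is not safe for arbitrary input.

```php
<?php
// Hypothetical helper: turns "speaker: text" back into "<v speaker>text</v>".
// Lines without a "name: " prefix are returned unchanged.
function restoreVoiceTag(string $line): string
{
    if (preg_match('/^([^:<>]+): (.+)$/', $line, $m) === 1) {
        return '<v ' . $m[1] . '>' . $m[2] . '</v>';
    }
    return $line;
}

// Hypothetical helper: applies restoreVoiceTag() to every line of an array
// shaped like the internal format (a list of blocks with a 'lines' array).
function restoreVoiceTags(array $internalFormat): array
{
    foreach ($internalFormat as &$block) {
        foreach ($block['lines'] as $key => $line) {
            $block['lines'][$key] = restoreVoiceTag($line);
        }
    }
    unset($block); // break the reference left over from the loop
    return $internalFormat;
}

$internalFormat = [
    ['start' => 137.44, 'end' => 140.375, 'lines' => ['speaker1: Senator', 'our final approach']],
];
$newInternalFormat = restoreVoiceTags($internalFormat);
```

The result could then be passed to setInternalFormat() as in the snippet above.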

mantas-done commented 1 year ago

Looking again at your issue. Your suggestion is actually very good. If you would be willing to implement storage of speakers in the internal_format, I would merge it.

vnali commented 1 year ago

@mantas-done sure, I will take a look at the code to see if I can do it. I know how speaker appears on the srt and vtt subtitles but is there any reference about other formats?

mantas-done commented 1 year ago

If you want, you can implement it just for the VTT format at first. I would suggest adding two additional array elements: speakers (as in your example) and lines_without_speaker. The VTT converter will be able to use those additional elements, while other formats keep using what they currently use. Other formats can be updated later.
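
A rough sketch of how the two suggested elements might be filled while parsing VTT cue lines. The function names splitVttSpeaker and extractSpeakers are hypothetical and not taken from the library, and the pattern only handles a single voice tag at the start of a line (optionally with a class such as <v.loud Name>):

```php
<?php
// Hypothetical helper: splits one raw VTT cue line into speaker and text.
// Returns speaker = null when the line does not start with a voice tag.
function splitVttSpeaker(string $line): array
{
    if (preg_match('/^<v(?:\.[^\s>]+)?\s+([^>]+)>(.*?)(?:<\/v>)?$/s', $line, $m) === 1) {
        return ['speaker' => trim($m[1]), 'text' => $m[2]];
    }
    return ['speaker' => null, 'text' => $line];
}

// Hypothetical helper: builds the two proposed arrays from raw cue lines.
function extractSpeakers(array $rawLines): array
{
    $speakers = [];
    $linesWithoutSpeaker = [];
    foreach ($rawLines as $line) {
        $parts = splitVttSpeaker($line);
        $speakers[] = $parts['speaker'];
        $linesWithoutSpeaker[] = $parts['text'];
    }
    return [
        'speakers' => $speakers,
        'lines_without_speaker' => $linesWithoutSpeaker,
    ];
}

$extra = extractSpeakers(["<v speaker1>Senator, we're making", 'our final approach']);
```

The returned arrays could be merged into a block next to the existing lines element, so converters that are not aware of speakers keep working unchanged.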
