wet-boew / wet-boew

Web Experience Toolkit (WET): Open source code library for building innovative websites that are accessible, usable, interoperable, mobile-friendly and multilingual. This collaborative open source project is led by the Government of Canada.
https://wet-boew.github.io/wet-boew/index-en.html

Multimedia Player: Converting HTML captions to YouTube SRT #6286

Closed SlaneyChris closed 9 years ago

SlaneyChris commented 10 years ago

My question is:

I have several videos that currently play in the WET player with inline captioning. These captions work well. Now I would like to upload the same videos to YouTube with the same captions. Is there an easy way to convert the HTML5 captions to a caption file format that YouTube supports (.srt, etc.)? I have a lot of videos, so I would prefer not to rewrite all the captions.

Here's an example of the code I am referring to: (i.e. dialogue dialogue blah blah blah)

Thanks

LaurentGoderre commented 10 years ago

I don't think such a thing exists, but it should be fairly easy to write a converter using Node.js.

SlaneyChris commented 10 years ago

Thanks for responding and adding tags! This is not my area of expertise so I don't think I would be capable of writing a converter. That advice helps though, thank you.

LaurentGoderre commented 10 years ago

Better yet, maybe I can create a script that extracts the captions from the player and outputs the SRT.

LaurentGoderre commented 10 years ago

@Slaney, in Firefox, navigate to the page with the video and press Shift-F4 to bring up the Scratchpad. Copy in the following script, substituting VIDEO_ID with the ID of your video. Press the Display button at the top of the Scratchpad and it will output the SRT for that video.

// Formats a time in seconds as HH:MM:SS (fractions of a second are dropped)
var formatTime = function( time ) {
  var index = 2,
      timecode = "",
      secondsIn, current, pad;

  pad = function( number, digits ) {
    return new Array( Math.max( digits - String( number ).length + 1, 0 ) ).join( 0 ) + number;
  };

  time = Math.floor( time );

  //Loop to extract hours, minutes and seconds
  while ( index >= 0 ) {
    //Get the number of seconds for the current iteration (hour, minute or second)
    secondsIn = Math.pow( 60, index );
    current = Math.floor( time / secondsIn );

    if ( timecode !== "" ) {
      timecode += ":";
    }

    timecode += pad( current, 2 );
    time -= secondsIn * current;
    index -= 1;
  }
  return timecode;
};

// Builds SRT text from the captions the WET player stores on the element
// with the given ID
function captionsToSRT( videoID ) {
  var captions = $.data( document.getElementById( videoID ), "captions" ),
      captionsLength = captions.length,
      srt = "",
      c, caption;

  for ( c = 0; c < captionsLength; c++ ) {
    caption = captions[ c ];

    // Cue number (1-based), time range, then the caption text
    srt += ( c + 1 ) + "\n" + formatTime( caption.begin ) + " --> " + formatTime( caption.end ) + "\n" + caption.text + "\n\n";
  }

  return srt;
}

captionsToSRT( "VIDEO_ID" );

SlaneyChris commented 10 years ago

Wow thanks so much for this! I'm going to test it first thing tomorrow morning.

SlaneyChris commented 10 years ago

So I'm not quite sure that I followed your instructions properly. I am testing it on this video:

http://www.statcan.gc.ca/eng/sc/video/cpi

Where should the file be output?

LaurentGoderre commented 10 years ago

If you run it from the Scratchpad and click the Display button, the output should appear under the script.

LaurentGoderre commented 9 years ago

@SlaneyChris did this work for you?

SlaneyChris commented 9 years ago

I had a friend look at it and he gave me a version of the code with the Video ID included etc. That version is here: http://pastebin.com/X89ibKmj

Yet it still gives me "undefined" when I enter it into the scratchpad and click "display." Maybe I'm missing something very basic...

LaurentGoderre commented 9 years ago

Remove the -media from the ID. It's not the ID of the video tag you need, but the ID of the element with the wb-mltmd class on it.
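If it is unclear which ID that is, something like the following could be run in the Scratchpad to list candidates. This is only a sketch: `listPlayerIds` is a hypothetical helper name, and it assumes the player containers on the page carry the wb-mltmd class as described above.

```javascript
// Hypothetical helper: lists the IDs of every element with the wb-mltmd class.
// `root` is any object that provides querySelectorAll (normally `document`).
function listPlayerIds( root ) {
  var players = root.querySelectorAll( ".wb-mltmd" ),
      ids = [],
      i;

  for ( i = 0; i < players.length; i++ ) {
    // Flag containers that have no ID so they are not silently skipped
    ids.push( players[ i ].id || "(no id)" );
  }

  return ids;
}

// In the Scratchpad, end the script with:
//   listPlayerIds( document ).join( "\n" );
```

Any ID it prints is a candidate for the argument passed to captionsToSRT.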

LaurentGoderre commented 9 years ago

Also, console.log doesn't work in the Scratchpad, so remove that as well. The new code would be:

// Formats a time in seconds as HH:MM:SS (fractions of a second are dropped)
var formatTime = function( time ) {
  var index = 2,
      timecode = "",
      secondsIn, current, pad;

  pad = function( number, digits ) {
    return new Array( Math.max( digits - String( number ).length + 1, 0 ) ).join( 0 ) + number;
  };

  time = Math.floor( time );

  //Loop to extract hours, minutes and seconds
  while ( index >= 0 ) {
    //Get the number of seconds for the current iteration (hour, minute or second)
    secondsIn = Math.pow( 60, index );
    current = Math.floor( time / secondsIn );

    if ( timecode !== "" ) {
      timecode += ":";
    }

    timecode += pad( current, 2 );
    time -= secondsIn * current;
    index -= 1;
  }
  return timecode;
};

// Builds SRT text from the captions the WET player stores on the element
// with the given ID
function captionsToSRT( videoID ) {
  var captions = $.data( document.getElementById( videoID ), "captions" ),
      captionsLength = captions.length,
      srt = "",
      c, caption;

  for ( c = 0; c < captionsLength; c++ ) {
    caption = captions[ c ];

    // Cue number (1-based), time range, then the caption text
    srt += ( c + 1 ) + "\n" + formatTime( caption.begin ) + " --> " + formatTime( caption.end ) + "\n" + caption.text + "\n\n";
  }

  return srt;
}

captionsToSRT( "wet-boew-mediaplayer0" );

SlaneyChris commented 9 years ago

Now I'm getting this - not sure what's going wrong.

/*
Exception: $ is undefined
captionsToSRT@Scratchpad/5:30
@Scratchpad/5:46
*/

Thanks for your help again

LaurentGoderre commented 9 years ago

Is jQuery loading on your page?
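One way to check this from the Scratchpad, along with the other failure points seen so far, might look like the sketch below. `diagnosePlayer` is a hypothetical name, and it assumes the captions sit under the "captions" jQuery data key, as the script above reads them.

```javascript
// Hypothetical helper: reports the first reason captionsToSRT could fail.
// `win` is the page's window object; `playerId` is the ID of the element
// with the wb-mltmd class.
function diagnosePlayer( win, playerId ) {
  var element, captions;

  // "$ is undefined" means jQuery is not available in this scope
  if ( typeof win.jQuery !== "function" ) {
    return "jQuery is not loaded on this page";
  }

  element = win.document.getElementById( playerId );
  if ( !element ) {
    return "No element found with ID " + playerId;
  }

  captions = win.jQuery.data( element, "captions" );
  if ( !captions ) {
    return "No captions are stored on element " + playerId;
  }

  return "OK: " + captions.length + " captions found";
}

// In the Scratchpad, end the script with:
//   diagnosePlayer( window, "wet-boew-mediaplayer0" );
```

If jQuery is present but `$` is not, replacing `$.data` with `jQuery.data` in the script may be enough.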

LaurentGoderre commented 9 years ago

Any update on this?

hsrudnicki commented 9 years ago

@SlaneyChris @LaurentGoderre : Are you ready to close this issue?

LaurentGoderre commented 9 years ago

Closing due to inactivity.

crochefort commented 7 years ago

@LaurentGoderre ... it seems that I have been able to extract my timing ... will try to build the YouTube CC today ...

Might be good to put this script in the Doc ... to help people ...
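For anyone building the caption file from extracted timing: SRT timestamps are normally written as HH:MM:SS,mmm, while the formatTime function above floors to whole seconds. The following is a rough sketch of a millisecond-aware variant, not the player's own code. `formatSrtTime` and `cuesToSrt` are hypothetical names, and the cue shape (`begin` and `end` in seconds, `text` as plain text) is assumed from the fields the script above reads.

```javascript
// Hypothetical helper: formats a time in seconds as an SRT timestamp (HH:MM:SS,mmm).
function formatSrtTime( seconds ) {
  var totalMs = Math.round( seconds * 1000 ),
      ms = totalMs % 1000,
      totalSeconds = Math.floor( totalMs / 1000 ),
      s = totalSeconds % 60,
      m = Math.floor( totalSeconds / 60 ) % 60,
      h = Math.floor( totalSeconds / 3600 );

  // Left-pads a number with zeros to the given width
  function pad( number, width ) {
    var str = String( number );
    while ( str.length < width ) {
      str = "0" + str;
    }
    return str;
  }

  return pad( h, 2 ) + ":" + pad( m, 2 ) + ":" + pad( s, 2 ) + "," + pad( ms, 3 );
}

// Hypothetical helper: builds SRT text from an array of { begin, end, text } cues.
function cuesToSrt( cues ) {
  return cues.map( function( cue, i ) {
    // Cue number (1-based), time range, then the caption text
    return ( i + 1 ) + "\n" +
      formatSrtTime( cue.begin ) + " --> " + formatSrtTime( cue.end ) + "\n" +
      cue.text + "\n";
  } ).join( "\n" );
}
```

For example, `formatSrtTime( 3661.25 )` gives `01:01:01,250`. If the caption text contains HTML markup, it would likely need to be stripped before upload.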