voc / voctoweb

voctoweb – the frontend and backend software behind media.ccc.de
GNU General Public License v3.0

Make youtube subtitles community editable #441

dickshaydle closed this issue 4 years ago

dickshaydle commented 4 years ago

The 36c3 videos are transcribed automatically, but this can lead to transcription errors. For example, in Humus Sapiens the speaker talks about "desertification" and the algorithm detects "the certification".

YouTube has quite a nice interface for editing transcripts, even for people other than the channel owner. But this has to be activated before people can make changes. I don't know how much effort a submitted correction would cause for you.

Correcting the subtitles could help deaf people, who cannot otherwise hear the correct words.

saerdnaer commented 4 years ago

We prefer that users use our existing infrastructure at https://c3subtitles.de – you can navigate to the editor via the first link in the description, in this case https://media.ccc.de/v/36c3-11043-humus_sapiens , and then scroll down to the 'subtitle this task' link.

In this specific case: check whether the problem is also in the etherpad https://subtitles.pads.ccc.de/36c3-talk-11043 and fix it there. More information is in the header of the pad.

dickshaydle commented 4 years ago

This seems rather complicated. Amara also seems to be an external service. What is the benefit of that? Some thoughts on that:

  1. This is another service to register for.
  2. There I also don't have the rights to edit anything.
  3. There is no autogenerated text to start from.
  4. There is another place where subtitles are stored, the etherpad (https://subtitles.pads.ccc.de/36c3-talk-11043). This further complicates things. The etherpad seems to contain a much better transcript than the one on YouTube, but it does not end up on YouTube or on media.ccc.de. To be precise, there are no subtitles at all on the media.ccc.de version or on Amara.

saerdnaer commented 4 years ago

Actually, there are subtitles on media.ccc.de – I imported all complete subtitles on January 3rd (#382), and am currently working on a sync script to continuously publish complete subtitles to media.ccc.de (#383). We also plan to extend this to YouTube and overwrite the automatically generated subtitles with the completed ones from @c3subtitles during the next weeks...

An example talk with multiple language subtitles is https://media.ccc.de/v/32c3-7550-opening_event

/cc @percidae