polimediaupv / paella

Paella standalone html 5 multistream player
http://paellaplayer.upv.es/
Educational Community License v2.0

How to add (multi lingual) subtitles in Paella Player #437

Closed herwig-rehatschek closed 5 years ago

herwig-rehatschek commented 5 years ago

Hello,

we are using your excellent player - THANKS A LOT! - in connection with Opencast. We use Paella Player version 6.1.5. On your website I saw that Paella Player is capable of showing subtitles, even in multiple languages (https://paellaplayer.upv.es/demos/multi-language-captions/). So my questions:

1) which format has to be used to create subtitles? I guess it's similar to YouTube format? Do you have maybe an example as well?

2) how can I technically add them to our system - is there any documentation available on this issue? i.e. where to put the files so the player can read them, how to configure the player so you can select it from the user interface

Thank you in advance for any hints in this connection, kind regards from a bright and sunny Austria to sunny Barcelona, Herwig

karendolan commented 5 years ago

Hi @herwig-rehatschek

The following are my thoughts as an Opencast & Paella user, sparked by your post. They may not be very helpful and I hope you get more easy-to-digest comments from the community.

These are all from the Opencast side. I believe that once the captions are separated into languages and passed to the core Paella, Paella is able to show the different languages. But, our site only uses one caption language.

Best Regards from Boston, Karen

  1. The Opencast plugin that converts an Opencast mediapackage into data for the core Paella player lives in the Opencast GitHub repository. Currently the Opencast-to-Paella data converter expects a separate caption catalog/attachment for each language. The language currently needs to be defined outside of the file, in the Opencast mediapackage, either in the attachment's "tag" attribute or in the attachment's "flavor" attribute.

https://github.com/opencast/opencast/blob/develop/modules/engage-paella-player/src/main/paella-opencast/plugins/es.upv.paella.opencast.loader/03_oc_search_converter.js#L256-L281

Tag style: "lang:<language code>", e.g. "lang:en"
Flavor style: "captions/<extension type>+<language code>", e.g. "captions/dfxp+en"
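
To illustrate the two conventions above, here is a minimal sketch of how a caption's format and language could be derived from an attachment. The helper name `captionInfo` is hypothetical, and it assumes the attachment exposes its flavor as a `type` string and its tags as a plain array of strings in `tags`; the real mediapackage JSON may be shaped differently.

```javascript
// Hypothetical helper, not part of Opencast or Paella.
// The regex mirrors the flavor convention "captions/<format>+<lang>".
const CAPTIONS_FLAVOR = /^captions\/([^+]+)(\+(.+))?$/;

function captionInfo(attachment) {
  const match = CAPTIONS_FLAVOR.exec(attachment.type || "");
  if (!match) return null; // not a caption attachment

  const format = match[1];
  let lang = match[3]; // undefined when the flavor has no "+<lang>" suffix

  // Fall back to a "lang:<code>" tag (assumes `tags` is an array of strings).
  if (!lang) {
    const tag = (attachment.tags || []).find((t) => t.startsWith("lang:"));
    if (tag) lang = tag.slice("lang:".length);
  }
  return { format, lang: lang || null };
}

console.log(captionInfo({ type: "captions/dfxp+en" }));
console.log(captionInfo({ type: "captions/vtt", tags: ["lang:de"] }));
```

The flavor suffix is checked first and the tag is only used as a fallback; that ordering is a choice made for this sketch, not something the thread prescribes.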

  1. A quick and ugly way to do this in Opencast, if you don't have too many languages, is to add extra rows to the event upload config: https://docs.opencast.org/develop/admin/configuration/admin-ui/asset-upload/#how-to-create-a-new-asset-option

     Example of adding 3 new upload options rows in the config for lang en, gb, xy:

         EVENTS.EVENTS.NEW.UPLOAD_ASSET.OPTION.CAPTIONS_WEBVTT_EN={"id":"attachment_captions_webvtt", "type": "attachment", "flavorType": "text", "flavorSubType": "webvtt+en", "displayOrder": 3}
         EVENTS.EVENTS.NEW.UPLOAD_ASSET.OPTION.CAPTIONS_WEBVTT_GB={"id":"attachment_captions_webvtt", "type": "attachment", "flavorType": "text", "flavorSubType": "webvtt+gb", "displayOrder": 3}
         EVENTS.EVENTS.NEW.UPLOAD_ASSET.OPTION.CAPTIONS_WEBVTT_XY={"id":"attachment_captions_webvtt", "type": "attachment", "flavorType": "text", "flavorSubType": "webvtt+xy", "displayOrder": 3}

     Note that new tags like "EVENTS.EVENTS.NEW.UPLOAD_ASSET.OPTION.CAPTIONS_WEBVTT_EN" will not automatically exist in the Admin UI translation tables.

  2. A long and pretty solution is to develop a utility that parses the caption file and extracts the language, or the multiple language translations, contained within that file. The current 03_oc_search_converter.js, linked above, contains a commented "TODO" to parse the different language captions from a single caption file. This task requires a DFXP or WebVTT parser to extract the individual language captions from the single file. As mentioned above, the caption is currently tagged as DFXP with a single language first, and the file is only retrieved and parsed later. Extracting languages from the retrieved catalogs would require reversing part of that process.

    attachments.forEach((currentAttachment) => {
      try {
        // Flavor convention: "captions/<format>" with an optional "+<lang>" suffix
        let captions_regex = /^captions\/([^+]+)(\+(.+))?/g;
        let captions_match = captions_regex.exec(currentAttachment.type);

        if (captions_match) {
          let captions_format = captions_match[1];
          let captions_lang = captions_match[3];

          // TODO: read the lang from the dfxp file
          // ... (excerpt ends here)

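
As a rough idea of what the "read the lang from the dfxp file" TODO could involve, the sketch below reads the `xml:lang` attribute from the root `<tt>` element of a DFXP/TTML document. The function name `dfxpRootLanguage` is hypothetical and this is not code from Opencast or Paella. It only looks at the root element, so it does not cover a single file that carries several languages on nested elements, and the regex is used only to keep the example dependency-free; a real XML parser would be more robust.

```javascript
// Hypothetical helper: returns the xml:lang value declared on the root <tt>
// element of a DFXP/TTML document, or null if none is declared.
function dfxpRootLanguage(dfxpText) {
  // Find the opening <tt ...> tag, allowing an optional namespace prefix.
  const rootTag = /<(?:\w+:)?tt\b[^>]*>/.exec(dfxpText);
  if (!rootTag) return null;

  // Extract the quoted xml:lang attribute value from that tag.
  const lang = /\bxml:lang\s*=\s*(["'])(.*?)\1/.exec(rootTag[0]);
  return lang && lang[2] ? lang[2] : null;
}

const sample =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml"><body/></tt>';
console.log(dfxpRootLanguage(sample));
```
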
turro commented 5 years ago

Hi @herwig-rehatschek

Yes, strictly on the Paella side (outside of Opencast) you have support for captions in multiple languages. I believe it supports the .dfxp, .vtt (WebVTT) and .srt formats.

The configuration for that goes in the "captions" section of the data.json file, something like:

"captions": [
        {
            "lang": "es",
            "text": "Español (traducción automática)",
            "format": "dfxp",
            "url": "captions/es.dfxp"
        },
        {
            "lang": "en",
            "text": "English (automatic transcription)",
            "format": "dfxp",
            "url": "captions/en.dfxp"
        },
        {
            "lang": "ca",
            "text": "Valencià/Català (traducció automàtica)",
            "format": "dfxp",
            "url": "captions/ca.dfxp"
        }
    ]
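
If you have many languages, entries in the shape shown above could be generated rather than written by hand. The following is only a sketch: `buildCaptions`, the `baseUrl` default and the `<lang>.<format>` file naming are assumptions made for illustration, and the "vtt" format value is assumed to follow the file extension.

```javascript
// Hypothetical helper: builds "captions" entries for data.json from a list of
// tracks. The "captions/<lang>.<format>" URL layout is an assumption.
function buildCaptions(tracks, baseUrl = "captions/") {
  return tracks.map(({ lang, text, format = "dfxp" }) => ({
    lang,
    text,
    format,
    url: `${baseUrl}${lang}.${format}`,
  }));
}

const captions = buildCaptions([
  { lang: "es", text: "Español" },
  { lang: "en", text: "English", format: "vtt" },
]);

// Print the fragment that would go into data.json.
console.log(JSON.stringify({ captions }, null, 2));
```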

Regarding Paella & Opencast, it is like Karen said. The Opencast plugin that converts an Opencast mediapackage for the core Paella player takes care of it.

Also, let me point out that we have an open pull request for OC7 to support multiple audio tracks (several languages): MH-13704: Multiple audio tracks support on paella. I don't know if this could change the way we convert the captions from Opencast. Maybe @miesgre could say more on this.

Thanks a lot for your support!

Carlos

miesgre commented 5 years ago

Thanks @karendolan and @turro for your great explanations.

If you are using Opencast, the best way is to add multiple files (one per language) to your mediapackage as @karendolan explained.

@turro, the pull request MH-13704: Multiple audio tracks support on paella does not change how captions are loaded.

herwig-rehatschek commented 5 years ago

Dear Karen, dear Turro, dear miesgre,

thank you very much for your fast help and sorry for my late reply ... I am traveling a lot currently and have only sporadic access to the Internet.

So I understand from your responses that the best way to add captions is to utilize OpenCast and add one caption file per language by using the OpenCast media package.

The format is dfxp (which says nothing to me but I will Google ;-) ).

And if I use the OpenCast media package it will convert the files in a way that Paella Player could read them and the user can finally select the captions to be shown.

I will pass this on to my system administrator who hopefully will understand it :-) In case of further questions I will come back to you :-)

Thanks a lot and best wishes from (currently) sunny Bangkok, Herwig