McFateM closed this issue 7 years ago.
Additional note: this behavior, producing DSIDs that contain the name of the OH interviewee, only started in about the last week, after I pulled the 7.x branch late last week.
@McFateM We've had use cases that called for multiple WebVTT files in different languages. As an interim solution, an underscore and the language code are appended to the WebVTT filenames. (For example, see pull request https://github.com/digitalutsc/islandora_solution_pack_oralhistories/pull/89.) When no language code is given, English (en) is assumed (the default configuration value of the "Specify VTT default language code(s)" text area in the oral histories configuration).
When viewing the object reported above, do you see "bogie" as an option when you click the CC button while playing the oral history object (video or audio)? If so, when you select it, do you see the transcript being played?
Yes, “bogie” does appear under the CC selection options and if I select it the captions appear. I see that when a MEDIATRACK datastream is present I get an “en” option under CC and that is apparently the default because captions appear without me having to take any action.
So is there a simple fix for this? I know my users are not even aware of what the CC button does. Would it be relatively easy to make anything but “captions off” the default?
So what's happening is the file naming conventions are conflicting. If you rename your WebVTT files before ingest to remove the "_{interviewee-name}", the observed behaviour won't happen. You'll get the default 'en' (English) transcript displayed.
I don’t think I can rename them before ingest, they are derivatives created by the ingest process. I don’t create any VTTs of my own. My ingest is simply the old
OK, looking at the code I see what’s up. The get_dsid_suffix() function assumes that if you have any underscore in your transcript file name, what follows that must be the language code.
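To make the described behaviour concrete, here is a minimal Python sketch of the logic as it is characterised in this thread. It is not the module's actual PHP code: the names dsid_suffix and mediatrack_dsid are hypothetical, and splitting at the last underscore is an assumption (the thread only says "any underscore").

```python
from pathlib import PurePosixPath


def dsid_suffix(filename: str) -> str:
    """Return "_<tail>" if the file stem contains an underscore, else "".

    Assumption: whatever follows the last underscore is treated as a
    language code, with no check that it really is one.
    """
    stem = PurePosixPath(filename).stem
    _, sep, tail = stem.rpartition("_")
    return f"_{tail}" if sep else ""


def mediatrack_dsid(filename: str) -> str:
    """Build the datastream ID from the (hypothetical) suffix logic."""
    return "MEDIATRACK" + dsid_suffix(filename)


print(mediatrack_dsid("ioh_bogie.xml"))  # interviewee name mistaken for a language
print(mediatrack_dsid("ioh_fr.xml"))     # intended use: language code suffix
print(mediatrack_dsid("ioh.xml"))        # no underscore, plain DSID
```

Under these assumptions, a transcript named ioh_bogie.xml yields MEDIATRACK_bogie, which matches the datastream ID reported in this issue.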
Let me see if I can overcome that by changing the name of the transcript file to use dashes instead of underscores.
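A workaround along these lines could be scripted before ingest. The sketch below is only an illustration in Python; the function name without_underscores is hypothetical, and it assumes that only the file name (not its directory) needs changing.

```python
from pathlib import PurePosixPath


def without_underscores(filename: str) -> str:
    """Replace underscores in the file stem with dashes.

    The directory part and the extension are left untouched.
    """
    path = PurePosixPath(filename)
    new_name = path.stem.replace("_", "-") + path.suffix
    return str(path.with_name(new_name))


print(without_underscores("ioh_bogie.xml"))
print(without_underscores("oral_histories/ioh_bogie.xml"))
```

With no underscore left in the name, the transcript should no longer be read as carrying a language code.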
@McFateM To clarify then, you're ingesting a transcript XML file named like ioh_bogie.xml, and the _bogie suffix is carried over to the generated WebVTT file. The suffix convention exists so that you can set the default language for an individual object on an Islandora site whose oral history objects are in multiple languages. The relevant pull request is https://github.com/digitalutsc/islandora_solution_pack_oralhistories/pull/88. I'm not certain that this has been adequately documented in the wiki. Apologies for this. I'll confer with my colleagues and make sure that the documentation is clearer.
So for example, if you upload a file named ioh.xml, the default will be 'en', assuming that you have 'en' as your default setting. On the other hand, if you upload 'ioh_fr.xml', French captions will be displayed by default for the object, overriding the solution pack default. (Again, this is so that you can deal with situations where you have one collection of oral history objects with one default language and another collection that uses a different default language, all on one Islandora instance.)
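The fallback just described can be sketched as follows. This is an illustrative Python sketch, not the solution pack's code: caption_language and its site_default parameter are hypothetical names, and the rule that the text after the last underscore is the language code is an assumption based on this thread.

```python
from pathlib import PurePosixPath


def caption_language(filename: str, site_default: str = "en") -> str:
    """Pick the caption language for an uploaded transcript file.

    A "_<code>" ending on the file stem overrides the site-wide default;
    without it, the site default is used.
    """
    stem = PurePosixPath(filename).stem
    _, sep, code = stem.rpartition("_")
    return code if sep and code else site_default


print(caption_language("ioh.xml"))                     # falls back to the site default
print(caption_language("ioh_fr.xml"))                  # per-object override
print(caption_language("ioh.xml", site_default="fr"))  # a site whose default is French
```

The same rule explains the problem in this issue: ioh_bogie.xml would resolve to a "language" of bogie.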
Ok, my workflow change worked…I made sure my transcript filename contains no underscores, and what I got was a single MEDIATRACK datastream that displays correctly after ingest. I’ll address this in my workflow script tomorrow…no more underscores allowed in our filenames! Thanks Marcus.
When I ingest a new OH object or regenerate all derivatives I get transcript derivatives like those shown below. Note the name of the interviewee appearing in the two datastream IDs.
Under these conditions I don't see any captions in my video window. It looks like this...
However, if I download the MEDIATRACK_bogie datastream, and then add it back as a new datastream with a DSID of just "MEDIATRACK", my captions are displayed like so...
Note that when the MEDIATRACK datastream is created a new INDEXMEDIATRACK derivative is also automatically generated.