Open jmartin-sul opened 2 months ago
Note: at the moment, this step is implemented using the same algorithm as the ocrWF update-cocina step, which means it picks up new files generated by Whisper in the workspace, finds resources with matching base filenames, and then adds the new files to those resources.
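A rough sketch of that matching approach (not the actual robot code; the `resources` hash structure and method names are simplified stand-ins for the real cocina structural metadata):

```ruby
require 'pathname'

# Match new workspace files (e.g. Whisper output) to existing resources by base
# filename, then append them to the matching resource's file list.
def add_speech_to_text_files(resources, workspace_paths)
  workspace_paths.each do |path|
    base = Pathname.new(path).basename.sub_ext('').to_s

    # find the resource that already has a file with the same base filename
    resource = resources.find do |res|
      res[:files].any? { |f| Pathname.new(f[:filename]).sub_ext('').to_s == base }
    end
    next unless resource # ignore workspace files that don't match any resource

    resource[:files] << { filename: File.basename(path) }
  end
  resources
end
```

e.g. a workspace file like `bh691ds2057_video_1.vtt` would get attached to the resource that already contains `bh691ds2057_video_1.mp4` (filenames here are made up for illustration).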
We still need to consider:
- the developer who picks this up should consult with Andrew on whether the code we have is doing the right thing, or whether it needs tweaking
- Roles:
  - ".vtt" = "caption"
  - ".txt" = "transcription"
- Publish/preserve/shelve = true for .vtt and .txt files (see the sketch after this list)
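A minimal sketch of those defaults, assuming the role is derived from the file extension. The `use`, `administrative`, and `sdrPreserve` keys loosely mirror cocina file metadata, but check them against the actual cocina-models schema and the existing ocrWF robot rather than treating this as exact:

```ruby
ROLE_BY_EXTENSION = {
  '.vtt' => 'caption',
  '.txt' => 'transcription'
}.freeze

# Build the attributes for a generated speech-to-text file: role derived from
# the extension, with publish/preserve/shelve all set to true.
def speech_to_text_file_attributes(filename)
  {
    filename: filename,
    use: ROLE_BY_EXTENSION.fetch(File.extname(filename).downcase), # raises on unexpected extensions
    administrative: { publish: true, sdrPreserve: true, shelve: true }
  }
end

speech_to_text_file_attributes('bh691ds2057_1.vtt')
# => { filename: "bh691ds2057_1.vtt", use: "caption",
#      administrative: { publish: true, sdrPreserve: true, shelve: true } }
```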
leaving open until we confirm that the captions generated by speechToTextWF actually work in sul-embed when watching a video
captions confirmed to display correctly
Here is one to check: https://sul-purl-stage.stanford.edu/bh691ds2057
I see a transcript panel in sul-embed
for now, blocked by the implementation of preceding steps in speechToTextWF, but we could probably parallelize this with work on the preceding steps, if we wanted. we might be able to work off of the ocrWF equivalent as an example?

this is the speechToTextWF equivalent of ocrWF update-cocina.

stub code in the speechToTextWF: https://github.com/sul-dlss/common-accessioning/blob/main/lib/robots/dor_repo/speech_to_text/update_cocina.rb

part of https://github.com/sul-dlss/common-accessioning/issues/1363