Per standup and Slack discussion, we'll decide on the workflow name and the terminology for audio/video text extraction in the 2024-09-13 post-standup discussion. Discussion seems to be leaning towards "caption" or "speechToText" as the term to use in general.
The workflow XML is not yet in place, but the skeleton code is already present. The latter might need a rename, pending the above decision. I'll mark this ticket blocked on that decision.
The skeleton workflow will have placeholders for the steps we expect, with implementations filled in as supporting services are developed, similar to what we did for ocrWF.
Steps we're likely to have, in order:
1. writing media files to S3
2. (maybe) signaling that those files need to be transcribed (if simply writing them to the bucket isn't enough)
3. picking up a workflow step signaling that the transcription output has been placed in an S3 bucket
4. updating the cocina appropriately and accessioning the output files
Currently, there is placeholder workflow step code for starting the workflow, ending it, and generating captions (this last part will very likely be broken into the multiple steps described above).
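To make the expected ordering concrete, here is a rough standalone Ruby sketch of a skeleton with no-op placeholder steps. It is only an illustration: the step names, the PlaceholderStep class, and run_skeleton are hypothetical, nothing here uses the real robot base classes in common-accessioning, and the actual names are still pending the terminology decision.

```ruby
# Hypothetical step names, in the order listed above. The real names are
# pending the workflow naming/terminology decision.
CAPTION_STEPS = %w[
  start
  write-media-to-s3
  request-transcription
  await-transcription-output
  update-cocina
  end
].freeze

# A placeholder step: does no real work yet, just reports that it ran.
# (Hypothetical class, not the actual robot base class.)
class PlaceholderStep
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Returns a log line instead of doing anything to the object.
  def perform(druid)
    "#{name}: no-op for #{druid}"
  end
end

# Runs every placeholder step in order and returns the log lines.
def run_skeleton(druid, step_names = CAPTION_STEPS)
  step_names.map { |name| PlaceholderStep.new(name).perform(druid) }
end
```

Calling run_skeleton('druid:bb123cd4567') (a made-up identifier) returns one log line per step, in order. As each supporting service lands, the matching placeholder would be swapped for a real implementation, and request-transcription could be dropped if writing to the bucket turns out to be enough.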
The start of the code for this workflow is already there, and assumes the name will be captionWF: https://github.com/sul-dlss/common-accessioning/tree/main/lib/robots/dor_repo/caption
The workflow definition would be a new XML file here: https://github.com/sul-dlss/workflow-server-rails/tree/main/config/workflows