indiana-university / automated-transcription-service

BSD 3-Clause "New" or "Revised" License

Enable AWS batch jobs for long-running transcribe-to-docx? #11

Open alan-walsh opened 6 months ago

alan-walsh commented 6 months ago

We ran into the 15-minute Lambda execution limit during JSON-to-docx conversion. We made tweaks to lower the chance of hitting the limit. However, it seems that with large files (approximately 4+ hours?) we would still eventually hit it. Does it make sense to convert this to an AWS Batch job? What is the effort? Should it branch dynamically to optimize cost savings, if any? How often would we hit the limit? Currently, the only option after hitting the limit is to run the JSON through the command-line conversion.

alan-walsh commented 6 months ago

For future reference: this might be the point at which we pivot to using AWS Step Functions for ATS. That would let us very easily create branching logic that chooses between Lambda and Batch based on the length of the transcription (audio).
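
A rough sketch of what that branching could look like, in Python. Everything named here is an assumption for illustration: the 3-hour cutoff (a guess chosen to sit below the "approximately 4+ hours" estimate above, not a measured value), the `audio_seconds` input field, and the state names `RunLambda` and `RunBatchJob`. The first function is the plain decision; the second builds the equivalent Choice state as an Amazon States Language fragment.

```python
# Hypothetical cutoff: route audio at or above 3 hours to Batch.
# This is an assumed safety margin, not a measured threshold.
CUTOFF_SECONDS = 3 * 60 * 60


def choose_backend(audio_seconds: float, cutoff: float = CUTOFF_SECONDS) -> str:
    """Return "lambda" for short audio and "batch" for long audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return "lambda" if audio_seconds < cutoff else "batch"


def choice_state(cutoff: float = CUTOFF_SECONDS) -> dict:
    """Build a Step Functions Choice state that makes the same decision.

    "$.audio_seconds", "RunBatchJob", and "RunLambda" are placeholder
    names; the real state machine would define its own.
    """
    return {
        "Type": "Choice",
        "Choices": [
            {
                "Variable": "$.audio_seconds",
                "NumericGreaterThanEquals": cutoff,
                "Next": "RunBatchJob",
            }
        ],
        # Anything shorter than the cutoff falls through to Lambda.
        "Default": "RunLambda",
    }


if __name__ == "__main__":
    print(choose_backend(60 * 60))      # one hour of audio
    print(choose_backend(5 * 60 * 60))  # five hours of audio
```

Keeping the cutoff as a parameter would let us tune it once we know how often the limit is actually hit, and compare Lambda and Batch costs on either side of it.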