indiana-university / automated-transcription-service

BSD 3-Clause "New" or "Revised" License

Capture key data in DynamoDB #15

Open alan-walsh opened 6 months ago

alan-walsh commented 6 months ago

Emily comments

Hi all! As I was running jobs this morning, I had a thought about the reporting stuff we talked about (extracting information from logs). In addition to knowing the name of the file that was transcribed (to match it to a project/person) and the audio length (to help approximate the cost/value), I wonder if it would be possible to pull out the confidence score? That might at least tell us how well Amazon thinks it's doing! The other thing is that if this is complicated to do, I'm still at a point with this service where I could manually log audio files and their lengths moving forward, or maybe write a little script I could run on the JSON files (in Amazon as well as GCP) to extract that information without having to go to the logs? Just a thought.
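For the "little script" idea, here is a rough sketch of what it might look like for the Amazon side. It assumes the standard Amazon Transcribe output JSON layout (a top-level `jobName` and a `results.items` list whose entries carry `type`, `end_time`, and `alternatives[0].confidence` as strings). The function names and the returned field names are made up for illustration, and the audio length is only approximated from the end time of the last spoken word, so trailing silence is not counted. GCP output uses a different layout and would need its own parser.

```python
import json
from statistics import mean


def summarize_transcript(doc: dict) -> dict:
    """Pull job name, approximate audio length, and mean word confidence
    out of one parsed Amazon Transcribe result document."""
    items = doc.get("results", {}).get("items", [])

    # Punctuation items have no timestamps and no meaningful confidence,
    # so only spoken words ("pronunciation" items) are considered.
    spoken = [i for i in items if i.get("type") == "pronunciation"]

    confidences = [
        float(i["alternatives"][0]["confidence"])
        for i in spoken
        if i.get("alternatives") and "confidence" in i["alternatives"][0]
    ]
    end_times = [float(i["end_time"]) for i in spoken if "end_time" in i]

    return {
        "job_name": doc.get("jobName"),
        # Approximation: end time of the last recognized word, in seconds.
        "approx_audio_seconds": max(end_times) if end_times else 0.0,
        "mean_confidence": mean(confidences) if confidences else None,
        "word_count": len(spoken),
    }


def summarize_file(path: str) -> dict:
    """Load one Transcribe JSON file from disk and summarize it."""
    with open(path, encoding="utf-8") as fh:
        return summarize_transcript(json.load(fh))
```

Running `summarize_file` over a folder of downloaded JSON files would give one row per job that could go straight into a spreadsheet, without touching the logs.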

Step Functions and DynamoDB

A Step Functions state machine is probably a better way to run the Transcribe-to-Docx workflow anyway, so this could be one step in that process: write a record for each job into DynamoDB. That would make reporting much easier.
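As a starting point for that step, here is a minimal sketch of the DynamoDB write, not a settled design. The attribute names (`job_name`, `file_name`, `audio_seconds`, `mean_confidence`, `recorded_at`), the use of `job_name` as the key, and the table name in the usage comment are all assumptions; none of them exist in the repository yet. The only real API relied on is boto3's `Table.put_item(Item=...)`. Numbers are converted to `Decimal` because the boto3 resource interface rejects Python floats.

```python
from datetime import datetime, timezone
from decimal import Decimal


def build_job_item(job_name, file_name, audio_seconds, mean_confidence=None):
    """Build the DynamoDB item for one transcription job.

    All attribute names here are hypothetical placeholders.
    """
    item = {
        "job_name": job_name,  # assumed partition key
        "file_name": file_name,
        # boto3's resource layer needs Decimal, not float, for numbers.
        "audio_seconds": Decimal(str(audio_seconds)),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    if mean_confidence is not None:
        item["mean_confidence"] = Decimal(str(mean_confidence))
    return item


def record_job(table, job_name, file_name, audio_seconds, mean_confidence=None):
    """Write one job record. `table` is any object exposing
    put_item(Item=...), such as a boto3 DynamoDB Table resource."""
    item = build_job_item(job_name, file_name, audio_seconds, mean_confidence)
    table.put_item(Item=item)
    return item


# Example wiring inside a Lambda step (table name is hypothetical):
#
#   import boto3
#   table = boto3.resource("dynamodb").Table("transcription-jobs")
#   record_job(table, "demo-job", "interview01.mp3", 1834.2, 0.93)
```

Passing the table in as an argument keeps the function easy to exercise locally with a stand-in object before any AWS resources are created.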