aws-solutions / content-localization-on-aws

Automatically generate multi-language subtitles using AWS AI/ML services. Machine generated subtitles can be edited to improve accuracy and downstream tracks will automatically be regenerated based on the edits. Built on Media Insights Engine (https://github.com/awslabs/aws-media-insights-engine)
Apache License 2.0

Batch upload results in captions associated with wrong video #390

Closed jkbredsquiggles closed 9 months ago

jkbredsquiggles commented 1 year ago

Describe the bug

First, a disclaimer: this occurred the first time I used the project, so it may be user error. Also, I'm providing this from memory, so a couple of details may be hazy (I've noted where I am not sure of the details).

I processed a batch of 6 videos using the same workflow. I reviewed a few of the completed results, and the wrong captions had been associated with a few of the videos. Two of the videos failed to upload (the UI didn't provide an explanation and I didn't think to check the CloudWatch logs).

I believe that I reviewed a few videos and two of them had the same wrong captions, while one had the correct captions. So it MAY be the case that all videos ended up with the same captions.

I did not check the bucket (at the time I didn't know about it) to see whether the SRT files were wrong or whether the UI might just be presenting the wrong SRT files for a given video.
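One way to run that check after the fact is to download the generated SRT files and see whether any of them are byte-for-byte identical. The following is a minimal sketch, not part of the solution: `group_identical` and `load_srt_dir` are hypothetical helper names, and it assumes the SRT files have already been copied from the bucket into a local directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(captions):
    """Return groups of names whose caption text is identical.

    `captions` maps a file name to its text. Only groups with two or
    more members are returned, each sorted by name.
    """
    by_digest = defaultdict(list)
    for name, text in captions.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        by_digest[digest].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def load_srt_dir(directory):
    """Read every *.srt file in a local directory (assumed layout)."""
    return {
        p.name: p.read_text(encoding="utf-8")
        for p in Path(directory).glob("*.srt")
    }


# Example with inline data: two videos ended up with the same captions.
sample = {
    "video1.srt": "1\n00:00:01,000 --> 00:00:02,000\nHello\n",
    "video2.srt": "1\n00:00:01,000 --> 00:00:02,000\nHello\n",
    "video3.srt": "1\n00:00:01,000 --> 00:00:02,000\nBonjour\n",
}
print(group_identical(sample))
```

If the files in the bucket turn out to be distinct, the problem is more likely in how the UI selects the SRT file for a given video.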

Note that I don't know if batch processing is supported. The site allowed me to do it, but maybe only one video can be processed at a time.

To Reproduce

  1. Log in to the site
  2. Add multiple videos
  3. Create a workflow; in this case I accepted the default and added a translation to French Canadian
  4. Wait
  5. Review the collection, pick a video to analyze
  6. Look at the transcript (or the translation) - it's for one of the other videos
  7. Watch the video, the captions are for another video (I only checked this for one video)

Expected behavior

Each video would have its own captions.


Screenshots

Sorry, I didn't think to take a screenshot.

Additional context

While reviewing the SRT files for my first batch, in a separate browser tab, I tried to upload more videos (the ones that had errors in the first batch). That failed (sorry, no details). I'm not sure whether uploading while viewing the results of a previous workflow would cause a problem or not.

jkbredsquiggles commented 1 year ago

This may be a red herring, but I very briefly reviewed the CloudFormation stack and noticed the following parameter in the MieStack:

MaxConcurrentWorkflows: 5

I was attempting to process 6 files in my batch. If that parameter has anything to do with the number of items that can be processed in a batch, it may be that processing is OK when the batch size is at or below that threshold, and the UI should restrict the batch size accordingly.

raulmlamzn commented 1 year ago

@jkbredsquiggles Thanks for reporting this issue. You can read more about the parameter in the API documentation below. I added your UI request to our backlog for the solution.

"MaxConcurrentWorkflows - Sets the maximum number of workflows that are allowed to run concurrently. Any new workflows that are added after MaxConcurrentWorkflows is reached are placed on a queue until capacity is freed by completing workflows. Use this to help avoid throttling in service API calls from workflow operators. This setting is checked each time the WorkflowSchedulerLambda is run and may take up to 60 seconds to take effect."

https://docs.aws.amazon.com/solutions/latest/media-insights-on-aws/api-reference.html
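To illustrate the queueing behavior described in that documentation excerpt, here is a toy model in Python. It is only a sketch of the documented behavior, not the actual WorkflowSchedulerLambda code; the function names `schedule` and `complete` and the `video-N` identifiers are made up for the example. With a limit of 5 and a batch of 6, the sixth workflow waits in the queue rather than being rejected.

```python
from collections import deque


def schedule(workflow_ids, max_concurrent=5):
    """Start workflows up to the limit and queue the rest (toy model)."""
    running = []
    queued = deque()
    for wf in workflow_ids:
        if len(running) < max_concurrent:
            running.append(wf)
        else:
            queued.append(wf)
    return running, queued


def complete(running, queued, wf):
    """Mark a workflow finished and promote the oldest queued one, if any."""
    running.remove(wf)
    if queued:
        running.append(queued.popleft())


# A batch of 6 videos against MaxConcurrentWorkflows = 5.
running, queued = schedule([f"video-{i}" for i in range(1, 7)])
print(running)       # the first five start immediately
print(list(queued))  # the sixth waits for capacity

complete(running, queued, "video-1")
print(running)       # video-6 has been promoted from the queue
```

Under this reading, exceeding the limit should only delay the extra workflow, so the limit by itself would not obviously explain captions being attached to the wrong video.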