aws-samples / amazon-transcribe-live-call-analytics

Amazon Transcribe Live Call Analytics (LCA) Sample Solution

Urgent Issue with Call Transcription: Inconsistent Behavior Observed #94

Closed tarungupta83 closed 1 year ago

tarungupta83 commented 1 year ago

Hi LCA Team,

We're building a solution tailored for a large enterprise using the specified stack. Our current implementation successfully transcribes 60% of the calls. However, the remaining 40% persistently show a status of "In-Progress", and we don't receive any transcripts or audio for these calls.

Upon investigation, our logs indicate that we might not be sending the 'opened' packet within the 5-second timeout window as expected by Genesys. This is perplexing given that 60% of the calls work without any hitches.

We're keen on understanding if our application on Fargate is functioning optimally. We've attached relevant screenshots and logs for your review (note: any PII has been redacted).

Could you assist us in pinpointing the exact error and provide guidance on resolving this inconsistency? We've also received a response from Genesys, attached for reference.


```json
{
  "timestamp": "PT5.581928956S",
  "type": "audiohook",
  "data": {
    "dir": "in",
    "message": {
      "version": "2",
      "id": "dummy",
      "type": "error",
      "seq": 2,
      "position": "PT0.0S",
      "parameters": {
        "code": 408,
        "message": "Timed out waiting for opened response"
      },
      "serverseq": 0
    }
  }
}
```

(Screenshot attached; we can share more logs if that helps.)

babu-srinivasan commented 1 year ago

Hi tarungupta83, thanks for reporting the issue. Could you please share the server logs from CloudWatch? Meanwhile, I will try to re-create the issue on my side to debug.

What is the maximum number of calls in progress at any given point in time (i.e., the number of concurrent sessions)?

Thanks,

tarungupta83 commented 1 year ago

Thank you for your response. Currently, the number of concurrent calls fluctuates between 1 and 6.

Given that the solution resides within our organization's account, sharing CloudWatch logs on a public forum presents challenges. However, I've attached sanitized logs from the Fargate Application with personal data redacted. We're willing to provide the logs directly via email. If you could provide an official email address, it would facilitate our sharing of the necessary details.

logs.zip

tarungupta83 commented 1 year ago

Hi Babu Srinivasan,

Here are some additional observations:

  1. It appears that Fargate is not responding with an "opened" message to Genesys within the 5-second timeout window. (From the Genesys "open" to the "LCA opened" response).
  2. In certain cases, Genesys initiated a reconnect when the Fargate Application didn't send the "opened" message. On the subsequent try, Fargate managed to send the "opened" message, allowing the connection to be established and the call to proceed successfully.
  3. Calls that didn't experience a reconnect attempt from Genesys ended up failing.
babu-srinivasan commented 1 year ago

Thanks for sharing the redacted logs and the additional observations. I have observed something similar only when testing a high volume of concurrent calls, i.e., when I was hitting the soft limits on concurrent Transcribe streaming sessions in the AWS account I was testing with. But in your case, it looks like you are only testing a small number of calls.

The audiohook server (Fargate) returns the 'opened' message only after running all the open handlers, and one of the key open handlers establishes the streaming session with Transcribe. For the calls that failed, could you please check the logs to verify the Transcribe session was established properly:

1/ Check whether either of the following messages appears (it means the session with Transcribe was established successfully): "=== Received Initial response from TCA. Session Id:" or "=== Received Initial response from Transcribe. Session Id:"

2/ Check the logs for any Transcribe-related errors.
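To make the timing visible, this handshake can be modeled as a minimal sketch (hypothetical names, not the actual LCA audiohook source): if the open handler that establishes the Transcribe session overruns the client's 5-second window, the 'opened' message is never sent and Genesys reports a 408, exactly as in the error frame above.

```python
# Hedged sketch, not the LCA source: a minimal model of the AudioHook
# 'open' -> 'opened' handshake, showing why a slow open handler (such as
# one establishing the Transcribe streaming session) can miss the
# client's timeout window. All names here are illustrative.
import asyncio
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("audiohook")

async def start_transcribe_session(delay_s: float) -> str:
    """Stand-in for the open handler that starts a Transcribe stream."""
    await asyncio.sleep(delay_s)  # simulates SDK / network latency
    return "session-1234"

async def handle_open(send, session_delay_s: float, timeout_s: float) -> bool:
    """Run the open handler, then send 'opened'. Timing is logged so a
    slow handler is visible; the handler is capped at timeout_s so we
    fail fast instead of silently overrunning the client's window."""
    t0 = time.monotonic()
    log.info("received 'open'; running open handlers")
    try:
        session_id = await asyncio.wait_for(
            start_transcribe_session(session_delay_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        log.error("open handler exceeded %.2fs; client will report a 408",
                  timeout_s)
        return False
    log.info("Transcribe session %s up in %.2fs; sending 'opened'",
             session_id, time.monotonic() - t0)
    await send({"type": "opened"})
    return True

async def demo():
    sent = []
    async def send(msg):
        sent.append(msg)
    # Fast handler: 'opened' goes out well inside the window.
    ok_fast = await handle_open(send, session_delay_s=0.01, timeout_s=0.5)
    # Slow handler: times out, so no 'opened' is ever sent.
    ok_slow = await handle_open(send, session_delay_s=0.2, timeout_s=0.05)
    return ok_fast, ok_slow, sent

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this model, grepping the timing log lines for the failed calls would show whether the Transcribe handler ever completed before the window closed.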

Thanks

tarungupta83 commented 1 year ago

No, when a call fails we do not see any transcription-related messages.

Rather, after the 'open' message we only see the error message coming from Genesys.

I have already shared more than 7 error logs and 1 log for a completed call.

Can you point out where and how exactly we send the 'opened' message? I am not able to see any reason why we don't send 'opened' as a response to 'open'. There must be certain steps before we send the 'opened' message; what are those, and how can they be logged?

tarungupta83 commented 1 year ago

Hi @babu-srinivasan / @rstrahan,

Thank you for assisting with the identified issue. I've provided the CloudWatch logs as requested.

As part of our roadmap, we're gearing up to deploy the complete stack for a client. We anticipate around 20 million calls annually. Your support on this issue is crucial for us.

Additionally, we're interested in availing extended support from AWS. Given that we already hold an enterprise account with AWS, could you guide us on how to officially reach out for this extended service?

Best regards, Tarun +919810698355

babu-srinivasan commented 1 year ago

I have sent you an email to follow-up. Thanks!

rstrahan commented 1 year ago

@babu-srinivasan @tarungupta83 What's the status of this issue?

tarungupta83 commented 1 year ago

This issue is broadly solved. We changed the setup and also updated the code to the latest available version. Since then, we have not faced any call-drop issues.

A small question I would like to ask: can we configure multiple TRANSCRIPT_LAMBDA_HOOK_FUNCTION_ARN Lambda functions to work in parallel for updating the transcript?

rstrahan commented 1 year ago

This issue is broadly solved

Awesome.. I'll resolve this and the other related open issue #95

A small question I would like to ask: can we configure multiple TRANSCRIPT_LAMBDA_HOOK_FUNCTION_ARN Lambda functions to work in parallel for updating the transcript?

Not currently. Only one ARN can be provided and invoked, but of course your Lambda can do whatever you want, including invoking additional custom Lambda functions in parallel to update the transcript.
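As a sketch of that fan-out pattern (hypothetical function names and ARNs, not part of LCA): the single configured hook Lambda can dispatch the transcript event to several downstream Lambdas in parallel. The `invoke` callable is injectable so the pattern can be exercised without AWS credentials; in a real Lambda it would wrap boto3's `invoke` with `InvocationType="Event"` for fire-and-forget calls.

```python
# Hypothetical fan-out for the single configured
# TRANSCRIPT_LAMBDA_HOOK_FUNCTION_ARN: the one hook Lambda forwards the
# transcript event to several downstream functions in parallel.
# The ARNs below are illustrative placeholders.
import json
from concurrent.futures import ThreadPoolExecutor

DOWNSTREAM_ARNS = [
    "arn:aws:lambda:us-east-1:111111111111:function:hook-a",  # illustrative
    "arn:aws:lambda:us-east-1:111111111111:function:hook-b",  # illustrative
]

def fan_out(event: dict, invoke) -> list:
    """Invoke every downstream hook with the same transcript event.
    `invoke(arn, payload)` abstracts the actual Lambda call so the
    pattern can be tested without AWS access."""
    payload = json.dumps(event)
    with ThreadPoolExecutor(max_workers=len(DOWNSTREAM_ARNS)) as pool:
        futures = [pool.submit(invoke, arn, payload) for arn in DOWNSTREAM_ARNS]
        return [f.result() for f in futures]

def lambda_handler(event, context, invoke=None):
    """Entry point for the one configured transcript hook Lambda."""
    if invoke is None:
        # Real path: asynchronous fire-and-forget invocations via boto3.
        import boto3
        client = boto3.client("lambda")
        invoke = lambda arn, payload: client.invoke(
            FunctionName=arn, InvocationType="Event", Payload=payload)
    fan_out(event, invoke)
    # Return the event unchanged here; the exact return contract for the
    # hook is defined by LCA, so check the LCA docs for your version.
    return event
```

Because the downstream invocations use the asynchronous `Event` type, the configured hook returns quickly regardless of how many functions it fans out to.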

Good luck, and thanks for using LCA!

rajapradeep commented 4 months ago

I am also facing the same issue. How exactly was it resolved? Can you explain, @tarungupta83 @rstrahan @babu-srinivasan?