aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0

AWS Transcribe Streaming HTTP2 error message with WebToken #5958

Open MartinEmrich opened 2 months ago

MartinEmrich commented 2 months ago

Describe the bug

AWS Transcribe Streaming fails with errors like

{"code":"ERR_HTTP2_ERROR","errno":-505,"$metadata":{"attempts":1,"totalRetryDelay":0}}
Error [ERR_HTTP2_ERROR]: Protocol error
    at Http2Session.onSessionInternalError (node:internal/http2/core:801:26)

SDK version number

@aws-sdk/client-transcribe-streaming 3.540.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v20.5.1

Reproduction Steps

Use Transcribe Streaming from a service within an AWS EKS Kubernetes cluster with an IAM service account.

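const transcribe = require('@aws-sdk/client-transcribe-streaming');

// No explicit credentials: the default provider chain is used.
const client = new transcribe.TranscribeStreamingClient();
// ...
const response = await client.send(new transcribe.StartStreamTranscriptionCommand(transcribeParams));
// ...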

Observed Behavior

Running the service (in fact the same container image) locally (AWS SSO auth) or on an EC2 instance (IAM instance role) works as expected.

Running it on AWS EKS with an IAM service account (Token file authentication), the STSClient call made by the Transcribe client fails:

{"name":"xxx","hostname":"xxxx","pid":7,"level":50,"clientName":"STSClient","commandName":"AssumeRoleWithWebIdentityCommand","input":{"RoleArn":"arn:aws:iam::###########:role/eksctl-xxxx-addon-iamservic-Role1-xxxxxx","RoleSessionName":"aws-sdk-js-session-1712219858427","WebIdentityToken":"***SensitiveInformation***"},"error":{"code":"ERR_HTTP2_ERROR","errno":-505,"$metadata":{"attempts":1,"totalRetryDelay":0}},"metadata":{"attempts":1,"totalRetryDelay":0},"msg":"","time":"2024-04-04T08:37:38.452Z","v":0}
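To confirm which SDK client is actually failing, a log entry like the one above can be parsed and reduced to the relevant fields. This is a minimal sketch: the helper name `summarizeSdkLogEntry` is hypothetical, and the log line is trimmed to the fields of interest.

```javascript
// Log line from the report above, trimmed to the fields of interest.
const logLine = '{"clientName":"STSClient","commandName":"AssumeRoleWithWebIdentityCommand","error":{"code":"ERR_HTTP2_ERROR","errno":-505,"$metadata":{"attempts":1,"totalRetryDelay":0}},"metadata":{"attempts":1,"totalRetryDelay":0},"time":"2024-04-04T08:37:38.452Z"}';

// Hypothetical helper: reduce one JSON log entry to client, command and error code.
function summarizeSdkLogEntry(line) {
  const entry = JSON.parse(line);
  return {
    clientName: entry.clientName,
    commandName: entry.commandName,
    errorCode: entry.error ? entry.error.code : undefined,
    attempts: entry.metadata ? entry.metadata.attempts : undefined,
  };
}

const summary = summarizeSdkLogEntry(logLine);
console.log(summary);
```

The summary shows that the failing call is the STS `AssumeRoleWithWebIdentityCommand`, not the Transcribe request itself.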

Expected Behavior

I expect it to work with IRSA just as with any other AWS API authentication method

Possible Solution

Inspired by the workaround in #2533, I got it working in a similar way, by explicitly creating the credential provider (in my case, the fromTokenFile provider):

const awsCredentialProviders = require('@aws-sdk/credential-providers');
const credentials = awsCredentialProviders.fromTokenFile({ logger, clientConfig: { region: 'eu-central-1' } });
const client = new transcribe.TranscribeStreamingClient({ logger, credentials });

Additional Information/Context

aBurmeseDev commented 2 months ago

Hi @MartinEmrich - thanks for reaching out and sorry to hear about the issue.

While I investigate, I'd like to clarify how you're obtaining credentials in the code example that gave you the error, since you mentioned you were able to work around it with the fromTokenFile credential provider. Can you share the code that works vs. the code that throws the error to give us more insight?

MartinEmrich commented 2 months ago

@aBurmeseDev See above: I use an IAM Service Account within an EKS Cluster, which effectively is the web token auth. (EKS automatically injects the web token and sets the AWS_WEB_IDENTITY_TOKEN_FILE environment variable. Using any official AWS SDK, this works automatically without any extra code.)
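To check from inside a pod whether the web-identity setup described above is in effect, the relevant environment variables can be inspected. This is a minimal sketch: the function name `describeWebIdentityEnv` is hypothetical, and it assumes that `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_ARN` are both set when an IAM service account is attached.

```javascript
// Hypothetical helper: report whether web-identity (token file) auth looks configured.
function describeWebIdentityEnv(env = process.env) {
  const tokenFile = env.AWS_WEB_IDENTITY_TOKEN_FILE;
  const roleArn = env.AWS_ROLE_ARN;
  return {
    tokenFile: tokenFile || null,
    roleArn: roleArn || null,
    // Both values are needed for the token file provider to assume the role.
    webIdentityConfigured: Boolean(tokenFile && roleArn),
  };
}

console.log(describeWebIdentityEnv());
```

Logging this at startup makes it easy to tell an EKS run with a service account apart from a local SSO or EC2 instance-role run.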

So to reproduce:

aBurmeseDev commented 2 months ago

@MartinEmrich - thanks for getting back. Sorry if I wasn’t clear on my question but I was looking for your working vs non-working code to get more understanding of the issue. The more detailed information we can get, the faster we can debug and identify the root cause.

Just to provide some background on the error: this protocol error can occur for a number of reasons, generally related to HTTP/2 communication issues. Here are a few that I narrowed down based on the information you provided:

const client = new Client({
  requestHandler: new NodeHttp2Handler({
    requestTimeout: 100,
    sessionTimeout: 300,
  }),
});

- Load balancing or proxies: Since you're routing traffic through a load balancer or proxy server, misconfigurations or limitations in handling HTTP/2 might be the culprit, especially since your EC2 workflow works as expected whereas the EKS workflow throws an error. I would recommend that you verify HTTP/2 support is enabled for EKS.
- Lastly, I would check to ensure that there are no other services, AWS or third-party, being used without HTTP2 support.

If issue persists, please share more requested information so that I can further assist you.

Best,
John

github-actions[bot] commented 1 month ago

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

MartinEmrich commented 1 month ago

Hello John! Sorry for the delay, I somehow overlooked your response.

Again, my working vs. non-working code is in the original post. If you are missing specific information, please tell me exactly what you need.

The issue is not in the Transcribe client or server itself (actually, as I understand, transcribe not only supports, but actually requires HTTP/2 for streaming). The issue seems to be in the authentication step.

My speculation: the SDK tries to use HTTP/2 to talk to the OIDC token endpoint while authenticating, which obviously happens only when the authentication method is WebToken auth. That's why it does not happen with any other authentication method (IAM Role, Key/Secret, ...).

Your code example would do the opposite: force the Transcribe client to use the NodeHttp2Handler (which it seems to use already, as Transcribe streaming works fine). Instead, the issue seems to be the auth code (IAM, OIDC) using HTTP/2 instead of HTTP/1.1. My suspicion: it re-uses an HTTP client previously created within the SDK, which happens to be the HTTP2 handler created by the Transcribe client. That's why the workaround works: it calls the authentication code explicitly before the Transcribe client (and the HTTP2 handler) is created, thus defaulting to the normal HTTP/1 client. Only after authentication has happened is the Transcribe client created, and by then it has valid authentication.
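If that suspicion is right, one way to test it would be to give the STS client used by the token file provider its own HTTP/1.1 handler. This is an untested sketch, not a confirmed fix: the factory name `createTranscribeClient` is hypothetical, and it assumes `fromTokenFile` passes `clientConfig.requestHandler` through to its inner STS client. The SDK packages are required lazily so the module still loads where they are not installed.

```javascript
// Hypothetical factory: Transcribe client whose web-identity auth uses HTTP/1.1 for STS.
function createTranscribeClient(region, logger = console) {
  // Lazy requires: these packages are only needed once the factory is called.
  const { TranscribeStreamingClient } = require('@aws-sdk/client-transcribe-streaming');
  const { fromTokenFile } = require('@aws-sdk/credential-providers');
  const { NodeHttpHandler } = require('@smithy/node-http-handler');

  // Assumption: clientConfig is forwarded to the STS client created by the provider,
  // so the AssumeRoleWithWebIdentity call goes through the HTTP/1.1 handler.
  const credentials = fromTokenFile({
    clientConfig: { region, requestHandler: new NodeHttpHandler() },
  });

  // The Transcribe client keeps its default (HTTP/2) handler for streaming.
  return new TranscribeStreamingClient({ region, logger, credentials });
}
```

A call such as `createTranscribeClient('eu-central-1')` would then replace the bare `new transcribe.TranscribeStreamingClient()` from the reproduction steps.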

I do not use any proxies or load balancers. There is nothing custom between the code and the AWS API endpoints; it runs in a plain AWS EKS cluster with an IAM service account in an AWS VPC.