aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0
3.05k stars 572 forks source link

fetch-http-handler is not compatible with built-in fetch implementation in latest versions of Node (18.15.0, 19.8.1) #4619

Closed carlansley closed 4 months ago

carlansley commented 1 year ago

Checkboxes for prior research

Describe the bug

The aws-sdk FetchHttpHandler (or some other part of the v3 aws-sdk internal plumbing) does not handle the official W3C Stream API ReadableStream returned by Node's built-in fetch implementation.

SDK version number

@aws-sdk/fetch-http-handler@3.306.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

18.15, 19.8.1

Reproduction Steps

import { FetchHttpHandler } from '@aws-sdk/fetch-http-handler';
const kms = new KMSClient({
  requestHandler: new FetchHttpHandler({
    requestTimeout: 1000,
  }),
});

const kmsKeyResult = await kms.send(new CreateKeyCommand({ Description: 'test', KeyUsage: 'ENCRYPT_DECRYPT' }));

Observed Behavior

    TypeError: stream.pipe is not a function
      Deserialization error: to see the raw response, inspect the hidden field {error}.$response on this object.

      29 |     });
      30 |
    > 31 |     const kmsKeyResult = await kms.send(new CreateKeyCommand({ Description: 'test', KeyUsage: 'ENCRYPT_DECRYPT' }));
         |                          ^
      32 |     console.log(kmsKeyResult);
      33 |
      34 |     // testing

      at node_modules/@aws-sdk/node-http-handler/dist-cjs/stream-collector/index.js:7:12
      at Object.streamCollector (node_modules/@aws-sdk/node-http-handler/dist-cjs/stream-collector/index.js:5:37)
      at collectBody (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:4671:20)
      at collectBodyString (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:4673:52)
      at parseBody (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:4692:44)
      at parseErrorBody (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:4699:25)
      at deserializeAws_json1_1CreateKeyCommandError (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:804:21)
      at deserializeAws_json1_1CreateKeyCommand (node_modules/@aws-sdk/client-kms/dist-cjs/protocols/Aws_json1_1.js:789:16)
      at deserialize (node_modules/@aws-sdk/client-kms/dist-cjs/commands/CreateKeyCommand.js:42:73)
      at node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:30
      at node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:14:20
      at node_modules/@aws-sdk/middleware-retry/dist-cjs/retryMiddleware.js:27:46
      at node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:7:26
      at Object.<anonymous> (src/plugin/nock/nock.spec.ts:31:26)

Expected Behavior

For it to not throw an exception. Something in the aws-sdk is expecting a NodeJS.ReadableStream, not the W3C Streams API ReadableStream it's actually getting.

Possible Solution

The fix is to deal with both kinds of ReadableStream. The workaround I came up with is:

// node-fetch-handler.ts

import { Readable } from 'node:stream';

// eslint-disable-next-line import/no-extraneous-dependencies
import { FetchHttpHandler } from '@aws-sdk/fetch-http-handler';
import type { HttpHandlerOptions } from '@aws-sdk/types';
import type { HttpRequest, HttpResponse } from '@aws-sdk/protocol-http';

function isIterable(input: unknown): input is Iterable<unknown> {
  return typeof (input as Iterable<unknown> | undefined)?.[Symbol.iterator] === 'function';
}

function isAsyncIterable(input: unknown): input is AsyncIterable<unknown> {
  return typeof (input as AsyncIterable<unknown> | undefined)?.[Symbol.asyncIterator] === 'function';
}

export default class NodeFetchHttpHandler extends FetchHttpHandler {
  override async handle(
    request: HttpRequest,
    options?: HttpHandlerOptions
  ): Promise<{
    response: HttpResponse;
  }> {
    const { response } = await super.handle(request, options);

    /**
     * For some reason (bug?) the aws-sdk FetchHttpHandler does not convert the response body to a NodeJS Readable,
     * or does not handle the official W3C Stream API ReadableStream returned by Node's built-in fetch.  Either way,
     * we check if it's iterable and create a NodeJS Readable from it.
     */
    const body: unknown =
      isIterable(response.body) || isAsyncIterable(response.body) ? Readable.from(response.body) : response.body;
    return {
      response: {
        ...response,
        body,
      },
    };
  }
}

Additional Information/Context

Weirdly, the exception is coming from within the node-http-handler, despite the fact we're using the fetch-http-handler.

yenfryherrerafeliz commented 1 year ago

Hi @carlansley, sorry to hear about your issues. The @aws-sdk/fetch-http-handler is intended to be used in browser clients. Could you please let me know why if using nodejs as runtime not to use the handler from the @aws-sdk/node-http-handler package instead?

Thanks!

carlansley commented 1 year ago

Fetch is a standard Web API, now included in Node (and almost all modern Javascript runtimes). I understand it's more performant than the Node http request implementation, and we're standardizing on using it going forward. Also helps in terms of APM using a Web-standard promise-based HTTP client implementation.

My question back is: why wouldn't you want this to work in any Javascript environment that has Fetch? Seems trivial to support it (per my workaround above). The current behavior is just a straight-up bug, versus a conscious choice, it seems to me.

bodelia commented 1 year ago

Adding a note here as it is related to the ReadableStream node built-in.

There is an issue with the GetObjectCommandOutput Body and with the ReadableStream interface implementation from the AWS-SDK.

The Body is supposed to be of type SdkStream<undefined | Readable | Blob | ReadableStream> Except it does not implement Symbol.asyncIterator which is a standard for streams (implementation was in node v10).

It seems the SDK is using its own implementation of node classes but then is missing some of its key features.

If needed, I can open a separate issue for this but I didn't have the time to go through the whole process yet. Will do if it's completely unrelated.

asilvas commented 9 months ago

Fixing this would also likely enable support in the Bun runtime. And to the authors point, it's faster.

Sebastp commented 7 months ago

still having this error and unable to use SDK. How to fix it? I tried downgrading

madsbuch commented 7 months ago

Arrived here as I have had problems with the Bun rumetime like @asilvas.

madsbuch commented 7 months ago

@carlansley this thread saved me! I develop a Bun project that runs on ECS. All the AWS libraries worked beautifully locally and on Fly, but not on the production infrastructure on AWS (Using the same Docker image).

RanVaknin commented 5 months ago

Hi @carlansley ,

I have received the internal ticket you have submitted to AWS support. Thank you for doing that as it brought this issue to our attention once again.

fetch is a standard Web API, now included in Node (and almost all modern Javascript runtimes).

fetch was introduced natively in Node 18 and is still considered experimental (stability 1). Since the JS SDK still officially supports Node 14, and Node 16, we cannot change the implementation to use fetch as it would break customers using older Node runtimes. Once we do drop support for these node versions we can raise this again to see in what capacity do we want to support this experimental fetch.

The current behavior is just a straight-up bug, versus a conscious choice, it seems to me.

This is not a bug because you can use Node-fetch-handler to achieve this functionality. W.r.t Performance considerations, while important, fall under the scope of a potential feature enhancement rather than a bug.

This remains low priority as it is not actionable at this point. Once we drop support for older runtime versions, we can once again raise this internally to discuss how / if we are going to support the native fetch given its still considered experimental.

Thanks again for bringing this to our attention. I have responded to your other concern internally as it is unrelated to this thread.

All the best, Ran~

carlansley commented 4 months ago

@RanVaknin I think you are misunderstanding the problem here. This issue is not about re-writing node-http-handler to use fetch, breaking old versions of Node, etc. This is about fixing the fetch-http-handler, which is currently broken in all versions of Node. And other Javascript environments, it appears from other comments on this issue.

carlansley commented 4 months ago

In the current version of Node (v22), fetch is stable: https://nodejs.org/api/globals.html#fetch

RanVaknin commented 4 months ago

Hi @carlansley,

This issue is not about re-writing node-http-handler to use fetch, breaking old versions of Node, etc.

Your request is to rewrite fetch-http-handler to use fetch, which we cannot do yet because we still support older versions of node that don't have support for fetch. Meaning that if we re-write fetch-http-handler to use fetch, customers running on older versions of Node (14,16) would experience a breaking change.

The solution is to use the dedicated node handler - node-http-handler.

This is about fixing the fetch-http-handler, which is currently broken in all versions of Node.

We explicitly document that node-http-handler is intended for Node.js environments and fetch-http-handler (and xhr) for browsers. Thus, the expectation that fetch-http-handler should also function in Node.js is not aligned with its intended use.

And other Javascript environments, it appears from other comments on this issue.

AWS SDK JS v3 currently only supports the Node.JS runtime. We do not guarantee compatibility with other runtimes like Bun.js.

In the current version of Node (v22), fetch is stable: https://nodejs.org/api/globals.html#fetch

We base our support on the latest LTS version of Node, which is currently v20.

If there is a reason why you cannot use the dedicated Node.js handler - node-http-handler please let us know so we can try and come up with a solution for you.

I don't mean to repeat the same answer over and over, but I think there is a miscommunication somewhere. If I still didn't quite understand the problem, feel free to add more information and I'll raise it once again with the team for further discussion.

Thanks again, Ran~

carlansley commented 4 months ago

Hi @carlansley,

This issue is not about re-writing node-http-handler to use fetch, breaking old versions of Node, etc.

Your request is to rewrite fetch-http-handler to use fetch, which we cannot do yet because we still support older versions of node that don't have support for fetch. Meaning that if we re-write fetch-http-handler to use fetch, customers running on older versions of Node (14,16) would experience a breaking change.

What breaking change? fetch-http-handler already doesn't work with Node.js, all versions as far as I can tell. This is about fixing what appears to be a simple bug - not a rewrite. I feel that something called fetch-http-handler should not do anything weird and simply use fetch, instead of crashing with a TypeError: stream.pipe is not a function in Node as it does now.

We explicitly document that node-http-handler is intended for Node.js environments and fetch-http-handler (and xhr) for browsers. Thus, the expectation that fetch-http-handler should also function in Node.js is not aligned with its intended use.

There's nothing in that linked document that says it shouldn't be used within Node, just that there are various defaults for different environments. It would've been helpful to be explicit in the documentation that fetch-http-handler does not work outside the browser. That's non-obvious, since every major Javascript runtime/engine I know of supports fetch.

And other Javascript environments, it appears from other comments on this issue.

AWS SDK JS v3 currently only supports the Node.JS runtime. We do not guarantee compatibility with other runtimes like Bun.js.

No harm in fixing a simple bug like this, even if it accidentally helps people using other runtimes. And again, this is a bug in Node.js also.

If there is a reason why you cannot use the dedicated Node.js handler - node-http-handler please let us know so we can try and come up with a solution for you.

We have code/APM stuff that runs in multiple runtimes/environments. It generally makes our lives easier when 3rd party libraries use globally available web API functionality (e.g. fetch) instead of environment-specific APIs. That's all.

I don't mean to repeat the same answer over and over, but I think there is a miscommunication somewhere. If I still didn't quite understand the problem, feel free to add more information and I'll raise it once again with the team for further discussion.

We don't have any pressing need to fix this particular issue, just find it curious that something so simple would go unfixed for over a year, especially for the new version of this SDK that we're all getting our arms twisted to move to. It seems like there must be some weird stuff going on under the covers for this to be even a thing.

As you can see from the code I posted in the original issue, it's trivial to fix fetch-http-handler to work correctly in Node. I'm sure it would be trivial to get working in all environments that support fetch. We currently use fetch-http-handler in production in fact, with the above tweak. Ideally I'd rather not have "tweaks" in production of course, but it is what it is.

kuhe commented 4 months ago

fetch in Node.js with the AWS SDK JSv3

The default http handler in Node.js for the AWS SDK remains the NodeHttpHandler and NodeHttp2Handler (where applicable). We are incrementally introducing compatibility for fetch in Node.js, but it is currently in an early stage I would characterize as experimental.


We've added preliminary support for fetch-in-node for AWS SDK JSv3 in Node.js version 18 and higher as of version 3.575.0 of AWS SDK clients.

Though Node.js 16 at a certain version has a form of support for web streams, we are deliberately using the global webstreams available in 18+ for compatibility reasons.

I say the support is preliminary because originally the AWS SDK was built on the premise of only using node:https module family in Node.js environment. We have not tested all services and scenarios.

Limitation on streaming input with FetchHttpHandler in Node.js

// running in Node.js
import { S3 } from "@aws-sdk/client-s3";
import { FetchHttpHandler } from "@smithy/fetch-http-handler";
import { Readable } from "node:stream";

Currently, web stream as input is not yet supported due to needing to set duplex on the Request initialization. This will be addressed in a future PR.

I expect that you can still use the FetchHttpHandler in Node.js for most other cases.

// init client in Node.js with non-default http handler
const s3 = new S3({
  requestHandler: new FetchHttpHandler(),
});

working inputs

await s3.putObject({
  Bucket: "...",
  Key: "test.txt",
  ContentLength: 4,
  Body: Readable.from(Buffer.from("abcd")), // Node.js stream is ok
});

await s3.putObject({
  Bucket: "...",
  Key: "test.txt",
  Body: Buffer.from("abcd"), // Buffer or Uint8Array are ok
});

await s3.putObject({
  Bucket: "...",
  Key: "test.txt",
  ContentLength: 4,
  Body: "abcd", // string is ok
});

not yet working inputs 🚧

await s3.putObject({
  Bucket: "...",
  Key: "test.txt",
  ContentLength: 4,
  Body: new ReadableStream({ // doesn't work yet to due to missing duplex input on Request, to be addressed in future PR
    start(controller) {
      controller.enqueue(new Uint8Array([97, 98, 99, 100]));
      controller.close();
    }
  }),
});

working with outputs when using fetch in Node.js

Traditionally, using the node:https based handler, the streaming response type from e.g. AWS S3 is IncomingMessage with an SDK mixin applied that allows transformation to other types.

Now with fetch, the base type of the response stream is ReadableStream, but still with the mixin applied. You can still use the methods await getObjectResponse.Body.transformToString() -> string and await getObjectResponse.Body.transformToByteArray() -> Uint8Array etc. to collect the response stream.

const go = await s3.getObject({
  Bucket: "...",
  Key: "test.txt",
});
console.log("getObject", go.$metadata.httpStatusCode, go.Body, await go.Body.transformToString());

non-streaming i/o

For non-streaming inputs and outputs, the use of FetchHttpHandler should be basically transparent, since the output and input will be plain data JavaScript objects.

carlansley commented 4 months ago

I can confirm FetchHttpHandler now works in Node 20.13.1 and 22.1.0 with version 3.575 of the SDK. @kuhe thanks for working on this and providing excellent documentation regarding the current status.

kuhe commented 4 months ago

The duplex issue was fixed in @smithy/fetch-http-handler@v3.0.1 which can be optionally picked up with a fresh install. AWS SDK clients use a ^3.0.0 range for @smithy dependencies.

In the future the minimum version will be increased to match.

github-actions[bot] commented 3 months ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.