aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0
2.97k stars, 556 forks

Need help with Sagemaker streaming using @aws-sdk/client-sagemaker-runtime #5557

Open sthuku opened 7 months ago

sthuku commented 7 months ago

Describe the issue

Hello,

I'm trying to stream from the SageMaker API using InvokeEndpointWithResponseStreamCommand (AWS SDK for JavaScript v3) from @aws-sdk/client-sagemaker-runtime. The response payload we get from InvokeEndpointWithResponseStreamCommand doesn't match the one in the official docs. If the official documentation is outdated, could someone please update it with the correct response structure and an example of handling the streaming chunks in the response? I'd also appreciate any guidance on how to handle this unexpected response and decode the streaming chunks correctly.

This is what I see:

{
    "$metadata": {
        "httpStatusCode":200,
        "requestId":"abcd-1",
        "attempts":1,
        "totalRetryDelay":0
    },
    "ContentType":"application/json",
    "InvokedProductionVariant":"abcd",
    "Body": {
        "options": {
            "messageStream": {
                "options": {
                    "inputStream": {
                    },
                    "decoder": {
                        "headerMarshaller": {
                        },
                        "messageBuffer": [
                        ],
                        "isEndOfStream":false
                    }
                }
            }
        }
    }
}

This is the payload structure stated in the official docs:

// { // InvokeEndpointWithResponseStreamOutput
//   Body: { // ResponseStream Union: only one key present
//     PayloadPart: { // PayloadPart
//       Bytes: "BLOB_VALUE",
//     },
//     ModelStreamError: { // ModelStreamError
//       Message: "STRING_VALUE",
//       ErrorCode: "STRING_VALUE",
//     },
//     InternalStreamFailure: { // InternalStreamFailure
//       Message: "STRING_VALUE",
//     },
//   },
//   ContentType: "STRING_VALUE",
//   InvokedProductionVariant: "STRING_VALUE",
//   CustomAttributes: "STRING_VALUE",
// };

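This is how we call the SageMaker streaming endpoint:

import { SageMakerRuntimeClient, InvokeEndpointWithResponseStreamCommand } from '@aws-sdk/client-sagemaker-runtime';
import { fromTemporaryCredentials } from '@aws-sdk/credential-providers';

export const createSageMakerStreamingClient = async (payload) => {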
    const client = new SageMakerRuntimeClient({
      region: 'us-west-2',
      credentials: (await fromTemporaryCredentials({
        params: { RoleArn: process.env.SM_INVOKE_ROLE },
        clientConfig: { region: 'us-west-2' }
      })())
    });

    const encoder = new TextEncoder();
    const body = encoder.encode(JSON.stringify({
      version: 1,
      instances: [{
        messages: payload.messages,
        configuration: payload.configuration
      }]
    }));

    const params = {
      EndpointName: 'abcd',
      Body: body,
      ContentType: 'application/json',
      Accept: 'application/json'
    }

    const command = new InvokeEndpointWithResponseStreamCommand(params);
    return await client.send(command);
}

When we try to handle the response, we see some strange characters in the decoded text:

const sm = await createSageMakerStreamingClient(JSON.parse(body));
  console.log(JSON.stringify(sm));

  const { inputStream, decoder } = sm.Body.options.messageStream.options;
  for await (const chunk of inputStream) {
    const jsonString = Buffer.from(chunk).toString('utf8')
    console.log(decoder(chunk), '@@@@@stream');
  }

This is what we see after decoding the chunks, on top of the unexpected response structure shown above.

Note: the strange characters do not show up when we use the Python client to stream.

Note: we also used decoder.headerMarshaller.toUtf8(chunk) from the streaming response to decode, and the same strange characters appear. We additionally tried new TextDecoder().

mY7��
     :event-type
:message-typeeventThe g�� @@@@@stream
oYl�֥
     :event-type
:message-typeeventimage o�f @@@@@ream
rY?
    :event-type
:message-typeeventfeatures �qS @@@@@m
kY�wpe
      :event-type
:message-typeeventa .#Hk @@@@@-stream
pY�G��
      :event-type
:message-typeeventsilver D�� @@@@@eam
mY7��
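For context, the `:event-type` and `:message-type` strings in the output above look like header names from the binary event-stream framing that wraps each payload, which would explain the extra bytes around the readable text. The sketch below shows how one such frame is laid out, assuming the standard AWS event-stream encoding (a 12-byte prelude with total length, headers length, and prelude CRC, followed by headers, payload, and a trailing message CRC). `parseFrame` and `buildFrame` are hypothetical helper names, the header values are illustrative, CRCs are not verified, and only string-typed headers are handled.

```javascript
// Sketch only: parses a single event-stream frame without CRC verification.
function parseFrame(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const totalLength = view.getUint32(0);   // DataView reads big-endian by default
  const headersLength = view.getUint32(4);
  // Bytes 8..11 hold the prelude CRC, skipped in this sketch.
  const utf8 = new TextDecoder();
  const headers = {};
  const headersEnd = 12 + headersLength;
  let pos = 12;
  while (pos < headersEnd) {
    const nameLen = view.getUint8(pos); pos += 1;
    const name = utf8.decode(bytes.subarray(pos, pos + nameLen)); pos += nameLen;
    const type = view.getUint8(pos); pos += 1;
    if (type !== 7) throw new Error(`unsupported header type ${type}`); // 7 = string
    const valueLen = view.getUint16(pos); pos += 2;
    headers[name] = utf8.decode(bytes.subarray(pos, pos + valueLen)); pos += valueLen;
  }
  // The last 4 bytes of the frame are the message CRC.
  const payload = bytes.subarray(headersEnd, totalLength - 4);
  return { headers, payload };
}

// Hypothetical helper that builds a demo frame; both CRC fields are left as zeros.
function buildFrame(headers, payloadText) {
  const enc = new TextEncoder();
  const headerBytes = [];
  for (const [name, value] of Object.entries(headers)) {
    const n = enc.encode(name);
    const v = enc.encode(value);
    headerBytes.push(n.length, ...n, 7, v.length >> 8, v.length & 0xff, ...v);
  }
  const payload = enc.encode(payloadText);
  const total = 12 + headerBytes.length + payload.length + 4;
  const frame = new Uint8Array(total);
  const view = new DataView(frame.buffer);
  view.setUint32(0, total);
  view.setUint32(4, headerBytes.length);
  frame.set(headerBytes, 12);
  frame.set(payload, 12 + headerBytes.length);
  return frame;
}

const demoFrame = buildFrame({ ':event-type': 'PayloadPart', ':message-type': 'event' }, 'The ');
const demoParsed = parseFrame(demoFrame);
console.log(demoParsed.headers, new TextDecoder().decode(demoParsed.payload));
```

Decoding the whole frame as UTF-8, as in the loop above, turns the length and CRC bytes into replacement characters, while the header names and the payload text remain readable.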

Links

https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/sagemaker-runtime/command/InvokeEndpointWithResponseStreamCommand/

aBurmeseDev commented 6 months ago

Hi @sthuku - thanks for reaching out.

The output you're getting actually matches the one described in the docs if you scroll down the page to "InvokeEndpointWithResponseStreamCommand Output".

Screenshot 2023-12-11 at 12 26 12 PM

Let me know if you're able to see it.

I'm looking further into the best way to decode the response for you.
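In the meantime, here is a minimal sketch of one way to read the stream, assuming the usual SDK v3 behavior where `Body` is an async iterable whose events carry exactly one of `PayloadPart`, `ModelStreamError`, or `InternalStreamFailure`. `collectStreamText` is a hypothetical helper name, and `fakeBody` is a stand-in for `response.Body` so the sketch runs without an endpoint.

```javascript
// Sketch: consume the Body of an InvokeEndpointWithResponseStream response.
async function collectStreamText(body) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for await (const event of body) {
    if (event.PayloadPart) {
      // stream: true keeps multi-byte characters intact when they span two chunks.
      text += decoder.decode(event.PayloadPart.Bytes, { stream: true });
    } else if (event.ModelStreamError) {
      const { ErrorCode, Message } = event.ModelStreamError;
      throw new Error(`ModelStreamError ${ErrorCode}: ${Message}`);
    } else if (event.InternalStreamFailure) {
      throw new Error(`InternalStreamFailure: ${event.InternalStreamFailure.Message}`);
    }
  }
  text += decoder.decode(); // flush any bytes still buffered in the decoder
  return text;
}

// Stand-in for response.Body: two chunks that split the two-byte character "é".
async function* fakeBody() {
  const bytes = new TextEncoder().encode('{"outputs":["héllo"]}');
  yield { PayloadPart: { Bytes: bytes.slice(0, 15) } };
  yield { PayloadPart: { Bytes: bytes.slice(15) } };
}

collectStreamText(fakeBody()).then((text) => console.log(text));
```

With the real client, this would be called as `collectStreamText(response.Body)` on the value returned by `client.send(command)`, rather than reaching into `Body.options.messageStream`.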

sthuku commented 6 months ago

Screenshot 2023-12-11 at 1 48 20 PM

The comment in the example is different, as shown in the screenshot. Could you please update the example to show how to handle the MediaStream to read chunks? We're seeing strange characters in the UTF-8-decoded chunks produced by our handling logic (see the original question), and we're wondering whether we're doing something incorrectly to cause them. As I said in the original question, the strange characters don't appear when we handle the response with the Python logic below, which makes us think the issue is on the JS SageMaker client side.

import json
import boto3

# Assumed setup, not shown in the original snippet: endpoint_name and the LineIterator helper are defined elsewhere.
smr = boto3.client("sagemaker-runtime")
body = {"inputs": "what is life", "parameters": {"max_new_tokens":400}}
resp = smr.invoke_endpoint_with_response_stream(EndpointName=endpoint_name, Body=json.dumps(body), ContentType="application/json")
event_stream = resp['Body']

for line in LineIterator(event_stream):
    resp = json.loads(line)
    print(resp.get("outputs")[0], end='')
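The `LineIterator` helper used in the Python snippet is not shown; presumably it buffers the streamed bytes and yields one complete line at a time. A rough JavaScript counterpart is sketched below, assuming the endpoint emits newline-delimited JSON with an `outputs` array as in the Python code. `iterateLines`, `collectOutputs`, and the sample data in `fakeLineBody` are hypothetical.

```javascript
// Sketch: yield complete newline-terminated lines from a stream of PayloadPart events.
async function* iterateLines(body) {
  const decoder = new TextDecoder('utf-8');
  let buffer = '';
  for await (const event of body) {
    if (!event.PayloadPart) continue; // error events are ignored in this sketch
    buffer += decoder.decode(event.PayloadPart.Bytes, { stream: true });
    let newline;
    while ((newline = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      if (line.trim() !== '') yield line;
    }
  }
  buffer += decoder.decode(); // flush the decoder
  if (buffer.trim() !== '') yield buffer; // trailing line without a newline
}

// Stand-in for response.Body: the second JSON line is split across two chunks.
async function* fakeLineBody() {
  const enc = new TextEncoder();
  yield { PayloadPart: { Bytes: enc.encode('{"outputs":["what "]}\n{"outp') } };
  yield { PayloadPart: { Bytes: enc.encode('uts":["is life"]}\n') } };
}

// Mirrors the Python loop: parse each line and concatenate outputs[0].
async function collectOutputs(body) {
  let out = '';
  for await (const line of iterateLines(body)) {
    out += JSON.parse(line).outputs[0];
  }
  return out;
}

collectOutputs(fakeLineBody()).then((out) => console.log(out));
```

Buffering matters because a chunk boundary does not have to coincide with a line boundary, so parsing each chunk as JSON on its own can fail even when the bytes are decoded correctly.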