aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

Unable to get output while invoking sagemaker endpoint from .net SDK #298

Closed Harathi123 closed 6 years ago

Harathi123 commented 6 years ago

Hi,

We are trying to access the SageMaker endpoint using the .NET SDK. We are using the following DLL: AWSSDK.SageMakerRuntime

We are using the MXNet_gluon_mnist example from the SageMaker Python SDK examples (mnist_mxnet_with_gluon.ipynb). We are sending a byte array as input to the predictor from .NET Core. The content type is 'Application/octet-stream'.

I changed the transform_fn so that it accepts a byte array and converts it into a NumPy array. When I invoke the endpoint from SageMaker with a byte array as input, I get output.

But when we invoke the same endpoint from the .NET SDK, we get the following error: "No JSON object could be decoded"

Full Traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/container_support/serving.py", line 180, in _invoke
    self.transformer.transform(content, input_content_type, requested_output_content_type)
  File "/usr/local/lib/python2.7/dist-packages/mxnet_container/serve/transformer.py", line 62, in transform
    return self.transform_fn(self.model, input_data, content_type, accept)
  File "/opt/ml/code/mnist.py", line 189, in transform_fn
    parsed = json.loads(data)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")

From the logs, my understanding is that the endpoint is unable to return output to the .NET SDK. When I debugged the transform_fn, it ran through to the final json.dumps(), then the container did a ping check and ran transform_fn again, and then it showed the above error.

Any suggestions would be helpful.

Thanks, Harathi

nadiaya commented 6 years ago

@Harathi123,

According to the traceback, the "No JSON object could be decoded" error is coming from the user script.

The original transform_fn did this: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_gluon_mnist/mnist.py#L177-L179

You mentioned that you modified the transform_fn function. Can you provide the modified user script you used? The full code, including the SDK calls, would be even more helpful.

Harathi123 commented 6 years ago

Hi @nadiaya

Thanks for the reply. Please find the transform_fn below:

def transform_fn(net, data, input_content_type, output_content_type):
    """
    Transform a request using the Gluon model. Called once per request.

    :param net: The Gluon model.
    :param data: The request payload.
    :param input_content_type: The request content type.
    :param output_content_type: The (desired) response content type.
    :return: response payload and content type.
    """
    # we can use content types to vary input/output handling, but
    # here we just assume json for both
    print('1')
    parsed = json.loads(data)
    print('2')
    print(type(parsed))
    b_arr = bytearray(parsed)
    print('3')
    arr = cv2.imdecode(np.frombuffer(b_arr, dtype=np.uint8), 0)
    print('4')
    print(arr.shape)
    thresh, thresh_img = cv2.threshold(arr,127,255,cv2.THRESH_BINARY)
    print('5')
    img = 1-(thresh_img/255)
    print('6')
    img = np.expand_dims(img, axis = 0)
    print('7')
    nda = mx.nd.array(img)
    print('8')
    output = net(nda)
    print('9')
    prediction = mx.nd.argmax(output, axis=1)
    print(prediction)
    response_body = json.dumps(prediction.asnumpy().tolist()[0])
    return response_body, output_content_type

I just added the print statements for debugging.

Thanks, Harathi

Harathi123 commented 6 years ago

@nadiaya, please also find the screenshots of the logs. From the logs, it appears that transform_fn runs to the end but is unable to return its output. The container then does a ping check again and runs the transform function from the beginning.

[Two screenshots of the endpoint logs were attached here.]

nadiaya commented 6 years ago

@Harathi123

From the description, it looks like you are passing the content type 'Application/octet-stream', but the transform function expects JSON.

Would it be possible to see the code that invokes the endpoint? And the data that is passed to it as well?
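To illustrate the mismatch described above, here is a minimal sketch of how a transform function could pick its decoding path from the request content type instead of always calling json.loads. The helper name `decode_payload` is hypothetical and is not part of the SageMaker MXNet container; it assumes the JSON form of the request is an array of byte values, as in the modified transform_fn posted earlier.

```python
import json


def decode_payload(data, content_type):
    """Return the encoded image bytes carried by a request body.

    Hypothetical helper for illustration only; it is not a SageMaker API.
    """
    # Normalize the header: clients may send 'Application/JSON' or add
    # parameters such as '; charset=utf-8'.
    kind = (content_type or '').split(';')[0].strip().lower()

    if kind == 'application/octet-stream':
        # The body already is the encoded image (for example PNG file bytes).
        return bytearray(data)

    if kind == 'application/json':
        # The body is a JSON array of byte values: "[137, 80, 78, 71, ...]".
        if isinstance(data, (bytes, bytearray)):
            data = bytes(data).decode('utf-8')
        return bytearray(json.loads(data))

    raise ValueError('Unsupported content type: {}'.format(content_type))
```

Inside transform_fn, the returned bytearray could then be passed to cv2.imdecode in the same way as b_arr is in the posted script.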

Harathi123 commented 6 years ago

@nadiaya, thanks for replying

Yes, we tried both content types, 'Application/octet-stream' and 'Application/JSON'. We are trying to invoke the endpoint from the .NET SDK. The code used to invoke the endpoint is as follows:

byte[] content = System.IO.File.ReadAllBytes("E:\test.png");
Amazon.SageMakerRuntime.Model.InvokeEndpointRequest request = new Amazon.SageMakerRuntime.Model.InvokeEndpointRequest();
request.EndpointName = "endpoint name";
request.ContentType = "application/octet-stream";
request.Body = new MemoryStream(content);
AmazonSageMakerRuntimeClient awsClient = new AmazonSageMakerRuntimeClient("access-key", "secret-key", region);
Amazon.SageMakerRuntime.Model.InvokeEndpointResponse response = await awsClient.InvokeEndpointAsync(request);
string predictions = Encoding.UTF8.GetString(response.Body.ToArray());

We also tried 'Application/JSON'. From the logs, it seems the input was loaded into the transform function, but we get an error while returning the output.

Thanks, Harathi

nadiaya commented 6 years ago

Could you print the data you get in the transform_fn to see what format it is in?

def transform_fn(net, data, input_content_type, output_content_type):
    """
    Transform a request using the Gluon model. Called once per request.

    :param net: The Gluon model.
    :param data: The request payload.
    :param input_content_type: The request content type.
    :param output_content_type: The (desired) response content type.
    :return: response payload and content type.
    """
    # we can use content types to vary input/output handling, but
    # here we just assume json for both
    print('1')
    print('data: {}'.format(data))
    print('input_content_type: {}'.format(input_content_type))
    print('output_content_type: {}'.format(output_content_type))
    parsed = json.loads(data)
    print('2')
    # the rest of the transform_fn function

And post what it looks like here?

Harathi123 commented 6 years ago

Sorry @nadiaya, the person who is trying to invoke it from the .NET SDK is working from India. But when I invoke it from SageMaker, I get output.

Harathi123 commented 6 years ago

We are sending input as follows:

Input = list(b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00……………………………') # bytearray of image

The input looks like this:

[137, 80, 78, 71, 13, 10, 26, 10,……………………………]
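As a small sketch of why this input works with the posted transform_fn while raw file bytes do not: in Python 3, calling list() on a bytes object yields integers, and the JSON text of that list can be turned back into the same bytes with bytearray(json.loads(...)). Only the 8-byte PNG signature is used here as a stand-in for a full image.

```python
import json

# The PNG signature (first 8 bytes of any PNG file), standing in for an image.
image_bytes = b'\x89PNG\r\n\x1a\n'

# Client side (Python 3): bytes -> list of ints -> JSON text.
payload = json.dumps(list(image_bytes))
print(payload)  # [137, 80, 78, 71, 13, 10, 26, 10]

# Server side, mirroring the transform_fn above: JSON text -> bytearray.
restored = bytearray(json.loads(payload))

# Raw image bytes are not JSON, so the same json.loads call fails on them.
try:
    json.loads(image_bytes.decode('latin-1'))
    raw_is_json = True
except ValueError:
    raw_is_json = False
```

Here restored equals the original bytes and raw_is_json ends up False, which matches the "No JSON object could be decoded" error seen when the unserialized file contents reach json.loads.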

Thanks, Harathi

nadiaya commented 6 years ago

"But when i tried to invoke it from sagemaker, i am getting output."

Can you clarify what you meant? Are you able to run predictions using the same endpoint?

Harathi123 commented 6 years ago

Yes @nadiaya, I am able to run predictions against the endpoint using the SageMaker 'predict' method. But when we try to access the same endpoint from the .NET SDK, we get that error.

nadiaya commented 6 years ago

So, it seems the problem is that the content is not actually serialized into JSON when you send it using the .NET SDK. You have to do that explicitly when using AWSSDK.SageMakerRuntime.

PythonSDK for example handles this serialization for you: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py#L59-L94
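For comparison, here is a hedged sketch of what the explicit serialization looks like with a low-level client, using Python and boto3 rather than .NET. The helper names `build_invoke_kwargs` and `invoke` are hypothetical. The body format (a JSON array of byte values) is an assumption taken from the modified transform_fn in this thread, not a general SageMaker requirement.

```python
import json


def build_invoke_kwargs(endpoint_name, image_bytes):
    """Build keyword arguments for a low-level InvokeEndpoint call.

    Hypothetical helper: it does by hand the JSON serialization that the
    SageMaker Python SDK predictor otherwise performs for the caller.
    """
    # Encode the image as a JSON array of byte values, then as UTF-8 bytes.
    body = json.dumps(list(bytearray(image_bytes))).encode('utf-8')
    return {
        'EndpointName': endpoint_name,
        'ContentType': 'application/json',
        'Accept': 'application/json',
        'Body': body,
    }


def invoke(endpoint_name, image_bytes):
    # boto3 is imported lazily so the helper above works without it installed.
    import boto3

    client = boto3.client('sagemaker-runtime')
    response = client.invoke_endpoint(
        **build_invoke_kwargs(endpoint_name, image_bytes))
    # The response body is a stream; this transform_fn returns JSON text.
    return json.loads(response['Body'].read())
```

A client in another SDK would need to build the same kind of body itself: a JSON document as the request body, with the content type set to application/json.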

Harathi123 commented 6 years ago

Yes, that seems to be the problem. It means that with the .NET SDK, we need to JSON-serialize the data before passing it as input. Am I right?

Thanks, Harathi

nadiaya commented 6 years ago

Yes, this is correct. The SageMaker Python SDK is a wrapper on top of the regular Python AWS SDK (boto3) that provides additional functionality to make users' lives easier.

Harathi123 commented 6 years ago

OK @nadiaya, thanks for your valuable time and suggestions.

nadiaya commented 6 years ago

I am going to close this issue. Please feel free to reopen it if you have any further questions or problems.