aws-samples / amazon-a2i-sample-jupyter-notebooks

Sample Jupyter Notebooks for Amazon Augmented AI (A2I)
https://aws.amazon.com/augmented-ai/
Apache License 2.0

Unable to run Amazon Augmented AI (A2I) and SageMaker Endpoint locally #5

Closed papagala closed 3 years ago

papagala commented 4 years ago

It's a bit unclear how to run this locally, but I managed to get past the "credentials not found" errors by giving sagemaker.Session a boto3.Session created with one of my "profile_name" values.

However, I'm getting this error every time I try to make a prediction. Specifically, when I run

object_detector = sagemaker.predictor.RealTimePredictor(endpoint=endpoint_name,sagemaker_session=sess)
with open(test_photos[2], 'rb') as image:
    f = image.read()
    b = bytearray(f)
results = object_detector.predict(b)

I get

---------------------------------------------------------------------------
BrokenPipeError                           Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

~/anaconda3/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1228         """Send a complete request to the server."""
-> 1229         self._send_request(method, url, body, headers, encode_chunked)
   1230 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_request(self, method, url, body, headers, *args, **kwargs)
     91         rval = super(AWSConnection, self)._send_request(
---> 92             method, url, body, headers, *args, **kwargs)
     93         self._expect_header_set = False

~/anaconda3/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1274             body = _encode(body, 'body')
-> 1275         self.endheaders(body, encode_chunked=encode_chunked)
   1276 

~/anaconda3/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1223             raise CannotSendHeader()
-> 1224         self._send_output(message_body, encode_chunked=encode_chunked)
   1225 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_output(self, message_body, *args, **kwargs)
    142             # we must run the risk of Nagle.
--> 143             self.send(message_body)
    144 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in send(self, str)
    202             return
--> 203         return super(AWSConnection, self).send(str)
    204 

~/anaconda3/lib/python3.7/http/client.py in send(self, data)
    976         try:
--> 977             self.sock.sendall(data)
    978         except TypeError:

~/anaconda3/lib/python3.7/ssl.py in sendall(self, data, flags)
   1014                 while count < amount:
-> 1015                     v = self.send(byte_view[count:])
   1016                     count += v

~/anaconda3/lib/python3.7/ssl.py in send(self, data, flags)
    983                     self.__class__)
--> 984             return self._sslobj.write(data)
    985         else:

BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

ProtocolError                             Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/botocore/httpsession.py in send(self, request)
    262                 decode_content=False,
--> 263                 chunked=self._chunked(request.headers),
    264             )

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    637             retries = retries.increment(method, url, error=e, _pool=self,
--> 638                                         _stacktrace=sys.exc_info()[2])
    639             retries.sleep()

~/anaconda3/lib/python3.7/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    343             # Disabled, indicate to re-raise the error.
--> 344             raise six.reraise(type(error), error, _stacktrace)
    345 

~/anaconda3/lib/python3.7/site-packages/urllib3/packages/six.py in reraise(tp, value, tb)
    684         if value.__traceback__ is not tb:
--> 685             raise value.with_traceback(tb)
    686         raise value

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

~/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

~/anaconda3/lib/python3.7/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1228         """Send a complete request to the server."""
-> 1229         self._send_request(method, url, body, headers, encode_chunked)
   1230 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_request(self, method, url, body, headers, *args, **kwargs)
     91         rval = super(AWSConnection, self)._send_request(
---> 92             method, url, body, headers, *args, **kwargs)
     93         self._expect_header_set = False

~/anaconda3/lib/python3.7/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1274             body = _encode(body, 'body')
-> 1275         self.endheaders(body, encode_chunked=encode_chunked)
   1276 

~/anaconda3/lib/python3.7/http/client.py in endheaders(self, message_body, encode_chunked)
   1223             raise CannotSendHeader()
-> 1224         self._send_output(message_body, encode_chunked=encode_chunked)
   1225 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in _send_output(self, message_body, *args, **kwargs)
    142             # we must run the risk of Nagle.
--> 143             self.send(message_body)
    144 

~/anaconda3/lib/python3.7/site-packages/botocore/awsrequest.py in send(self, str)
    202             return
--> 203         return super(AWSConnection, self).send(str)
    204 

~/anaconda3/lib/python3.7/http/client.py in send(self, data)
    976         try:
--> 977             self.sock.sendall(data)
    978         except TypeError:

~/anaconda3/lib/python3.7/ssl.py in sendall(self, data, flags)
   1014                 while count < amount:
-> 1015                     v = self.send(byte_view[count:])
   1016                     count += v

~/anaconda3/lib/python3.7/ssl.py in send(self, data, flags)
    983                     self.__class__)
--> 984             return self._sslobj.write(data)
    985         else:

ProtocolError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))

During handling of the above exception, another exception occurred:

ConnectionClosedError                     Traceback (most recent call last)
<ipython-input-13-ea07fcaa5d16> in <module>
      3     f = image.read()
      4     b = bytearray(f)
----> 5 results = object_detector.predict(b,)

~/anaconda3/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108 
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112 

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315             # The "self" in this scope is referring to the BaseClient.
--> 316             return self._make_api_call(operation_name, kwargs)
    317 
    318         _api_call.__name__ = str(py_operation_name)

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    620         else:
    621             http, parsed_response = self._make_request(
--> 622                 operation_model, request_dict, request_context)
    623 
    624         self.meta.events.emit(

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_request(self, operation_model, request_dict, request_context)
    639     def _make_request(self, operation_model, request_dict, request_context):
    640         try:
--> 641             return self._endpoint.make_request(operation_model, request_dict)
    642         except Exception as e:
    643             self.meta.events.emit(

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in make_request(self, operation_model, request_dict)
    100         logger.debug("Making request for %s with params: %s",
    101                      operation_model, request_dict)
--> 102         return self._send_request(request_dict, operation_model)
    103 
    104     def create_request(self, params, operation_model=None):

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _send_request(self, request_dict, operation_model)
    135             request, operation_model, context)
    136         while self._needs_retry(attempts, operation_model, request_dict,
--> 137                                 success_response, exception):
    138             attempts += 1
    139             # If there is a stream associated with the request, we need

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _needs_retry(self, attempts, operation_model, request_dict, response, caught_exception)
    254             event_name, response=response, endpoint=self,
    255             operation=operation_model, attempts=attempts,
--> 256             caught_exception=caught_exception, request_dict=request_dict)
    257         handler_response = first_non_none_response(responses)
    258         if handler_response is None:

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    354     def emit(self, event_name, **kwargs):
    355         aliased_event_name = self._alias_event_name(event_name)
--> 356         return self._emitter.emit(aliased_event_name, **kwargs)
    357 
    358     def emit_until_response(self, event_name, **kwargs):

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
    226                  handlers.
    227         """
--> 228         return self._emit(event_name, kwargs)
    229 
    230     def emit_until_response(self, event_name, **kwargs):

~/anaconda3/lib/python3.7/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
    209         for handler in handlers_to_call:
    210             logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211             response = handler(**kwargs)
    212             responses.append((handler, response))
    213             if stop_on_response and response is not None:

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempts, response, caught_exception, **kwargs)
    181 
    182         """
--> 183         if self._checker(attempts, response, caught_exception):
    184             result = self._action(attempts=attempts)
    185             logger.debug("Retry needed, action of: %s", result)

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    249     def __call__(self, attempt_number, response, caught_exception):
    250         should_retry = self._should_retry(attempt_number, response,
--> 251                                           caught_exception)
    252         if should_retry:
    253             if attempt_number >= self._max_attempts:

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in _should_retry(self, attempt_number, response, caught_exception)
    275             # If we've exceeded the max attempts we just let the exception
    276             # propogate if one has occurred.
--> 277             return self._checker(attempt_number, response, caught_exception)
    278 
    279 

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    315         for checker in self._checkers:
    316             checker_response = checker(attempt_number, response,
--> 317                                        caught_exception)
    318             if checker_response:
    319                 return checker_response

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    221         elif caught_exception is not None:
    222             return self._check_caught_exception(
--> 223                 attempt_number, caught_exception)
    224         else:
    225             raise ValueError("Both response and caught_exception are None.")

~/anaconda3/lib/python3.7/site-packages/botocore/retryhandler.py in _check_caught_exception(self, attempt_number, caught_exception)
    357         # the MaxAttemptsDecorator is not interested in retrying the exception
    358         # then this exception just propogates out past the retry code.
--> 359         raise caught_exception

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _do_get_response(self, request, operation_model)
    198             http_response = first_non_none_response(responses)
    199             if http_response is None:
--> 200                 http_response = self._send(request)
    201         except HTTPClientError as e:
    202             return (None, e)

~/anaconda3/lib/python3.7/site-packages/botocore/endpoint.py in _send(self, request)
    267 
    268     def _send(self, request):
--> 269         return self.http_session.send(request)
    270 
    271 

~/anaconda3/lib/python3.7/site-packages/botocore/httpsession.py in send(self, request)
    292                 error=e,
    293                 request=request,
--> 294                 endpoint_url=request.url
    295             )
    296         except Exception as e:

ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57/invocations".

If I run results = object_detector.predict("dummy text")

I get something I can understand and expect:

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-22-dd7ea88a4b12> in <module>
      3     f = image.read()
      4     b = bytearray(f)
----> 5 results = object_detector.predict("dummy tex")

~/anaconda3/lib/python3.7/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108 
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112 

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315             # The "self" in this scope is referring to the BaseClient.
--> 316             return self._make_api_call(operation_name, kwargs)
    317 
    318         _api_call.__name__ = str(py_operation_name)

~/anaconda3/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    633             error_code = parsed_response.get("Error", {}).get("Code")
    634             error_class = self.exceptions.from_code(error_code)
--> 635             raise error_class(parsed_response, operation_name)
    636         else:
    637             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "unable to evaluate payload provided". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57 in account 488507749156 for more information.

Name: boto3 Version: 1.13.12

Name: sagemaker Version: 1.58.2

Name: botocore Version: 1.16.12

Python 3.7.3

papagala commented 4 years ago

I think I found the solution, but I still think it's a bug. The files seem to be too large. I'm using smaller pictures and it's working fine.

The error message you get back says nothing about the file size, which is confusing.

papagala commented 4 years ago

I compressed all images and the problem went away. Attached images: pexels-photo-276517, pexels-photo-980382, pexels-photo-1571457.

papagala commented 4 years ago

OK, I discovered something else. It looks like if you rerun the notebook, the images double or triple in size (they become n times larger, where n is the number of times you have run the notebook).

This fixes that problem:

for ind in test_photos_index:
    !rm sample-a2i-images/pexels-photo-{ind}.jpeg

Maybe with some bash magic we could check whether the file already exists and NOT redownload it, but this is fine for me.
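The "skip the download if the file is already there" idea could also be done in plain Python instead of bash. The following is only a sketch: `download_if_missing` is a hypothetical helper, not something from the notebook, and the `fetch` parameter exists only so the download step can be swapped out (for example in a test). The URL and destination path are whatever the notebook currently passes to curl.

```python
import os
import urllib.request


def download_if_missing(url, dest_path, fetch=urllib.request.urlretrieve):
    """Download url to dest_path unless a non-empty file is already there.

    Returns True if a download happened, False if it was skipped.
    Hypothetical helper for illustration; not part of the sample notebook.
    """
    # A non-empty existing file is treated as "already downloaded".
    if os.path.isfile(dest_path) and os.path.getsize(dest_path) > 0:
        return False

    # Create the target directory if the path has one.
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)

    # urlretrieve writes a fresh file, so reruns cannot grow an existing one.
    fetch(url, dest_path)
    return True
```

Because the function never opens the destination in append mode, rerunning the cell should leave the image at its original size rather than making it n times larger.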

michaelhsieh42 commented 4 years ago

Hello @papagala, thanks for your feedback. Indeed the SageMaker error

ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/DEMO-object-detection-augmented-ai-2020-05-19-00-18-57/invocations".

could stem from the image payload sent to the endpoint exceeding the 5 MB limit. The original files are around 2-3 MB. Thanks for reporting that the curl command appends to the existing image on each run and grows the file size. It's something we can address easily.
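Since the connection error itself does not mention the payload size, a local check before calling predict can make the cause obvious. This is a minimal sketch, assuming the 5 MB figure quoted above; `MAX_PAYLOAD_BYTES` and `read_payload` are hypothetical names, and the limit that actually applies to your endpoint should be confirmed in the SageMaker documentation.

```python
import os

# Assumption: the 5 MB limit mentioned in this thread. Verify the current
# value for your endpoint type before relying on it.
MAX_PAYLOAD_BYTES = 5 * 1024 * 1024


def read_payload(path, limit=MAX_PAYLOAD_BYTES):
    """Return the file contents as bytes, or raise if the file exceeds limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, which exceeds the {limit}-byte payload "
            "limit; resize or compress the image before sending it."
        )
    with open(path, "rb") as f:
        return f.read()
```

In the snippet from the original report, the manual open/read could then be replaced with something like `object_detector.predict(read_payload(test_photos[2]))`, so an oversized image fails with a readable message instead of a broken pipe.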

We will also improve the description of the environment needed to run the notebooks. Thanks again for your feedback.
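For anyone running the notebook outside SageMaker in the meantime, the workaround described at the top of this issue (a boto3.Session built from a named profile, handed to sagemaker.Session) could look roughly like the sketch below. `make_local_session` is a hypothetical helper, and the profile and region are placeholders you would replace with values from your own AWS configuration.

```python
def make_local_session(profile_name, region_name=None):
    """Build a sagemaker.Session backed by a named AWS CLI profile.

    Hypothetical helper for illustration; not part of the sample notebook.
    """
    # Imported lazily so this sketch can be loaded without the AWS SDKs
    # installed; in a notebook these imports would normally sit at the top.
    import boto3
    import sagemaker

    # The profile must exist in ~/.aws/credentials or ~/.aws/config.
    boto_session = boto3.Session(
        profile_name=profile_name, region_name=region_name
    )
    return sagemaker.Session(boto_session=boto_session)
```

The returned object would then be passed as `sagemaker_session=sess` when creating the predictor, as in the snippet from the original report.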

papagala commented 4 years ago

Thanks a lot!

samuel-henry commented 3 years ago

Closing based on @michaelhsieh42's merge.

papagala commented 3 years ago

Thank you all!