aws / sagemaker-tensorflow-serving-container

A TensorFlow Serving solution for use in SageMaker. This repo is now deprecated.
Apache License 2.0

Batch Transform function starts sending image inference requests before model is actually loaded #189

Closed · tbiker closed this issue 3 years ago

tbiker commented 3 years ago

**Describe the bug**
When I use the batch transform method to invoke a custom TensorFlow model (model.tar.gz), it appears that the image inference requests start before the actual model file (a 180 MB file) has been loaded.

**Expected behavior**
I expect the model to load first and then the batch processing of images to proceed.

**System information**

I have a TensorFlow 2.4.1 model in S3, packaged in the expected model.tar.gz format. The model.tar.gz includes an inference.py that prepares the image payload for the model via input_handler. A requirements.txt file specifies that the json5 package should be installed. All of this appears to start up correctly.
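For reference, the archive layout matches what the container logs below show: the SavedModel sits under export/Servo/1 and the handler code under code/. A minimal packaging sketch (the local directory names export/ and code/ here are placeholders for illustration):

import tarfile

# export/Servo/1/ holds saved_model.pb and variables/
# code/ holds inference.py and requirements.txt
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('export', arcname='export')
    tar.add('code', arcname='code')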

In my Python code, I use:

import numpy as np
import os
import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow import TensorFlowModel

sagemaker_session = sagemaker.Session()
try:
    role = get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='Sagemaker')['Role']['Arn']

region = sagemaker_session.boto_region_name
bucket = sagemaker_session.default_bucket()
prefix = 'cvmodel'
print('Region: {}'.format(region))
print('S3 URI: s3://{}/{}'.format(bucket, prefix))
print('Role:   {}'.format(role))

model = TensorFlowModel(model_data='s3://cvmodel/model.tar.gz',
                        role=role,
                        entry_point='inference.py',
                        framework_version="2.4.1")
transformer = model.transformer(instance_count=1,
                                instance_type='ml.m4.xlarge',
                                max_concurrent_transforms=1,
                                max_payload=1,
                                output_path='s3://cvmodel/results')
transformer.transform('s3://cvmodel/images', content_type='application/x-image')
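The transform call returns as soon as the job is submitted; to block in the notebook until the job finishes and stream its logs, transformer.wait() can be added afterwards (a usage note, not part of my original code):

transformer.wait()  # blocks until the batch transform job completes and streams its job logs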

**The inference.py input_handler looks like this:**

import base64
import io
import json
import requests

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data stream
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """

    print('input handler called')
    if context.request_content_type == 'application/x-image':
        payload = data.read()
        encoded_image = base64.b64encode(payload).decode('utf-8')
        instance = [{"b64": encoded_image}]
        print('json image produced')
        return json.dumps({"instances": instance})
    else:
        _return_error(415, 'Unsupported content type "{}"'.format(context.request_content_type or 'Unknown'))
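(_return_error and the matching output_handler are not shown above; they follow the standard sample handlers from the SageMaker TensorFlow Serving documentation, roughly like this sketch:)

def _return_error(code, message):
    # Raising here makes the serving stack return an error response to the client.
    raise ValueError('Error: {}, {}'.format(str(code), message))

def output_handler(data, context):
    """Post-process TensorFlow Serving output before it is returned to the client."""
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))
    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type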

This is the log I receive after calling the transform function and waiting about 4 minutes. What is the problem here?

.........................INFO:__main__:starting services
INFO:tfs_utils:using default model name: Servo
INFO:tfs_utils:tensorflow serving model config: 
model_config_list: {
  config: {
    name: "Servo",
    base_path: "/opt/ml/model/export/Servo",
    model_platform: "tensorflow"
  }
}

INFO:__main__:using default model name: Servo
INFO:__main__:tensorflow serving model config: 
model_config_list: {
  config: {
    name: "Servo",
    base_path: "/opt/ml/model/export/Servo",
    model_platform: "tensorflow"
  }
}

INFO:__main__:tensorflow version info:
2021-03-14 23:56:24.251936: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
2021-03-14 23:56:24.252074: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped.
TensorFlow ModelServer: 2.4.0-rc4+dev.sha.no_git
TensorFlow Library: 2.4.1
INFO:__main__:tensorflow serving command: tensorflow_model_server --port=10000 --rest_api_port=10001 --model_config_file=/sagemaker/model-config.cfg --max_num_load_retries=0 
INFO:__main__:started tensorflow serving (pid: 12)
INFO:__main__:nginx config: 
load_module modules/ngx_http_js_module.so;

worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log  /dev/stderr error;

worker_rlimit_nofile 4096;

events {
  worker_connections 2048;
}

http {
  include /etc/nginx/mime.types;
  default_type application/json;
  access_log /dev/stdout combined;
  js_include tensorflow-serving.js;

  upstream tfs_upstream {
    server localhost:10001;
  }

  upstream gunicorn_upstream {
    server unix:/tmp/gunicorn.sock fail_timeout=1;
  }

  server {
    listen 8080 deferred;
    client_max_body_size 0;
    client_body_buffer_size 100m;
    subrequest_output_buffer_size 100m;

    set $tfs_version 2.4;
    set $default_tfs_model Servo;

    location /tfs {
        rewrite ^/tfs/(.*) /$1  break;
        proxy_redirect off;
        proxy_pass_request_headers off;
        proxy_set_header Content-Type 'application/json';
        proxy_set_header Accept 'application/json';
        proxy_pass http://tfs_upstream;
    }

    location /ping {
        proxy_pass http://gunicorn_upstream/ping;
    }

    location /invocations {
        proxy_pass http://gunicorn_upstream/invocations;
    }

    location /models {
        proxy_pass http://gunicorn_upstream/models;
    }

    location / {
        return 404 '{"error": "Not Found"}';
    }

    keepalive_timeout 3;
  }
}

INFO:__main__:installing packages from requirements.txt...
2021-03-14 23:56:24.637383: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:460] Initializing the SageMaker Profiler.
2021-03-14 23:56:24.637508: W external/org_tensorflow/tensorflow/core/profiler/internal/smprofiler_timeline.cc:105] SageMaker Profiler is not enabled. The timeline writer thread will not be started, future recorded events will be dropped.
2021-03-14 23:56:24.640239: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
2021-03-14 23:56:24.640284: I tensorflow_serving/model_servers/server_core.cc:587]  (Re-)adding model: Servo
2021-03-14 23:56:24.740563: I tensorflow_serving/util/retrier.cc:46] Retrying of Reserving resources for servable: {name: Servo version: 1} exhausted max_num_retries: 0
2021-03-14 23:56:24.740603: I tensorflow_serving/core/basic_manager.cc:740] Successfully reserved resources to load servable {name: Servo version: 1}
2021-03-14 23:56:24.740626: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: Servo version: 1}
2021-03-14 23:56:24.740645: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: Servo version: 1}
2021-03-14 23:56:24.740723: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:32] Reading SavedModel from: /opt/ml/model/export/Servo/1
2021-03-14 23:56:24.888950: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:55] Reading meta graph with tags { serve }
2021-03-14 23:56:24.889008: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:93] Reading SavedModel debug info (if present) from: /opt/ml/model/export/Servo/1
2021-03-14 23:56:24.889602: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Collecting json5
  Downloading json5-0.9.5-py2.py3-none-any.whl (17 kB)
2021-03-14 23:56:25.373645: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:206] Restoring SavedModel bundle.
Installing collected packages: json5
Successfully installed json5-0.9.5
2021-03-14 23:56:25.435171: I external/org_tensorflow/tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2300050000 Hz
INFO:__main__:gunicorn command: gunicorn -b unix:/tmp/gunicorn.sock -k gevent --chdir /sagemaker --pythonpath /opt/ml/model/code -e TFS_GRPC_PORT=10000 -e SAGEMAKER_MULTI_MODEL=False -e SAGEMAKER_SAFE_PORT_RANGE=10000-10999 python_service:app
INFO:__main__:gunicorn version info:
gunicorn (version 20.0.4)
INFO:__main__:started gunicorn (pid: 47)
[2021-03-14 23:56:26 +0000] [47] [INFO] Starting gunicorn 20.0.4
[2021-03-14 23:56:26 +0000] [47] [INFO] Listening at: unix:/tmp/gunicorn.sock (47)
INFO:__main__:gunicorn server is ready!
[2021-03-14 23:56:26 +0000] [47] [INFO] Using worker: gevent
[2021-03-14 23:56:26 +0000] [51] [INFO] Booting worker with pid: 51
INFO:__main__:nginx version info:
nginx version: nginx/1.18.0
built by gcc 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 
built with OpenSSL 1.1.1  11 Sep 2018
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -fdebug-prefix-map=/data/builder/debuild/nginx-1.18.0/debian/debuild-base/nginx-1.18.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie'
INFO:__main__:started nginx (pid: 52)
169.254.255.130 - - [14/Mar/2021:23:56:26 +0000] "GET /ping HTTP/1.1" 200 0 "-" "Go-http-client/1.1"
169.254.255.130 - - [14/Mar/2021:23:56:26 +0000] "GET /execution-parameters HTTP/1.1" 404 22 "-" "Go-http-client/1.1"
INFO:python_service:http://gunicorn_upstream/invocations
INFO:tfs_utils:sagemaker tfs attributes: 
{}
input handler called
json image produced
ERROR:python_service:exception handling request: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe17b5a810>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
  File "/usr/local/lib/python3.7/site-packages/gevent/_socketcommon.py", line 607, in connect
    raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7efe17b5a810>: Failed to establish a new connection: [Errno 111] Connection refused

169.254.255.130 - - [14/Mar/2021:23:56:26 +0000] "POST /invocations HTTP/1.1" 500 283 "-" "Go-http-client/1.1"
INFO:python_service:http://gunicorn_upstream/invocations
input handler called # This is my print statement in my input_handler
INFO:tfs_utils:sagemaker tfs attributes: 
{}
json image produced # This is my print statement in my input_handler
ERROR:python_service:exception handling request: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e3b790>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
  File "/usr/local/lib/python3.7/site-packages/gevent/_socketcommon.py", line 607, in connect
    raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7efe16e3b790>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 573, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e3b790>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/sagemaker/python_service.py", line 292, in _handle_invocation_post
    res.body, res.content_type = self._handlers(data, context)
  File "/sagemaker/python_service.py", line 325, in handler
    response = requests.post(context.rest_uri, data=processed_input)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 119, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e3b790>: Failed to establish a new connection: [Errno 111] Connection refused'))
169.254.255.130 - - [14/Mar/2021:23:56:26 +0000] "POST /invocations HTTP/1.1" 500 283 "-" "Go-http-client/1.1"
INFO:python_service:http://gunicorn_upstream/invocations
input handler called
INFO:tfs_utils:sagemaker tfs attributes: 
{}
json image produced
ERROR:python_service:exception handling request: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e3b4d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
  File "/usr/local/lib/python3.7/site-packages/gevent/_socketcommon.py", line 607, in connect
    raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.7/http/client.py", line 1277, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1323, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1272, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1032, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 972, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7efe16e3b4d0>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 573, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e3b4d0>: Failed to establish a new connection: [Errno 111] Connection refused'))

...

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 573, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e1ba10>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/sagemaker/python_service.py", line 292, in _handle_invocation_post
    res.body, res.content_type = self._handlers(data, context)
  File "/sagemaker/python_service.py", line 325, in handler
    response = requests.post(context.rest_uri, data=processed_input)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 119, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e1ba10>: Failed to establish a new connection: [Errno 111] Connection refused'))
2021-03-14 23:56:27.554814: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:190] Running initialization op on SavedModel bundle at path: /opt/ml/model/export/Servo/1
2021-03-14 23:56:27.901896: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:277] SavedModel load for tags { serve }; Status: success: OK. Took 3161163 microseconds.
2021-03-14 23:56:27.987438: I tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:59] No warmup data file found at /opt/ml/model/export/Servo/1/assets.extra/tf_serving_warmup_requests
2021-03-14 23:56:27.988975: I tensorflow_serving/util/retrier.cc:46] Retrying of Loading servable: {name: Servo version: 1} exhausted max_num_retries: 0
2021-03-14 23:56:27.989006: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: Servo version: 1}
2021-03-14 23:56:27.991672: I tensorflow_serving/model_servers/server.cc:371] Running gRPC ModelServer at 0.0.0.0:10000 ...
[warn] getaddrinfo: address family for nodename not supported
2021-03-14 23:56:27.992764: I tensorflow_serving/model_servers/server.cc:391] Exporting HTTP/REST API at:localhost:10001 ...
[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...

2021-03-14T23:56:26.474:[sagemaker logs]: MaxConcurrentTransforms=1, MaxPayloadInMB=1, BatchStrategy=MULTI_RECORD
2021-03-14T23:56:26.572:[sagemaker logs]: cvmodel/images/123_1024x768.jpg: Bad HTTP status received from algorithm: 500
2021-03-14T23:56:26.573:[sagemaker logs]: cvmodel/images/123_1024x768.jpg: 
2021-03-14T23:56:26.573:[sagemaker logs]: cvmodel/images/123_1024x768.jpg: Message:
2021-03-14T23:56:26.573:[sagemaker logs]: cvmodel/images/123_1024x768.jpg: {"error": "HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e2cf10>: Failed to establish a new connection: [Errno 111] Connection refused'))"}
2021-03-14T23:56:26.645:[sagemaker logs]: cvmodel/images/124_1024x768.jpg: Bad HTTP status received from algorithm: 500
2021-03-14T23:56:26.645:[sagemaker logs]: cvmodel/images/124_1024x768.jpg: 
2021-03-14T23:56:26.645:[sagemaker logs]: cvmodel/images/124_1024x768.jpg: Message:
2021-03-14T23:56:26.645:[sagemaker logs]: cvmodel/images/124_1024x768.jpg: {"error": "HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/Servo:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7efe16e1ba10>: Failed to establish a new connection: [Errno 111] Connection refused'))"}
liangma8712 commented 3 years ago

Fixed in https://github.com/aws/sagemaker-tensorflow-serving-container/pull/192