aws / sagemaker-pytorch-inference-toolkit

Toolkit for inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

breaking: Change Model server to Torchserve for PyTorch Inference #79

Closed dhanainme closed 4 years ago

dhanainme commented 4 years ago

Change Model server to Torchserve for PyTorch Inference

Use TorchServe in place of MMS for PyTorch inference.

This PR depends on sagemaker-inference-toolkit PR #58 being merged first; until then, the existing integration tests are expected to fail.
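
To sketch the intended change (a hedged illustration, not the final code: the torchserve module and the start_torchserve name come from the dependent sagemaker-inference-toolkit PR and may differ), serving.py roughly becomes:

# Illustrative sketch only: start_torchserve is assumed to be provided by the
# dependent sagemaker-inference-toolkit change and is not available until that
# PR is merged, which is exactly why the integration test below fails at import.
from __future__ import absolute_import

from sagemaker_inference import torchserve

from sagemaker_pytorch_serving_container import handler_service

HANDLER_SERVICE = handler_service.__name__


def main():
    # Start TorchServe with the toolkit's handler service instead of the old
    # MMS-based model server.
    torchserve.start_torchserve(handler_service=HANDLER_SERVICE)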

Testing:

Tested with SageMaker local mode, based on the buildspec.yaml file.

TOX_PARALLEL_NO_SPINNER=1
PY_COLORS=0
AWS_ACCESS_KEY_ID=XYZ
AWS_SECRET_ACCESS_KEY=XYZ
AWS_SESSION_TOKEN=XYZ

tox -e py36 -- test/integration/local --build-image -s

This creates the docker image sagemaker-pytorch-inference:1.5.0-cpu-py3, but then fails with the following error:

Attaching to tmpce8xadei_algo-1-j3bfm_1
algo-1-j3bfm_1  | Traceback (most recent call last):
algo-1-j3bfm_1  |   File "/usr/local/bin/dockerd-entrypoint.py", line 21, in <module>
algo-1-j3bfm_1  |     from sagemaker_pytorch_serving_container import serving
algo-1-j3bfm_1  |   File "/opt/conda/lib/python3.7/site-packages/sagemaker_pytorch_serving_container/serving.py", line 18, in <module>
algo-1-j3bfm_1  |     from sagemaker_inference import torchserve
algo-1-j3bfm_1  | ImportError: cannot import name 'torchserve' from 'sagemaker_inference' (/opt/conda/lib/python3.7/site-packages/sagemaker_inference/__init__.py)
tmpce8xadei_algo-1-j3bfm_1 exited with code 1
Aborting on container exit...
Exception in thread Thread-1:

Now install the changes from the sagemaker-inference-toolkit PR (#89) into this container manually and commit it, then run the test again, this time without the --build-image flag.

tox -e py36 -- test/integration/local -s

ubuntu@ip-172-31-65-0:~/ts/sagemaker-pytorch-inference-toolkit$ tox -e py36 -- test/integration/local -s
GLOB sdist-make: /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/setup.py
py36 inst-nodeps: /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/.tox/dist/sagemaker_pytorch_inference-1.5.2.dev0.zip
py36 installed: apipkg==1.5,attrs==19.3.0,bcrypt==3.1.7,boto3==1.14.19,botocore==1.17.19,certifi==2020.6.20,cffi==1.14.0,chardet==3.0.4,click==7.1.2,coverage==5.2,cryptography==2.9.2,docutils==0.15.2,execnet==1.7.1,Flask==1.1.1,future==0.18.2,gevent==20.6.2,greenlet==0.4.16,gunicorn==20.0.4,idna==2.7,importlib-metadata==1.7.0,inotify-simple==1.2.1,itsdangerous==1.1.0,Jinja2==2.11.2,jmespath==0.10.0,MarkupSafe==1.1.1,mock==4.0.2,more-itertools==8.4.0,numpy==1.19.0,packaging==20.4,paramiko==2.7.1,Pillow==7.2.0,pkg-resources==0.0.0,pluggy==0.13.1,protobuf==3.12.2,protobuf3-to-dict==0.1.5,psutil==5.7.0,py==1.9.0,pycparser==2.20,PyNaCl==1.4.0,pyparsing==2.4.7,pytest==5.4.3,pytest-cov==2.10.0,pytest-forked==1.2.0,pytest-xdist==1.32.0,python-dateutil==2.8.1,PyYAML==5.3.1,requests==2.20.0,retrying==1.3.3,s3transfer==0.3.3,sagemaker==1.68.0,sagemaker-containers==2.8.6.post2,sagemaker-inference==1.3.2.post1,sagemaker-pytorch-inference @ file:///home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/.tox/dist/sagemaker_pytorch_inference-1.5.2.dev0.zip,scipy==1.5.1,six==1.15.0,smdebug-rulesconfig==0.1.4,torch==1.5.1,torchvision==0.6.1,typing==3.7.4.1,urllib3==1.22,wcwidth==0.2.5,Werkzeug==1.0.1,zipp==3.1.0,zope.event==4.4,zope.interface==5.1.0
py36 runtests: PYTHONHASHSEED='2603046058'
py36 runtests: commands[0] | coverage run --rcfile .coveragerc --source sagemaker_pytorch_serving_container -m pytest test/integration/local -s
WARNING:root:pandas failed to import. Analytics features will be impaired or broken.
=========================================================================================================== test session starts ============================================================================================================
platform linux -- Python 3.6.9, pytest-5.4.3, py-1.9.0, pluggy-0.13.1 -- /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/.tox/py36/bin/python3.6
cachedir: .pytest_cache
rootdir: /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit, inifile: setup.cfg
plugins: forked-1.2.0, cov-2.10.0, xdist-1.32.0
collected 4 items

test/integration/local/test_serving.py::test_serve_json_npy WARNING:sagemaker:Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
WARNING:sagemaker:No framework_version specified, defaulting to version 0.4. framework_version will be required in SageMaker Python SDK v2. This is not the latest supported version. If you would like to use version 1.5.0, please add framework_version=1.5.0 to your constructor.
INFO:botocore.credentials:Found credentials in environment variables.
WARNING:sagemaker.local.image:Using the short-lived AWS credentials found in session. They might expire while running.
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ef1a18d0>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ef1a1e10>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ef1a17b8>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
Attaching to tmphbv2jo9s_algo-1-xr0n7_1
algo-1-xr0n7_1  | Model server started.
algo-1-xr0n7_1  | 2020-07-10 13:03:33,174 [INFO ] pool-1-thread-17 ACCESS_LOG - /172.19.0.1:50306 "GET /ping HTTP/1.1" 200 4
algo-1-xr0n7_1  | 2020-07-10 13:03:33,674 [INFO ] W-9005-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 132
algo-1-xr0n7_1  | 2020-07-10 13:03:34,124 [INFO ] W-9006-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 125
algo-1-xr0n7_1  | 2020-07-10 13:03:34,551 [INFO ] W-9009-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 122
algo-1-xr0n7_1  | 2020-07-10 13:03:34,718 [INFO ] W-9015-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 37
algo-1-xr0n7_1  | 2020-07-10 13:03:34,889 [INFO ] W-9000-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 43
algo-1-xr0n7_1  | 2020-07-10 13:03:35,052 [INFO ] W-9010-model_1 ACCESS_LOG - /172.19.0.1:50310 "POST /invocations HTTP/1.1" 200 28
Gracefully stopping... (press Ctrl+C again to force)
PASSED
test/integration/local/test_serving.py::test_serve_csv WARNING:sagemaker:Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
WARNING:sagemaker:No framework_version specified, defaulting to version 0.4. framework_version will be required in SageMaker Python SDK v2. This is not the latest supported version. If you would like to use version 1.5.0, please add framework_version=1.5.0 to your constructor.
WARNING:sagemaker.local.image:Using the short-lived AWS credentials found in session. They might expire while running.
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9c8080>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9e3240>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9e3c50>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
Attaching to tmp1evpzg0n_algo-1-1us1t_1
algo-1-1us1t_1  | Model server started.
algo-1-1us1t_1  | 2020-07-10 13:03:56,114 [INFO ] pool-1-thread-17 ACCESS_LOG - /172.19.0.1:50328 "GET /ping HTTP/1.1" 200 3
algo-1-1us1t_1  | 2020-07-10 13:03:56,305 [INFO ] W-9009-model_1 ACCESS_LOG - /172.19.0.1:50332 "POST /invocations HTTP/1.1" 200 30
algo-1-1us1t_1  | 2020-07-10 13:03:56,493 [INFO ] W-9000-model_1 ACCESS_LOG - /172.19.0.1:50332 "POST /invocations HTTP/1.1" 200 47
algo-1-1us1t_1  | 2020-07-10 13:03:56,678 [INFO ] W-9013-model_1 ACCESS_LOG - /172.19.0.1:50332 "POST /invocations HTTP/1.1" 200 23
Gracefully stopping... (press Ctrl+C again to force)
PASSED
test/integration/local/test_serving.py::test_serve_cpu_model_on_gpu WARNING:sagemaker:Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
WARNING:sagemaker:No framework_version specified, defaulting to version 0.4. framework_version will be required in SageMaker Python SDK v2. This is not the latest supported version. If you would like to use version 1.5.0, please add framework_version=1.5.0 to your constructor.
WARNING:sagemaker.local.image:Using the short-lived AWS credentials found in session. They might expire while running.
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9e6b38>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9e6f28>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec9e20b8>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
Attaching to tmpwuf8pafw_algo-1-eqkyb_1
algo-1-eqkyb_1  | Model server started.
algo-1-eqkyb_1  | 2020-07-10 13:04:17,696 [INFO ] pool-1-thread-17 ACCESS_LOG - /172.19.0.1:50350 "GET /ping HTTP/1.1" 200 4
algo-1-eqkyb_1  | 2020-07-10 13:04:17,886 [INFO ] W-9015-model_1 ACCESS_LOG - /172.19.0.1:50354 "POST /invocations HTTP/1.1" 200 30
Gracefully stopping... (press Ctrl+C again to force)
PASSED
test/integration/local/test_serving.py::test_serving_calls_model_fn_once WARNING:sagemaker:Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
WARNING:sagemaker:No framework_version specified, defaulting to version 0.4. framework_version will be required in SageMaker Python SDK v2. This is not the latest supported version. If you would like to use version 1.5.0, please add framework_version=1.5.0 to your constructor.
WARNING:sagemaker.local.image:Using the short-lived AWS credentials found in session. They might expire while running.
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec870048>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec870320>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ec8700f0>: Failed to establish a new connection: [Errno 111] Connection refused',)': /ping
Attaching to tmpzfzvz320_algo-1-61bac_1
algo-1-61bac_1  | Model server started.
algo-1-61bac_1  | 2020-07-10 13:04:38,944 [INFO ] pool-1-thread-3 ACCESS_LOG - /172.19.0.1:50372 "GET /ping HTTP/1.1" 200 4
algo-1-61bac_1  | 2020-07-10 13:04:38,968 [INFO ] W-9000-model_1 ACCESS_LOG - /172.19.0.1:50376 "POST /invocations HTTP/1.1" 200 3
algo-1-61bac_1  | 2020-07-10 13:04:38,975 [INFO ] W-9001-model_1 ACCESS_LOG - /172.19.0.1:50376 "POST /invocations HTTP/1.1" 200 1
algo-1-61bac_1  | 2020-07-10 13:04:38,981 [INFO ] W-9000-model_1 ACCESS_LOG - /172.19.0.1:50376 "POST /invocations HTTP/1.1" 200 1
Gracefully stopping... (press Ctrl+C again to force)
PASSED

============================================================================================================= warnings summary =============================================================================================================
test/integration/local/test_serving.py:69
  /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/test/integration/local/test_serving.py:69: PytestUnknownMarkWarning: Unknown pytest.mark.skip_cpu - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.skip_cpu

-- Docs: https://docs.pytest.org/en/latest/warnings.html
================================================================================================= 4 passed, 1 warning in 91.55s (0:01:31) ==================================================================================================
Coverage.py warning: Module sagemaker_pytorch_serving_container was never imported. (module-not-imported)
Coverage.py warning: No data was collected. (no-data-collected)
py36 runtests: commands[1] | coverage report --fail-under=90 --include *sagemaker_pytorch_serving_container*
No data to report.
ERROR: InvocationError: '/home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/.tox/py36/bin/coverage report --fail-under=90 --include *sagemaker_pytorch_serving_container*'
_________________________________________________________________________________________________________________ summary _________________________________________________________________________________________________________________

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

dhanainme commented 4 years ago

> I am not sure torchserve should be part of sagemaker-inference (aws/sagemaker-inference-toolkit#58). Shouldn't that logic be moved here instead?

Have moved the logic here.

nadiaya commented 4 years ago

Flake8 failed:

flake8 run-test: commands[0] | flake8
./test/unit/test_serving.py:25:1: E302 expected 2 blank lines, found 1
./test/unit/test_model_server.py:1:1: FI11 __future__ import "absolute_import" missing
./test/unit/test_model_server.py:89:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:91:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:93:5: E265 block comment should start with '# '
./test/unit/test_model_server.py:115:1: F811 redefinition of unused 'test_start_torchserve_default_service_handler' from line 29
./test/unit/test_model_server.py:159:1: E303 too many blank lines (4)
./test/unit/test_model_server.py:193:9: E265 block comment should start with '# '
./test/unit/test_model_server.py:292:1: E303 too many blank lines (3)
./test/unit/test_model_server.py:297:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:299:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:310:1: E303 too many blank lines (3)
./test/unit/test_handler_service.py:34:49: E225 missing whitespace around operator
./test/unit/test_handler_service.py:35:39: E225 missing whitespace around operator
./test/unit/test_handler_service.py:38:39: E225 missing whitespace around operator
./test/unit/test_default_inference_handler.py:18:1: F401 'mock' imported but unused
ERROR: InvocationError for command /codebuild/output/src399464208/src/github.com/aws/sagemaker-pytorch-inference-toolkit/.tox/flake8/bin/flake8 (exited with code 1)
dhanainme commented 4 years ago

Flake8 failed:

flake8 run-test: commands[0] | flake8
./test/unit/test_serving.py:25:1: E302 expected 2 blank lines, found 1
./test/unit/test_model_server.py:1:1: FI11 __future__ import "absolute_import" missing
./test/unit/test_model_server.py:89:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:91:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:93:5: E265 block comment should start with '# '
./test/unit/test_model_server.py:115:1: F811 redefinition of unused 'test_start_torchserve_default_service_handler' from line 29
./test/unit/test_model_server.py:159:1: E303 too many blank lines (4)
./test/unit/test_model_server.py:193:9: E265 block comment should start with '# '
./test/unit/test_model_server.py:292:1: E303 too many blank lines (3)
./test/unit/test_model_server.py:297:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:299:34: E225 missing whitespace around operator
./test/unit/test_model_server.py:310:1: E303 too many blank lines (3)
./test/unit/test_handler_service.py:34:49: E225 missing whitespace around operator
./test/unit/test_handler_service.py:35:39: E225 missing whitespace around operator
./test/unit/test_handler_service.py:38:39: E225 missing whitespace around operator
./test/unit/test_default_inference_handler.py:18:1: F401 'mock' imported but unused
ERROR: InvocationError for command /codebuild/output/src399464208/src/github.com/aws/sagemaker-pytorch-inference-toolkit/.tox/flake8/bin/flake8 (exited with code 1)

Have fixed this.

copying src/sagemaker_pytorch_serving_container/default_pytorch_inference_handler.py -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container
copying src/sagemaker_pytorch_serving_container/handler_service.py -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container
copying src/sagemaker_pytorch_serving_container/serving.py -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container
copying src/sagemaker_pytorch_serving_container/torchserve.py -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container
copying src/sagemaker_pytorch_serving_container/etc/default-ts.properties -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container/etc
copying src/sagemaker_pytorch_serving_container/etc/log4j.properties -> sagemaker_pytorch_inference-1.5.2.dev0/src/sagemaker_pytorch_serving_container/etc
Writing sagemaker_pytorch_inference-1.5.2.dev0/setup.cfg
Creating tar archive
removing 'sagemaker_pytorch_inference-1.5.2.dev0' (and everything under it)
twine runtests: commands[1] | twine check dist/*.tar.gz
Checking dist/sagemaker_pytorch_inference-1.5.2.dev0.tar.gz: PASSED, with warnings
  warning: `long_description_content_type` missing. defaulting to `text/x-rst`.
_______________________________________________________________________________________________________________________________________________________________________________ summary ________________________________________________________________________________________________________________________________________________________________________________
  flake8: commands succeeded
  twine: commands succeeded
  congratulations :)
ubuntu@ip-172-31-65-0:~/ts/sagemaker-pytorch-inference-toolkit$

dhanainme commented 4 years ago

Logs from a more recent run:

algo-1-doyas_1  | 2020-07-17 20:07:21,364 [INFO ] W-9001-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 569
algo-1-doyas_1  | 2020-07-17 20:07:21,364 [INFO ] W-9000-model_1 TS_METRICS - W-9000-model_1.ms:714|#Level:Host|#hostname:2c8a2bc735be,timestamp:1595016441
algo-1-doyas_1  | 2020-07-17 20:07:21,364 [INFO ] W-9001-model_1 TS_METRICS - W-9001-model_1.ms:712|#Level:Host|#hostname:2c8a2bc735be,timestamp:1595016441
algo-1-doyas_1  | 2020-07-17 20:07:23,333 [INFO ] pool-1-thread-3 ACCESS_LOG - /172.18.0.1:38292 "GET /ping HTTP/1.1" 200 3
algo-1-doyas_1  | 2020-07-17 20:07:23,334 [INFO ] pool-1-thread-3 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:2c8a2bc735be,timestamp:null
algo-1-doyas_1  | 2020-07-17 20:07:23,357 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 1
algo-1-doyas_1  | 2020-07-17 20:07:23,357 [INFO ] W-9000-model_1-stdout MODEL_METRICS - PredictionTime.Milliseconds:0.05|#ModelName:model,Level:Model|#hostname:2c8a2bc735be,requestID:f08bfa54-47bd-444a-a91a-b6aa56819ce7,timestamp:1595016443
algo-1-doyas_1  | 2020-07-17 20:07:23,358 [INFO ] W-9000-model_1 ACCESS_LOG - /172.18.0.1:38296 "POST /invocations HTTP/1.1" 200 5
algo-1-doyas_1  | 2020-07-17 20:07:23,358 [INFO ] W-9000-model_1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:2c8a2bc735be,timestamp:null
algo-1-doyas_1  | 2020-07-17 20:07:23,365 [INFO ] W-9001-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 1
algo-1-doyas_1  | 2020-07-17 20:07:23,365 [INFO ] W-9001-model_1 ACCESS_LOG - /172.18.0.1:38296 "POST /invocations HTTP/1.1" 200 1
algo-1-doyas_1  | 2020-07-17 20:07:23,365 [INFO ] W-9001-model_1-stdout MODEL_METRICS - PredictionTime.Milliseconds:0.04|#ModelName:model,Level:Model|#hostname:2c8a2bc735be,requestID:bbeb14f3-c0c4-4de2-bf58-faf948458a31,timestamp:1595016443
algo-1-doyas_1  | 2020-07-17 20:07:23,365 [INFO ] W-9001-model_1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:2c8a2bc735be,timestamp:null
algo-1-doyas_1  | 2020-07-17 20:07:23,371 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Backend response time: 0
algo-1-doyas_1  | 2020-07-17 20:07:23,371 [INFO ] W-9000-model_1 ACCESS_LOG - /172.18.0.1:38296 "POST /invocations HTTP/1.1" 200 1
algo-1-doyas_1  | 2020-07-17 20:07:23,371 [INFO ] W-9000-model_1-stdout MODEL_METRICS - PredictionTime.Milliseconds:0.02|#ModelName:model,Level:Model|#hostname:2c8a2bc735be,requestID:407df61c-82c0-4059-b590-288551436fb6,timestamp:1595016443
algo-1-doyas_1  | 2020-07-17 20:07:23,371 [INFO ] W-9000-model_1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:2c8a2bc735be,timestamp:null
Gracefully stopping... (press Ctrl+C again to force)
PASSED

=========================================================================================================================================================================== warnings summary ===========================================================================================================================================================================
test/integration/local/test_serving.py:69
  /home/ubuntu/ts/sagemaker-pytorch-inference-toolkit/test/integration/local/test_serving.py:69: PytestUnknownMarkWarning: Unknown pytest.mark.skip_cpu - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html
    @pytest.mark.skip_cpu

-- Docs: https://docs.pytest.org/en/latest/warnings.html
=============================================================================================================================================================== 4 passed, 1 warning in 136.40s (0:02:16) ===============================================================================================================================================================


laurenyu commented 4 years ago

Regarding why mms-entrypoint.py is named the way it is: it contains a line to accommodate the fact that MMS exits right away: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/master/artifacts/mms-entrypoint.py#L26-L27

If TorchServe doesn't need that, then we should be able to remove the file altogether (rather than just renaming it).
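
(For context, the pattern being referenced looks roughly like the sketch below; this is an illustrative reconstruction of the entrypoint, not the literal file contents.)

# Rough sketch of the entrypoint pattern under discussion (illustrative only).
# MMS forks and the launching call returns immediately, so the entrypoint keeps
# PID 1 busy to stop the container from exiting.
import shlex
import subprocess
import sys

from sagemaker_pytorch_serving_container import serving

if sys.argv[1] == 'serve':
    serving.main()
else:
    subprocess.check_call(shlex.split(' '.join(sys.argv[1:])))

# prevent docker exit
subprocess.call(['tail', '-f', '/dev/null'])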

dhanainme commented 4 years ago

> Regarding why mms-entrypoint.py is named the way it is: it contains a line to accommodate the fact that MMS exits right away: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/master/artifacts/mms-entrypoint.py#L26-L27
>
> If TorchServe doesn't need that, then we should be able to remove the file altogether (rather than just renaming it).

There may not be any difference between MMS and TS here, as that behaviour would not have changed.

dhanainme commented 4 years ago

Looks like one of the integration tests uses a DLC container.

From the test logs:

[Container] 2020/07/27 18:46:51 Running command test_cmd="IGNORE_COVERAGE=- tox -e py36 -- test/integration/local --build-image --push-image --dockerfile-type pytorch --region $AWS_DEFAULT_REGION --docker-base-name $ECR_REPO --aws-id $ACCOUNT --framework-version $FRAMEWORK_VERSION --processor cpu --tag $GENERIC_TAG" ✅ PASSING

[Container] 2020/07/27 18:54:25 Running command test_cmd="IGNORE_COVERAGE=- tox -e py36 -- test/integration/local --build-image --push-image --dockerfile-type dlc.cpu --region $AWS_DEFAULT_REGION --docker-base-name $ECR_REPO --aws-id $ACCOUNT --framework-version $FRAMEWORK_VERSION --processor cpu --tag $DLC_CPU_TAG" ❌ NOT PASSING

This will not pass until the DLC PR is merged, and that PR in turn depends on this one.
