Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker Pytorch Containers are at https://github.com/aws/deep-learning-containers.
Migrate base inference toolkit scripts and unit tests #157
Issue #, if available:
Description of changes:
The objective of this PR is to migrate the base inference toolkit scripts and unit tests to the PyTorch inference toolkit to remove the dependency on the base inference toolkit repository.
Base inference toolkit package `src/sagemaker_inference`:

- `__init__.py` (original script): No change.
- `utils.py` (original script): No change.
- `logging.py` (original script): No change.
- `errors.py` (original script): No change.
- `content_types.py` (original script): No change.
- `encoder.py` (original script): No change.
- `decoder.py` (original script): No change.
- `parameters.py` (original script): Retained the following parameters:
  - `BASE_PATH_ENV`
  - `USER_PROGRAM_ENV`
  - `DEFAULT_INVOCATIONS_ACCEPT_ENV`
  - `MODEL_SERVER_WORKERS_ENV`
  - `MODEL_SERVER_TIMEOUT_ENV`
  - `BIND_TO_PORT_ENV`
  - `MULTI_MODEL_ENV`
  - `LOG_LEVEL_ENV`
  - `SAFE_PORT_RANGE_ENV`
  Removed the following parameters:
  - `MODEL_SERVER_VMARGS`: While it is used in `model_server.py`, it is not used anywhere in `torchserve.py`. This is defined in `default-ts.properties` and `mme-ts.properties`.
  - `MODEL_SERVER_TIMEOUT_SECONDS_ENV`: This was required for MMS, but not for TorchServe (https://github.com/aws/sagemaker-inference-toolkit/pull/129).
  - `STARTUP_TIMEOUT_ENV`: Used in `model_server.py` to retrieve an MMS server process with a custom timeout value. In `torchserve.py`, the timeout value is fixed.
  - `MAX_REQUEST_SIZE`: While it is used in `model_server.py`, it is not used anywhere in `torchserve.py`.
- `environment.py` (original script): Any methods/attributes corresponding to parameters retained in `parameters.py` will be retained. The rest will be removed. Specifically, we remove the following:
  - `DEFAULT_STARTUP_TIMEOUT`, `self._startup_timeout`, `self.startup_timeout()`
  - `self._model_server_timeout_seconds`, `self.model_server_timeout_seconds()`
  - `DEFAULT_VMARGS`, `self._vmargs`, `self.vmargs()`
  - `DEFAULT_MAX_REQUEST_SIZE`, `self._max_request_size_in_mb`, `self.max_request_size()`
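To illustrate the shape of what remains after these removals, here is a minimal sketch of an environment class that maps retained parameters to read-only attributes. This is a hypothetical illustration, not the toolkit's actual `environment.py`; the constant values mirror common SageMaker variable names but should be treated as assumptions.

```python
import os

# Hypothetical constants standing in for entries retained in parameters.py;
# the real names/values live in the toolkit.
USER_PROGRAM_ENV = "SAGEMAKER_PROGRAM"
MODEL_SERVER_WORKERS_ENV = "SAGEMAKER_MODEL_SERVER_WORKERS"
MULTI_MODEL_ENV = "SAGEMAKER_MULTI_MODEL"


class Environment:
    """Sketch: expose retained parameters as read-only properties.

    Removed attributes (startup_timeout, vmargs, max_request_size,
    model_server_timeout_seconds) simply do not appear here.
    """

    def __init__(self):
        self._module_name = os.environ.get(USER_PROGRAM_ENV, "inference.py")
        self._model_server_workers = os.environ.get(MODEL_SERVER_WORKERS_ENV)
        self._is_multi_model = (
            os.environ.get(MULTI_MODEL_ENV, "false").lower() == "true"
        )

    @property
    def module_name(self):
        return self._module_name

    @property
    def model_server_workers(self):
        return self._model_server_workers

    @property
    def is_multi_model(self):
        return self._is_multi_model
```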
- `model_server.py` (original script): Removed; the corresponding TorchServe logic, including installation of `requirements.txt`, already exists in `torchserve.py`. Specifically, we remove the following constants:
  - `REQUIREMENTS_PATH`
  - `MMS_CONFIG_FILE`
  - `DEFAULT_HANDLER_SERVICE`
  - `DEFAULT_MMS_CONFIG_FILE`
  - `MME_MMS_CONFIG_FILE`
  - `DEFAULT_MMS_LOG_FILE`
  - `DEFAULT_MMS_MODEL_EXPORT_DIRECTORY`
  - `DEFAULT_MMS_MODEL_NAME`
  - `ENABLE_MULTI_MODEL`, `MODEL_STORE`
  - `PYTHON_PATH_ENV`
  - `MMS_NAMESPACE`

  and the following functions:
  - `_install_requirements()`
  - `_get_codeartifact_index()`
  - `_start_model_server()`
  - `_adapt_to_mms_format()`
  - `_set_python_path()`
  - `_create_model_server_config_file()`
  - `_generate_mms_config_properties()`
  - `_add_sigterm_handler()`
  - `_retry_retrieve_mms_server_process()`
  - `_retrieve_mms_server_process()`
  - `_reap_children()`
  - `_add_sigchild_handler()`
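For context on what the removed process-management helpers did, here is a hedged sketch of the signal-forwarding pattern behind a helper like `_add_sigterm_handler()`. This is an illustration of the general technique, not the removed code itself; the function name and signature here are hypothetical.

```python
import os
import signal
import subprocess


def add_sigterm_handler(server_process: subprocess.Popen) -> None:
    """Sketch: forward SIGTERM received by the container entrypoint to the
    model server child process so it can shut down cleanly.

    The MMS-specific version of this lived in model_server.py; torchserve.py
    carries its own equivalent, so the base-toolkit copy is removed.
    """

    def _terminate(signo, frame):
        # Pass the termination request along to the child server process.
        server_process.terminate()

    signal.signal(signal.SIGTERM, _terminate)
```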
- `default_handler_service.py` (original script): No change.
- `default_inference_handler.py` (original script): No change.
- `transformer.py` (original script): No change.

All the unit tests from the base inference toolkit have been moved to the PyTorch inference toolkit. Unit tests covering functions and attributes that were removed from the base inference toolkit scripts have also been removed. Specifically, we will have the following:
Unit tests `test/unit`:

- Renamed `test_default_inference_handler.py` to `test_default_pytorch_inference_handler.py`.
- Renamed `test_model_server.py` to `test_torchserve.py`.
- `test_decoder.py` (original script): No change.
- `test_default_handler_service.py` (original script): No change.
- `test_default_inference_handler.py` (original script): No change.
- `test_encoder.py` (original script): No change.
- `test_environment.py` (original script): In `test_env()`, removed the assertions for the following:
  - `env.max_request_size`
  - `env.vmargs`
  - `env.startup_timeout`
  - `env.model_server_timeout_seconds`
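The surviving `test_env()` assertions follow the usual pattern of patching environment variables and asserting on the derived attributes. A minimal hypothetical example of that pattern (the helper and variable names are illustrative, not the toolkit's actual test code):

```python
import os
from unittest import mock


def get_model_server_timeout() -> int:
    # Illustrative stand-in for an Environment attribute backed by
    # MODEL_SERVER_TIMEOUT_ENV; the real implementation lives in
    # environment.py. Default value here is an assumption.
    return int(os.environ.get("SAGEMAKER_MODEL_SERVER_TIMEOUT", "120"))


@mock.patch.dict(os.environ, {"SAGEMAKER_MODEL_SERVER_TIMEOUT": "300"})
def test_env():
    # Inside the patch, the attribute reflects the injected value.
    assert get_model_server_timeout() == 300
```

`mock.patch.dict` restores `os.environ` after the test, so assertions on removed attributes can simply be deleted without affecting neighboring tests.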
- `test_model_server.py` (original script): Removed the following unit tests, which cover functions removed from `model_server.py`:
  - `test_install_requirements()`
  - `test_install_requirements_installation_failed()`
  - `test_install_requirements_codeartifact_invalid_arn_installation_failed()`
  - `test_install_requirements_codeartifact()`
  - `test_start_model_server_default_service_handler()`
  - `test_start_model_server_custom_handler_service()`
  - `test_adapt_to_mms_format()`
  - `test_adapt_to_mms_format_existing_path()`
  - `test_set_existing_python_path()`
  - `test_new_python_path()`
  - `test_create_model_server_config_file()`
  - `test_generate_mms_config_properties()`
  - `test_generate_mms_config_properties_default_workers()`
  - `test_add_sigterm_handler()`
  - `test_retrieve_mms_server_process()`
  - `test_retrieve_mms_server_process_no_server()`
  - `test_retrieve_mms_server_process_too_many_servers()`
  - `test_retry_retrieve_mms_server_process()`
- `test_transformer.py` (original script): No change.
- `test_utils.py` (original script): No change.
- `tox.ini`: Modify the `coverage run` command to include `sagemaker_inference` in the source.
- `setup.py`: Remove `sagemaker_inference` as a dependency and include the base toolkit's dependencies.
- `test/container`: Uninstall `sagemaker_inference` in the following Dockerfiles:
  - `2.0.0/Dockerfile.dlc.cpu`
  - `2.0.0/Dockerfile.dlc.gpu`
  - `2.0.1/Dockerfile.dlc.cpu`
  - `2.0.1/Dockerfile.dlc.gpu`
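The Dockerfile change amounts to removing the base toolkit package from the test images so that only the migrated code is exercised. A sketch of the kind of line involved (the exact contents of the listed Dockerfiles are in the PR, not reproduced here):

```dockerfile
# Remove the base inference toolkit so the container exercises only the
# code migrated into the PyTorch inference toolkit.
RUN pip uninstall -y sagemaker_inference
```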
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.