aws / sagemaker-pytorch-inference-toolkit

Toolkit for inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

Launch TorchServe without repackaging model contents #117

Open fm1ch4 opened 2 years ago

fm1ch4 commented 2 years ago

Currently, the startup code will repackage the model contents in environment.model_dir into TS format using the TS model archiver: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/master/src/sagemaker_pytorch_serving_container/torchserve.py#L78

This causes the model contents to be read and rewritten to disk on container startup, which increases container startup time. For SageMaker Serverless Inference, this causes cold starts to be longer (even longer for larger models).
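
For context, the archiving step amounts to roughly the sketch below (illustrative only: the paths, model name, and handler shown here are assumptions, not the toolkit's exact values):

```python
import subprocess

# Illustrative paths/names; the toolkit derives these from
# environment.model_dir and its own constants.
model_dir = "/opt/ml/model"
model_store = "/tmp/model-store"

# Roughly what happens on container startup today: the SageMaker model
# directory is repackaged into a TorchServe model archive before the
# server is launched, which reads and rewrites the model contents on disk.
subprocess.check_call([
    "torch-model-archiver",
    "--model-name", "model",
    "--version", "1.0",
    "--handler", "sagemaker_pytorch_serving_container.handler_service",
    "--extra-files", model_dir,
    "--export-path", model_store,
])
```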

TorchServe is making a change to support loading models from a directory without the need to repackage the model as a .mar file: https://github.com/pytorch/serve/issues/1498

This issue is a request for this inference toolkit to use that new feature and avoid repackaging the model contents. I /think/ this should be as simple as removing the model archiver command execution and setting --models in the TorchServe command to --models model=environment.model_dir.
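
A rough sketch of what the startup could look like instead (the paths and config file location are illustrative assumptions; environment.model_dir is the toolkit's model directory):

```python
import subprocess

# Illustrative values; in the toolkit these would come from
# environment.model_dir and its TS_CONFIG_FILE constant.
model_dir = "/opt/ml/model"
ts_config_file = "/etc/sagemaker-ts.properties"

# Launch TorchServe directly against the model directory, skipping the
# torch-model-archiver repackaging step entirely.
subprocess.check_call([
    "torchserve",
    "--start",
    "--ts-config", ts_config_file,
    "--model-store", model_dir,
    "--models", "model=" + model_dir,
])
```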

mseth10 commented 2 years ago

Hi @fm1ch4 , I am working on the changes you suggested. If I understand correctly, we don't need the _adapt_to_ts_format function anymore. That function used DEFAULT_TS_MODEL_NAME to name the generated .mar file, and the same name is also referenced in TS_CONFIG_FILE here: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/d3fd2f33028a223d836de3f97f7ff429b6d74d00/src/sagemaker_pytorch_serving_container/torchserve.py#L164-L167. What should the model name be in the config file after we remove the formatting function? I don't quite understand how TorchServe processes the parameter --models model=environment.model_dir to derive the model name.

fm1ch4 commented 2 years ago

Hey @mseth10, thanks for taking a look.

For the --models model=environment.model_dir argument, model is the model name - it could actually be replaced with the DEFAULT_TS_MODEL_NAME variable, and we can continue using that name throughout the config file. Here is the TorchServe doc on the --models parameter (https://pytorch.org/serve/server.html):

> Models to be loaded using [model_name=]model_location format. Location can be a HTTP URL, a model archive file or directory contains model archive files in MODEL_STORE.
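
So, assuming DEFAULT_TS_MODEL_NAME stays as the model name, the argument could be assembled along these lines (a sketch only; the value shown for the constant is illustrative):

```python
# Sketch: reuse the existing constant so the rest of the generated config
# file can keep referring to the same model name.
DEFAULT_TS_MODEL_NAME = "model"   # value is illustrative; defined in the toolkit
model_dir = "/opt/ml/model"       # environment.model_dir

models_arg = "{}={}".format(DEFAULT_TS_MODEL_NAME, model_dir)
# Passed to TorchServe as: torchserve ... --models model=/opt/ml/model
```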

Qingzi-Lan commented 2 years ago

Hi @mseth10 , do we have an estimation for a fix for this?

mseth10 commented 2 years ago

> Hi @mseth10 , do we have an estimation for a fix for this?

Hi @Qingzi-Lan , we are currently seeing some specific test failures, as mentioned in https://github.com/aws/sagemaker-pytorch-inference-toolkit/pull/118#issuecomment-1090093408. Can you help us understand why these tests would pass with the DLC image but fail with the generic image? Also, can we skip them and get the PR merged to unblock the DLC customers?

Qingzi-Lan commented 2 years ago

> Hi @mseth10 , do we have an estimation for a fix for this?

> Hi @Qingzi-Lan , we are currently seeing some specific test failures, as mentioned in #118 (comment). Can you help us understand why these tests would pass with the DLC image but fail with the generic image? Also, can we skip them and get the PR merged to unblock the DLC customers?

What image are the tests using as the generic image?