fm1ch4 opened this issue 2 years ago
Hi @fm1ch4 , I am working on the changes you suggested. If I understand correctly, we don't need the `_adapt_to_ts_format` function anymore. That function used `DEFAULT_TS_MODEL_NAME` to name the generated `.mar` file, and that name is also referenced here in `TS_CONFIG_FILE`:
https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/d3fd2f33028a223d836de3f97f7ff429b6d74d00/src/sagemaker_pytorch_serving_container/torchserve.py#L164-L167
What should the model name in the config file be after we remove the formatting function? I don't quite understand how TorchServe processes the `--models model=environment.model_dir` parameter to generate the model name.
Hey @mseth10, thanks for taking a look.
For the `--models model=environment.model_dir` argument, `model` is the model name. It could actually be replaced with the `DEFAULT_TS_MODEL_NAME` variable, and we can continue using that throughout the config file. Here is the TorchServe doc on the `--models` parameter (https://pytorch.org/serve/server.html):

> Models to be loaded using [model_name=]model_location format. Location can be a HTTP URL, a model archive file or directory contains model archive files in MODEL_STORE.
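To make the naming concrete, here is a minimal sketch of how the `--models` value could be assembled. `DEFAULT_TS_MODEL_NAME` mirrors the constant referenced in the toolkit's `torchserve.py`; the helper function and the example path are assumptions for illustration, not the toolkit's actual code:

```python
# Sketch: building the TorchServe --models argument in
# [model_name=]model_location form. DEFAULT_TS_MODEL_NAME mirrors the
# constant referenced in torchserve.py; the path below is hypothetical.
DEFAULT_TS_MODEL_NAME = "model"

def build_models_arg(model_dir, model_name=DEFAULT_TS_MODEL_NAME):
    """Return the value passed to --models: '<name>=<location>'.

    The location may be a .mar file or, with the new TorchServe
    feature, a plain directory containing the model artifacts.
    """
    return "{}={}".format(model_name, model_dir)

# The same name can then be reused in the generated config file,
# keeping config.properties and the --models flag consistent.
print(build_models_arg("/opt/ml/model"))  # -> model=/opt/ml/model
```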
Hi @mseth10 , do we have an estimation for a fix for this?
Hi @Qingzi-Lan , we are currently seeing some specific test failures as mentioned in https://github.com/aws/sagemaker-pytorch-inference-toolkit/pull/118#issuecomment-1090093408 . Can you help us understand why these tests pass with the DLC but fail with the generic image? Also, can we skip them and merge the PR to unblock the DLC customers?
Which image are the tests using as the generic image?
Currently, the startup code repackages the model contents in `environment.model_dir` into TorchServe format using the TS model archiver: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/master/src/sagemaker_pytorch_serving_container/torchserve.py#L78

This causes the model contents to be read and rewritten to disk on container startup, which increases container startup time. For SageMaker Serverless Inference, this makes cold starts longer (and even longer for larger models).

TorchServe is making a change to support loading models from a directory without the need to repackage the model as a `.mar` file: https://github.com/pytorch/serve/issues/1498

This issue is a request for this inference toolkit to use that new feature and avoid repackaging the model contents. I /think/ this should be as simple as removing the model archiver command execution and setting `--models` in the TorchServe command to `--models model=environment.model_dir`.
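The proposed change can be sketched as follows. This is only an illustration of the idea (drop the archiver step, point `--models` at the unpacked directory); the helper name, argument order, and paths are assumptions, not the toolkit's actual implementation in `torchserve.py`:

```python
# Sketch: assembling a TorchServe launch command that loads the model
# straight from the unpacked directory, skipping torch-model-archiver.
# All names and paths here are illustrative assumptions.
def build_torchserve_command(model_dir, config_file):
    """Assemble the TorchServe start command without repackaging.

    Previously the toolkit would first run torch-model-archiver to
    produce a .mar file and point --models at that archive; with the
    new TorchServe feature, model_dir itself can be the location.
    """
    return [
        "torchserve",
        "--start",
        "--ts-config", config_file,
        "--model-store", model_dir,
        "--models", "model={}".format(model_dir),
    ]
```

The command list could then be handed to `subprocess` to start the server, with no read-and-rewrite of the model artifacts on container startup.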