aws / sagemaker-pytorch-inference-toolkit

Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

Prepend `code_dir` to `sys.path` rather than `append` #137


davidthomas426 commented 1 year ago

**Describe the bug**

When you run a Python script directly, Python prepends the script's directory to `sys.path`, so that directory takes priority during imports. To match those semantics, this toolkit should prepend `code_dir` to `sys.path` rather than append it here: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/6610a410c0cf40bcf15267abe722d20d50e77bcf/src/sagemaker_pytorch_serving_container/handler_service.py#L46-L49
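Concretely, the fix would be a one-line change along these lines (a sketch only; `model_dir` and the path layout approximate what the linked code does):

    import sys

    model_dir = "/opt/ml/model"  # hypothetical; wherever the model artifacts live
    code_dir_path = "{}/code".format(model_dir)

    # Instead of sys.path.append(code_dir_path), prepend so code_dir wins:
    sys.path.insert(0, code_dir_path)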

Here's a quick example showing the prepend semantics when running a script from the command line.

  1. First, put this in a file './code/myscript.py' in a local shell environment:

    import sys
    print(sys.path)
  2. Then run it:

    $ pwd
    /home/ubuntu
    $ python3 ./code/myscript.py
    ['/home/ubuntu/code', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages',  '/usr/lib/python3/dist-packages']

The current append behavior causes a problem if a customer puts a file in `code_dir` whose name clashes with an installed package. Running their inference script locally would load their file, due to prepend semantics, but when deployed to a multi-model endpoint (MME) with this toolkit's handler, the installed package would be prioritized instead.
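For illustration (a hypothetical sketch; `requests.py` stands in for any filename that clashes with an installed package):

    import sys

    code_dir = "/opt/ml/model/code"  # hypothetical; contains the customer's requests.py

    sys.path.append(code_dir)  # current toolkit behavior
    import requests            # resolves to the installed package, shadowing the customer's file
    print(requests.__file__)   # .../site-packages/requests/__init__.py

    # With sys.path.insert(0, code_dir) in a fresh interpreter, the same import
    # would resolve to /opt/ml/model/code/requests.py instead, matching what the
    # customer sees when running `python code/myscript.py` locally.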

The single-model endpoint case already prepends: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/fb65d8a982944fe82449e964bc8021a9f6e1d869/src/sagemaker_pytorch_serving_container/torchserve.py#L112-L120

Other SageMaker inference toolkits already prepend as well. See how sagemaker-huggingface-inference-toolkit handles this (https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/2f1fae5cbb3b68299e73cc591c0a912b7cccee29/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L72-L73), as well as how sagemaker-inference-toolkit and sagemaker-mxnet-inference-toolkit try to handle this (though they have their own bug in that part of the code; see https://github.com/aws/sagemaker-mxnet-inference-toolkit/issues/135).