aws / sagemaker-pytorch-inference-toolkit

Toolkit for allowing inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch Containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

Improve debuggability during model load and inference failures #163

Closed namannandan closed 5 months ago

namannandan commented 5 months ago

**Describe the feature you'd like**
Enable logging errors with their traceback during model load and inference to help with debugging.

Current implementation: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/c7365b94985b65e11c2b3d48a6004db48f87a7d2/src/sagemaker_inference/transformer.py#L159-L168

**How would this feature be used? Please describe.**
Errors raised during model loading and inference will be logged.

**Describe alternatives you've considered**
N/A

**Additional context**
This is useful in scenarios where there is no direct access to the endpoint where a model is deployed (e.g. a SageMaker endpoint), and the logs are the only diagnostic available.
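As a rough illustration of the requested behavior, the handler calls could be wrapped so that any exception is logged with its full traceback before being re-raised. This is only a sketch of the idea, not the toolkit's actual implementation; the helper name `run_with_error_logging` and the logger name are hypothetical.

```python
import logging
import traceback

logger = logging.getLogger("sagemaker_inference")


def run_with_error_logging(fn, stage, *args, **kwargs):
    # Hypothetical wrapper: invoke a model-load or inference callable and,
    # on failure, log the full traceback before re-raising. The traceback
    # then lands in the endpoint logs (e.g. CloudWatch) even when there is
    # no direct access to the host.
    try:
        return fn(*args, **kwargs)
    except Exception:
        logger.error("Failure during %s:\n%s", stage, traceback.format_exc())
        raise
```

The transformer could call something like `run_with_error_logging(self._model_fn, "model load", model_dir)` so that both the error message and the originating stack frames are preserved in the logs.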