aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0
9.79k stars, 6.66k forks

Torch not compiled with CUDA enabled when deploying T5 using Triton #4651

Open subhamiitk opened 1 month ago

subhamiitk commented 1 month ago

Link to the notebook
https://github.com/aws/amazon-sagemaker-examples/blob/main/inference/nlp/realtime/triton/single-model/t5_pytorch_python-backend/t5_pytorch_python-backend.ipynb

Describe the bug
When following this notebook, I get an error when creating the endpoint. Endpoint creation fails, and CloudWatch shows this error: creating server: Invalid argument - load failed for model '/opt/ml/model/::t5_pytorch': version 1 is at UNAVAILABLE state: Internal: AssertionError:

To reproduce
Follow the notebook above for T5 model deployment. The error occurs at the endpoint creation step.

Logs
error: creating server: Invalid argument - load failed for model '/opt/ml/model/::t5_pytorch': version 1 is at UNAVAILABLE state: Internal: AssertionError:
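
For anyone triaging this: the issue title quotes "Torch not compiled with CUDA enabled", which is the assertion PyTorch raises when code calls `.cuda()` or `.to("cuda")` on a CPU-only build. Below is a minimal diagnostic sketch, not code from the notebook, that could be run inside the Python environment packaged for the model to check which kind of build is installed. The helper names `pick_device` and `torch_cuda_report` are hypothetical. Whether the packaged environment or the instance type is the actual cause here is an assumption this issue does not confirm.

```python
import importlib.util


def pick_device(cuda_available: bool) -> str:
    """Return the device string to pass to `.to(...)`, falling back to CPU."""
    return "cuda" if cuda_available else "cpu"


def torch_cuda_report() -> dict:
    """Collect basic facts about the installed PyTorch build (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False, "cuda_build": None, "cuda_available": False}

    import torch

    return {
        "torch_installed": True,
        # torch.version.cuda is None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # False on CPU-only builds or when no usable GPU/driver is visible.
        "cuda_available": torch.cuda.is_available(),
    }


report = torch_cuda_report()
device = pick_device(report["cuda_available"])
print(report, "->", device)
```

If `cuda_build` comes back as `None`, the environment contains a CPU-only PyTorch wheel, and any hard-coded `.to("cuda")` in the model code would fail at load time. Selecting the device through something like `pick_device` avoids the assertion, at the cost of silently running on CPU.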