aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

[Bug Report]: shm_size issue while deploying ensemble models on triton #3506

Open farzanehnakhaee70 opened 2 years ago

farzanehnakhaee70 commented 2 years ago

Describe the bug

I am deploying an ensemble for an NLP model. When I run the specified code, I get this error:

Unable to initialize shared memory key 'triton_python_backend_shm_region_2' to requested size

Based on my investigation, each model directory that uses python_backend needs 64 MB of shared memory (shm). However, there is no option to change the container's shm_size. How can we solve this problem?

wenestam commented 4 months ago

Hi!

Almost 2 years later, I ran into this problem too. I can't deploy multiple models in SageMaker because I can't pass the "--shm-size" flag to the container.

Did you ever resolve this? Do I have to switch away from python_backend in order to deploy?
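
One possible direction, not confirmed anywhere in this thread: if the container's shm size cannot be raised, the per-model region might be shrunk instead. The sketch below only does the arithmetic implied by the 64 MB figure above and builds an environment dict. It assumes the Triton python backend options shm-default-byte-size and shm-growth-byte-size, and assumes the SageMaker Triton image maps the environment variables SAGEMAKER_TRITON_SHM_DEFAULT_BYTE_SIZE and SAGEMAKER_TRITON_SHM_GROWTH_BYTE_SIZE onto them; verify both against the image version you use. The function names and the 64 MiB "available shm" figure are hypothetical, chosen for illustration.

```python
# Sketch only: estimate shared-memory demand for python_backend models and
# build container environment settings that request a smaller region per model.
# The env var names are an assumption; check them against your Triton image.

MIB = 1024 * 1024


def estimate_shm_bytes(num_python_models, per_model_bytes=64 * MIB):
    """Total shm needed if each python_backend model reserves per_model_bytes."""
    if num_python_models < 0:
        raise ValueError("num_python_models must be non-negative")
    return num_python_models * per_model_bytes


def per_model_budget(available_shm_bytes, num_python_models):
    """Largest equal per-model region that fits in the available shm."""
    if num_python_models <= 0:
        raise ValueError("num_python_models must be positive")
    return available_shm_bytes // num_python_models


def triton_shm_env(default_bytes, growth_bytes):
    """Environment entries (assumed names) to pass when creating the model."""
    return {
        "SAGEMAKER_TRITON_SHM_DEFAULT_BYTE_SIZE": str(default_bytes),
        "SAGEMAKER_TRITON_SHM_GROWTH_BYTE_SIZE": str(growth_bytes),
    }


if __name__ == "__main__":
    models = 4                 # hypothetical: python_backend models in the ensemble
    available = 64 * MIB       # hypothetical: shm actually available in the container
    needed = estimate_shm_bytes(models)
    budget = per_model_budget(available, models)
    print(f"needed at 64 MiB each: {needed // MIB} MiB")
    print(f"per-model budget: {budget // MIB} MiB")
    print(triton_shm_env(budget, budget))
```

The resulting dict would go into the environment of the container definition used to create the SageMaker model. Whether a smaller region is enough depends on the size of the tensors each Python model exchanges, so this is a starting point to test rather than a confirmed fix.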