aws / sagemaker-inference-toolkit

Serve machine learning models within a 🐳 Docker container using 🧠 Amazon SageMaker.
Apache License 2.0

JVM detects the CPU count as 1 when more CPUs are available to the container. #82

Closed amaharek closed 3 years ago

amaharek commented 3 years ago

Describe the bug This issue is related to https://github.com/aws/sagemaker-python-sdk/issues/1275.

The JVM detects the CPU count as 1 when more CPUs are available to the container.

To reproduce

  1. Clone the SageMaker example.
  2. Deploy the model to an endpoint.
  3. Check the CloudWatch logs; the detected number of CPU cores will be reported as Number of CPUs: 1

Expected behavior The CPU count reported in CloudWatch should match the CPU count of the instance in use; for example, 4 if the instance is ml.m4.xlarge.
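The log line in question reflects the JVM's own CPU detection. A minimal way to observe the same value outside SageMaker (a sketch, not part of the toolkit; the class name is illustrative) is to query `Runtime.getRuntime().availableProcessors()` inside the container:

```java
public class CpuCount {
    public static void main(String[] args) {
        // On JDK 8u191+ the JVM is container-aware by default: this value
        // is derived from the container's cgroup CPU settings rather than
        // the host's core count, which is how "Number of CPUs: 1" can be
        // reported even on a multi-core instance.
        System.out.println("Number of CPUs: "
                + Runtime.getRuntime().availableProcessors());
    }
}
```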

System information
Container: pytorch-inference:1.5-gpu-py3
SageMaker inference toolkit: v1.1.2

daniel-hanmoi-choi commented 3 years ago

@amaharek We had the same issue and fixed it as follows. First, locate the toolkit's install path:

TOOLKIT_PATH=$(python -c "import sagemaker_inference; print(sagemaker_inference.__path__[0])")

Then add the following to the Dockerfile and build a new image (the path is computed in the same RUN instruction so the variable is in scope):

Single-model

RUN TOOLKIT_PATH=$(python -c "import sagemaker_inference; print(sagemaker_inference.__path__[0])") && \
    echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/default-mms.properties

Multi-model

RUN TOOLKIT_PATH=$(python -c "import sagemaker_inference; print(sagemaker_inference.__path__[0])") && \
    echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/mme-mms.properties
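The model server reads the vmargs line from its properties file and passes it to the JVM it launches, so the flag disables cgroup-based CPU detection for the server process. One way to sanity-check that the flag actually reached a JVM (a hedged sketch; the class name is illustrative, run it e.g. as java -XX:-UseContainerSupport VmArgsCheck) is to inspect the JVM's input arguments:

```java
import java.lang.management.ManagementFactory;

public class VmArgsCheck {
    public static void main(String[] args) {
        // Lists the flags this JVM was started with; after the Dockerfile
        // change, -XX:-UseContainerSupport should appear among them.
        System.out.println("JVM args: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
        // With container support disabled, this should match the instance's
        // full core count (e.g. 4 on ml.m4.xlarge).
        System.out.println("Number of CPUs: "
                + Runtime.getRuntime().availableProcessors());
    }
}
```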

About UseContainerSupport

https://www.eclipse.org/openj9/docs/xxusecontainersupport/
https://blog.softwaremill.com/docker-support-in-new-java-8-finally-fd595df0ca54

amaharek commented 3 years ago

PR #83 has been merged.