zoran-hristov opened 3 years ago
@zoran-hristov Did you find any resolution to this issue? I am facing the same problem. Even after setting TS_DEFAULT_WORKERS_PER_MODEL=2 in config.properties, the change is not reflected in the CloudWatch logs, which clearly show Number of CPUs: 1. I used the same example as in the repo.
Yes, I found a solution. One part is noted in the subsequent Deep Learning Containers release notes (see Known issues), but the images themselves were not fixed. It is related to the OMP_NUM_THREADS environment variable, which regulates OpenMP threading; I suggest setting it to numberOfCPUs/2 or less.
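As a minimal sketch of that suggestion (assuming a Linux environment where `nproc` is available), the cap could be computed at container startup:

```shell
# Sketch: cap OMP_NUM_THREADS at half the visible CPUs, as suggested above.
half=$(( $(nproc) / 2 ))
# Keep at least one thread on single-CPU hosts.
[ "$half" -lt 1 ] && half=1
export OMP_NUM_THREADS=$half
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

The same export could also be baked into the image with an ENV or ENTRYPOINT line, depending on how the container is launched.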
The other part is to enable container support for CPU detection, especially for the JVM. So we rebuilt the image with a fix that overrides the JVM arguments.
We set this in the code, since config.properties is not used in the image. I have no explanation for why they abandoned the use of config.properties.
Here is one way to do it, by overwriting the properties files in the Dockerfile:
FROM 763104351884.dkr.ecr.eu-west-1.amazonaws.com/pytorch-inference:1.7.1-cpu-py36-ubuntu18.04
# If the standard path is not used, patch it with the following lines
RUN echo "vmargs=-XX:-UseContainerSupport" >> /opt/conda/lib/python3.6/site-packages/sagemaker_inference/etc/default-mms.properties
RUN echo "vmargs=-XX:-UseContainerSupport" >> /opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/etc/default-ts.properties
RUN echo "vmargs=-XX:-UseContainerSupport" >> /opt/conda/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/etc/mme-ts.properties
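As a quick local sanity check (a sketch only; the real files live inside the image at the paths shown above), you can replicate what the RUN lines append against a temporary stand-in file:

```shell
# Sketch: reproduce one of the RUN lines against a temporary stand-in file
# (the actual properties files exist only inside the container image).
props=$(mktemp)
echo "vmargs=-XX:-UseContainerSupport" >> "$props"
# The serving container reads this vmargs line and passes the flag to the JVM.
cat "$props"
```

With -XX:-UseContainerSupport the JVM stops honoring cgroup limits and reports the host's CPU count, which is what makes the full core count visible again.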
Thanks @zoran-hristov, this helped me resolve the issue.
Describe the bug This issue is related to the JVM bug reported in issue 82 of sagemaker-inference-toolkit.
To reproduce
Clone the SageMaker example. Deploy the model to an endpoint. Check the CloudWatch logs: the number of CPU cores detected will appear as Number of CPUs: 1. The JVM detects the CPU count as 1 even when more CPUs are available to the container.
Expected behavior The CPU count in the CloudWatch logs should match the CPU count of the instance in use, for example 4 for ml.m4.xlarge.
System information Containers: pytorch-inference:1.7-cpu-py3 and pytorch-inference:1.7-gpu-py3; SageMaker inference toolkit v1.1.2
Additional context This effectively prevents SageMaker Inference from using all CPUs on the instance.