aws / sagemaker-pytorch-inference-toolkit

Toolkit for inference and serving with PyTorch on SageMaker. Dockerfiles used for building SageMaker PyTorch containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

Specify batch size for MME #134

Open austinmw opened 1 year ago

austinmw commented 1 year ago

What did you find confusing? Please describe.
How do you specify the batch size for individual models in a multi-model endpoint (MME)?

Describe how documentation can be improved
This blog describes using environment variables to set the batch size and other parameters for a single-model endpoint; however, I haven't found any documentation on setting the batch size for individual models within an MME.
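For comparison, the single-model approach referenced above configures TorchServe through container environment variables. A minimal sketch, assuming the `SAGEMAKER_TS_*` variable names exposed by this toolkit's TorchServe environment and the `env` parameter of the SageMaker Python SDK's model classes (verify both against the current toolkit source):

```python
# Hypothetical endpoint-level TorchServe tuning via environment variables.
# These apply to the whole endpoint, not to individual models in an MME.
env = {
    "SAGEMAKER_TS_BATCH_SIZE": "8",        # requests batched per inference call
    "SAGEMAKER_TS_MAX_BATCH_DELAY": "100", # max ms to wait while filling a batch
}

# This dict would be passed as the `env` argument when constructing the
# model, e.g. sagemaker.pytorch.PyTorchModel(..., env=env).
print(env)
```

Because these are container-level settings, every model loaded onto the endpoint sees the same values, which is exactly why they don't answer the per-model MME question.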

Additional context
Each model in my MME has a MAR-INF/MANIFEST.json inside its model.tar.gz, so I tried specifying batchSize in those files, but it doesn't appear to be applied.
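For reference, TorchServe's own configuration documentation describes per-model settings (including batchSize) through the `models` property in config.properties, rather than through MAR-INF/MANIFEST.json. A sketch of that format (the model names `modelA`/`modelB` are hypothetical, and whether the SageMaker MME container reads this file per model is an open question, which is the point of this issue):

```python
import json

# Per-model TorchServe settings as documented for the `models` property
# of config.properties. Keys are model names, then model versions.
models = {
    "modelA": {"1.0": {"batchSize": 8, "maxBatchDelay": 50}},
    "modelB": {"1.0": {"batchSize": 1}},
}

# The resulting config.properties line would look like:
#   models={"modelA": {"1.0": {...}}, ...}
config_line = "models=" + json.dumps(models)
print(config_line)
```

If the container honored this, it would give per-model batch sizes; clarifying whether (and how) the MME container supports it is what this issue asks for.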