This sample solution shows how to run and scale ML inference using AWS serverless services, AWS Lambda and AWS Fargate, demonstrated with an image classification use case.
The following diagram illustrates the solution architecture for both the batch and real-time inference options.
Clone the GitHub repo:
git clone https://github.com/aws-samples/aws-serverless-for-machine-learning-inference.git
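Before running the install script, it can help to confirm the toolchain is in place. The exact tool set is an assumption here (check the repository's prerequisites); a quick availability check might look like:

```shell
# Check for commonly required tools (assumed set: AWS CLI, Node.js, CDK, Docker --
# confirm against the repository's own prerequisites list).
RESULTS=$(for tool in aws node cdk docker; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: missing"
  fi
done)
echo "$RESULTS"
```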
Navigate to the /install directory and deploy the CDK application:
./install.sh
Or, if you are using Cloud9:
./cloud9_install.sh
Enter Y to proceed with the deployment on the confirmation screen.
The solution lets you get predictions either for a set of images using batch inference or for a single image at a time using a real-time API endpoint.
Get batch predictions by uploading image files to Amazon S3.
aws s3 cp <path to jpeg files> s3://ml-serverless-bucket-<acct-id>-<aws-region>/input/ --recursive
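The bucket name embeds your account ID and Region. As a hedged sketch, the name can be assembled from shell variables and the upload command printed for review before running it; the ./images path and the variable values below are placeholders:

```shell
# Build the bucket name from placeholder values and print the upload command for review.
ACCT_ID=123456789012   # replace with your AWS account id
REGION=us-east-1       # replace with your Region
BUCKET="ml-serverless-bucket-${ACCT_ID}-${REGION}"
# --exclude/--include limit the recursive copy to JPEG files only.
UPLOAD_CMD="aws s3 cp ./images s3://${BUCKET}/input/ --recursive --exclude '*' --include '*.jpg'"
echo "${UPLOAD_CMD}"
```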
Get real-time predictions by invoking the API endpoint with an image payload.
curl -v -H "Content-Type: application/jpeg" --data-binary @<your jpg file name> <your-api-endpoint-url>/predict
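To classify several images, the single curl call above can be wrapped in a loop. The endpoint URL and image directory below are placeholders; use the URL printed in the deployment output:

```shell
# Placeholder endpoint -- replace with the URL from the CDK deployment output.
ENDPOINT="https://abc123.execute-api.us-east-1.amazonaws.com/prod"
for img in ./images/*.jpg; do
  [ -f "$img" ] || continue          # skip if the glob matched nothing
  curl -s -H "Content-Type: application/jpeg" \
       --data-binary "@$img" "${ENDPOINT}/predict"
  echo                               # newline between responses
done
```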
Navigate to the /app directory from the terminal window and run the following command to destroy all resources and avoid incurring future charges:
cdk destroy -f
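If the destroy step fails because the S3 bucket still contains uploaded images (CDK does not delete non-empty buckets unless auto-delete is configured on the stack), you may need to empty the bucket first. The bucket name follows the pattern from the upload step, and the values below are placeholders:

```shell
# Print the command that empties the inference bucket (placeholders -- fill in your values).
ACCT_ID=123456789012   # replace with your AWS account id
REGION=us-east-1       # replace with your Region
EMPTY_CMD="aws s3 rm s3://ml-serverless-bucket-${ACCT_ID}-${REGION} --recursive"
echo "${EMPTY_CMD}"
# After the bucket is empty, re-run: cdk destroy -f
```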
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.