Serverless Transformers Using AWS Lambda

Reproduce the work from the talk End2End Serverless Transformers On AWS Lambda For NLP.

Expected Outcomes

Understand how AWS Lambda works for a classification use case.
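
As a rough illustration of the classification use case, here is a minimal sketch of a Python Lambda handler. It is not the code from the referenced repository. `KeywordClassifier` and `get_model` are hypothetical stand-ins for a real transformer model and its loader, and the event is assumed to carry a JSON string under `body`, as it does with an API Gateway proxy integration. The model is cached at module level so that warm invocations can reuse it.

```python
import json


class KeywordClassifier:
    """Hypothetical stand-in for a transformer classifier (assumption)."""

    def predict(self, text):
        # Toy rule so the sketch runs without any model files.
        label = "positive" if "good" in text.lower() else "negative"
        return {"label": label}


_model = None  # module-level cache, reused while the container stays warm


def get_model():
    """Load the model once, on first use."""
    global _model
    if _model is None:
        _model = KeywordClassifier()
    return _model


def handler(event, context):
    """Entry point with the (event, context) signature Lambda uses for Python."""
    body = json.loads(event.get("body") or "{}")
    text = body.get("text", "")
    if not text:
        return {"statusCode": 400, "body": json.dumps({"error": "missing 'text'"})}
    prediction = get_model().predict(text)
    return {"statusCode": 200, "body": json.dumps(prediction)}
```

Calling `handler({"body": json.dumps({"text": "This is good"})}, None)` locally exercises the same path a deployed function would take, which makes the handler easy to test before packaging.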

References

Deck - bit.ly/serverless-transformers
Github Repository - https://github.com/bhavsarpratik/serverless-transformers-on-aws-lambda
Talk - https://hasgeek.com/fifthelephant/mlops-conference/sub/end2end-serverless-transformers-on-aws-lambda-for-5C188GEQgJZx4sBjQxQZGg

Agenda

Paradigms of deployment
  1. Live server
  2. Batch processing
  3. Serverless
Benefits of serverless
Deploying transformer models on Lambda
Exposing API
Versioning lambdas
CI/CD with GitHub Actions
Runtime limitations and consequences
Multi-tenant design for lambdas
Conclusion

Key takeaways

Learn to deploy transformers in production
Serverless can be really good for many scenarios
Get huge instant scalability with serverless
Tons of savings in cost and headache

Key points

Lambda is 6x costlier than an EC2 instance. If the sparse load is less than 1/6th of the normal load, Lambda saves money; otherwise it is a waste of money.
You pay for the amount of memory allocated and the time it runs.
Suitable for sparse loads, including batch predictions.
Deploy quickly without much MLOps work.
When a prediction takes too long, Lambda is not a good fit.
Batch predictions can be replaced with serverless.
A Lambda function can run for at most 15 minutes.
Suitable for inference.
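
The break-even rule and the runtime limit above can be sketched as a small calculation. This is only an illustration under stated assumptions: the 6x multiplier comes from these notes, `utilization` is the fraction of time the function is actually busy, the monthly server cost of 100 is a made-up figure, and the function names are hypothetical. Real bills also depend on memory size and request counts, which this ignores.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # the 15-minute runtime limit


def monthly_costs(utilization, always_on_cost, lambda_multiplier=6.0):
    """Return (always-on cost, pay-per-use cost) for a given busy fraction."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Pay-per-use is billed only while busy, at a higher unit price.
    pay_per_use = always_on_cost * lambda_multiplier * utilization
    return always_on_cost, pay_per_use


def cheaper_option(utilization, always_on_cost, lambda_multiplier=6.0):
    """Pick the cheaper option; break-even is at 1 / lambda_multiplier."""
    server, serverless = monthly_costs(utilization, always_on_cost, lambda_multiplier)
    return "lambda" if serverless < server else "ec2"


def fits_in_lambda(prediction_seconds):
    """A job longer than the runtime limit cannot finish in one invocation."""
    return prediction_seconds <= LAMBDA_MAX_SECONDS


if __name__ == "__main__":
    for busy in (0.05, 0.10, 0.25, 0.50):
        print(busy, cheaper_option(busy, always_on_cost=100.0))
    print(fits_in_lambda(20 * 60))
```

With a 6x multiplier the break-even busy fraction is 1/6, roughly 17%: a load that keeps the function busy 10% of the time favors Lambda, while 25% already favors the always-on instance.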
