Agenda
Paradigms of deployment
1. Live server
2. Batch processing
3. Serverless
Benefits of serverless
Deploying transformer models on Lambda
Exposing API
Versioning lambdas
CI/CD with GitHub Actions
Runtime limitations and consequences
Multi-tenant design for lambdas
Conclusion
Key takeaways
Learn to deploy transformers in production
Serverless is a good fit for many scenarios, especially sparse loads
Get large, near-instant scalability with serverless
Big savings in cost and operational headaches
KeyPoints
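Lambda costs about 6x as much as an EC2 instance for the same running time. If the sparse load is below 1/6 of the normal (always-on) load, Lambda saves money; above that, it wastes money.
You pay for the amount of memory allocated and for how long it runs.
Suitable for sparse loads, for example occasional batch predictions.
Deploy quickly without much MLOps effort.
When a single prediction takes too long, Lambda is not a good fit (one invocation is capped at 15 minutes).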
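The break-even rule in the first key point can be checked with a little arithmetic. The sketch below is a simplified model, not actual AWS pricing: it assumes EC2 is billed for every hour whether or not it is busy, that Lambda is billed only while running, and that Lambda costs 6x as much per unit of busy time (the figure quoted above). Per-request fees, cold starts, and the free tier are ignored, and the function names are made up for illustration.

```python
def lambda_to_ec2_cost_ratio(busy_fraction: float, price_multiplier: float = 6.0) -> float:
    """Return (Lambda cost) / (always-on EC2 cost) for the same workload.

    busy_fraction: share of wall-clock time the function is actually running (0 to 1).
    price_multiplier: how much more Lambda costs per unit of busy time
                      (assumed 6x, taken from the key points; not an AWS price).
    """
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    # EC2 cost is constant (normalised to 1); Lambda cost grows with busy time.
    return price_multiplier * busy_fraction


def break_even_busy_fraction(price_multiplier: float = 6.0) -> float:
    """Busy fraction at which Lambda and an always-on EC2 instance cost the same."""
    return 1.0 / price_multiplier


if __name__ == "__main__":
    for busy in (0.05, 1 / 6, 0.5):
        ratio = lambda_to_ec2_cost_ratio(busy)
        verdict = "Lambda cheaper" if ratio < 1 else "EC2 cheaper or equal"
        print(f"busy {busy:.0%}: Lambda/EC2 cost ratio = {ratio:.2f} ({verdict})")
```

A ratio below 1 means Lambda is the cheaper option under these assumptions; with the 6x multiplier, that holds only while the function is busy less than about 1/6 of the time.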
Reproduce the work performed on End2End Serverless Transformers On AWS Lambda For NLP
Expected Outcomes
References
KeyPoints