PrithivirajDamodaran / FlashRank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.
Apache License 2.0
595 stars 44 forks

How to deploy a FlashRank Reranker model on AWS LAMBDA as specified? #20

NeevrajKB closed 2 months ago

NeevrajKB commented 4 months ago

Hi! I'm Neevraj. I'm looking to deploy a reranking model using FlashRank. You mentioned "Lowest $" on AWS Lambda, so I was wondering how I could deploy a model on it using your framework. Also, how could I contribute to the supported models? Thanks!

PrithivirajDamodaran commented 4 months ago

How are you writing your Lambda handler? Are you using a framework like serverless.com? You should be able to write AWS Lambda handlers on your own, as HTTP POST or GET endpoints for interactive serving and with, say, AWS queues for batch serving. The deployment patterns are only meant to guide you on best practices, such as where to keep your models and how to load them on startup and serve them.
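For the interactive HTTP POST case, a minimal sketch of such a handler might look like the following. It is not an official deployment recipe. The request shape (`query` plus a list of `passages`), the `/tmp` cache directory, and the helper names `get_ranker`, `rerank`, and `rerank_fn` are all assumptions made for illustration. Only `Ranker`, `RerankRequest`, and `ranker.rerank(...)` come from FlashRank itself.

```python
import json

_ranker = None  # module-level cache, reused while the Lambda container stays warm


def get_ranker():
    """Load the model once per container instead of once per request."""
    global _ranker
    if _ranker is None:
        # Lazy import so the module can be loaded without flashrank installed.
        from flashrank import Ranker
        # Assumption: /tmp is used because it is the writable path on Lambda.
        _ranker = Ranker(cache_dir="/tmp")
    return _ranker


def rerank(query, passages):
    """Thin wrapper around FlashRank (hypothetical helper name)."""
    from flashrank import RerankRequest
    request = RerankRequest(query=query, passages=passages)
    return get_ranker().rerank(request)


def handler(event, context, rerank_fn=rerank):
    """Assumed request body: {"query": str, "passages": [{"id": ..., "text": ...}]}.

    rerank_fn is a hypothetical hook so the handler can be tested without a model;
    Lambda itself only passes event and context.
    """
    try:
        body = json.loads(event.get("body") or "{}")
        query, passages = body["query"], body["passages"]
    except (KeyError, TypeError, ValueError):
        error = {"error": "expected JSON body with 'query' and 'passages'"}
        return {"statusCode": 400, "body": json.dumps(error)}

    results = rerank_fn(query, passages)
    # default=float is a precaution in case scores are numpy scalars
    # (an assumption about the score type), which json cannot encode directly.
    return {"statusCode": 200, "body": json.dumps({"results": results}, default=float)}
```

For batch serving from a queue, the same `rerank` helper could be reused. An SQS-triggered Lambda receives its messages under `event["Records"]`, each with its own `body`, so the handler would loop over those records instead of reading a single request body.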