PrithivirajDamodaran / FlashRank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.
Apache License 2.0

Deployment on Azure #22

Open vbanai opened 3 weeks ago

vbanai commented 3 weeks ago

Hello,

I would like to ask whether a Flask app using your super cool FlashRank can be deployed to Azure as an App Service or Container Instance. I initialized FlashRank as you suggested (ranker = Ranker(model_name="ms-marco-MiniLM-L-12-v2", cache_dir="/opt")), but in the app deployed on Azure, execution stops at the point where a function calls FlashRank. Locally I have no such problem, even when I containerize the app. Have you faced such an issue? Should I consider deploying it on a more powerful VM? What do you think?

I truly appreciate all your effort creating FlashRank. Congrats, Viktor

PrithivirajDamodaran commented 3 weeks ago

Thanks for reaching out. A few people in the community and I have deployed it on both AWS Lambda and GCP Cloud Run without any issues. In fact, FlashRank is an ideal candidate for running serverless reranking services to save cost. But I haven't personally tried Azure.

Here is what I suggest: rather than saying "it stopped working", please share the logs, the exact errors, and the versions you are using. Someone in the community might be able to pitch in, and it will give us an idea of what exactly the issue is.
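
A minimal sketch of how that information could be gathered, using only the Python standard library. The helper names (`environment_report`, `timed`) and the default package list are assumptions made for illustration and are not part of FlashRank; adjust the list to whatever the app actually installs.

```python
import logging
import platform
import sys
import time
from importlib import metadata

# Log to stdout so that the hosting platform's log stream can capture it.
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
log = logging.getLogger("flashrank-diagnostics")


def package_version(name):
    """Return the installed version of a package, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


# The default package list is an assumption; edit it to match the app.
def environment_report(packages=("flashrank", "onnxruntime", "tokenizers", "flask")):
    """Collect interpreter, platform, and package versions for a bug report."""
    report = {"python": platform.python_version(), "platform": platform.platform()}
    for name in packages:
        report[name] = package_version(name)
    return report


def timed(label, func, *args, **kwargs):
    """Run func, logging its duration, or the full traceback if it raises."""
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    except Exception:
        log.exception("%s failed after %.2fs", label, time.perf_counter() - start)
        raise
    log.info("%s finished in %.2fs", label, time.perf_counter() - start)
    return result


if __name__ == "__main__":
    for key, value in environment_report().items():
        log.info("%s: %s", key, value)
```

Wrapping the initialization from the original question, for example `timed("Ranker init", Ranker, model_name="ms-marco-MiniLM-L-12-v2", cache_dir="/opt")`, would show in the deployed logs whether that call returns, raises, or never completes.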

vbanai commented 5 days ago

Thanks, truly appreciate your response.

Yes, I also managed to deploy the sample code directly to AWS Lambda and test it with Postman, but when I tried to deploy it as part of a Flask web application with Zappa, or to AWS Elastic Beanstalk, it failed.

With regard to Azure App Service, the error message from Azure looks like this (it seems to stop when the code gets to creating the FlashRank instance):

2024-05-23T13:17:54.8054943Z warn: Microsoft.AspNetCore.Server.Kestrel[22]
2024-05-23T13:17:56.3435109Z As of "05/23/2024 13:17:39 +00:00", the heartbeat has been running for "00:00:01.9722283" which is longer than "00:00:01". This could be caused by thread pool starvation.
2024-05-23T13:19:30.3993582Z warn: Microsoft.AspNetCore.Server.Kestrel[22]
2024-05-23T13:19:31.3647632Z As of "05/23/2024 13:19:18 +00:00", the heartbeat has been running for "00:00:03.1978528" which is longer than "00:00:01". This could be caused by thread pool starvation.

Do you have any idea what the error message means? Or do you know of any special trick or dependency that is needed for the deployment?

Thanks in advance for your response.