PrithivirajDamodaran / FlashRank

Lite & super-fast re-ranking for your search & retrieval pipelines. Supports SoTA listwise and pairwise reranking based on LLMs, cross-encoders, and more. Created by Prithivi Da, open for PRs & collaborations.

onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile: [ONNXRuntimeError] : 3 : NO_SUCHFILE #15

Closed by anudit 1 month ago

anudit commented 1 month ago

I ran into the following error on flashrank==0.2.4:

onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile: [ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/ms-marco-TinyBERT-L-2-v2/flashrank-TinyBERT-L-2-v2.onnx failed:Load model /tmp/ms-marco-TinyBERT-L-2-v2/flashrank-TinyBERT-L-2-v2.onnx failed. File doesn't exist

Manually changing the cache dir from /tmp to ./ fixes this. Would love to be able to do this directly through LangChain's FlashrankRerank (from langchain.retrievers.document_compressors import FlashrankRerank).
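For reference, a minimal sketch of that manual workaround when calling FlashRank directly (assuming the flashrank 0.2.x Ranker signature; the cache directory here is illustrative):

from flashrank import Ranker

# Point the model cache at the current working directory instead of the /tmp default
ranker = Ranker(model_name="ms-marco-TinyBERT-L-2-v2", cache_dir="./")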

Passing a client manually also doesn't work: it downloads the model correctly but still seems to read from the /tmp directory.

from flashrank import Ranker
from langchain.retrievers.document_compressors import FlashrankRerank

compressor = FlashrankRerank(
    model="ms-marco-TinyBERT-L-2-v2",
    top_n=5,
    client=Ranker(model_name="ms-marco-TinyBERT-L-2-v2", cache_dir="./"),
)
walterheck commented 1 month ago

I am facing the same issue

PrithivirajDamodaran commented 1 month ago

@anudit - Thanks for reaching out. FlashRank exposes cache_dir as an init param. While /tmp is considered a sensible default (model files are automatically cleaned up on restart), people running Lambda-like serverless environments cannot always use /tmp, hence cache_dir is exposed.

That being said, I don't own the LangChain integration of FlashRank, so I can neither comment on why it is not letting you use parameters like cache_dir upon init, nor reproduce it on the FlashRank end. It would be a good idea to open an issue with LangChain to modify the integration so it allows devs to take advantage of the cache_dir init param.
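Until the LangChain integration honors a caller-supplied cache_dir, a sketch of the direct FlashRank path (assumes the flashrank 0.2.x Ranker/RerankRequest API; the query, passages, and cache directory below are illustrative):

from flashrank import Ranker, RerankRequest

# cache_dir overrides the /tmp default; any writable path works
ranker = Ranker(model_name="ms-marco-TinyBERT-L-2-v2", cache_dir="./")

# Passages are dicts with an id and text (an optional "meta" dict is also supported)
passages = [
    {"id": 1, "text": "FlashRank is a lightweight reranking library."},
    {"id": 2, "text": "ONNX Runtime loads model files from the cache directory."},
]
results = ranker.rerank(RerankRequest(query="where are models cached?", passages=passages))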