The reranker shouldn't be reinitialized on every pipeline function call. Even though this doesn't consume additional CUDA or MPS memory, it still slows down the entire retrieval process.
Proposed fix: create a separate pipeline class for each of amr, swr, and fusion that holds the reranker as an attribute, so that every time reranking is needed we can call self.reranker.rerank_top_k() on the existing instance. I'm not sure how this will affect the backend side.
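A rough sketch of what this could look like, not the actual DuRAG code. The `Reranker` here is a stand-in that only counts how often it is constructed, the `rerank_top_k(query, docs, k)` signature is assumed, and `BasePipeline`, `SwrPipeline`, `AmrPipeline`, and their `retrieve` methods are hypothetical placeholders.

```python
from typing import List, Optional


class Reranker:
    """Stand-in for the project's reranker (assumption, not the real class)."""

    instances_created = 0  # lets us observe how often the "model" is loaded

    def __init__(self) -> None:
        # In the real reranker, the expensive model load would happen here.
        Reranker.instances_created += 1

    def rerank_top_k(self, query: str, docs: List[str], k: int) -> List[str]:
        # Toy scoring (assumption): count query words that appear in each doc.
        words = set(query.lower().split())
        ranked = sorted(docs, key=lambda d: -len(words & set(d.lower().split())))
        return ranked[:k]


class BasePipeline:
    """Holds one reranker for the lifetime of the pipeline object."""

    def __init__(self, corpus: List[str], reranker: Optional[Reranker] = None) -> None:
        self.corpus = corpus
        # Built once here (or injected), never inside run().
        self.reranker = reranker if reranker is not None else Reranker()

    def retrieve(self, query: str) -> List[str]:
        raise NotImplementedError

    def run(self, query: str, k: int = 2) -> List[str]:
        candidates = self.retrieve(query)
        return self.reranker.rerank_top_k(query, candidates, k)


class SwrPipeline(BasePipeline):
    def retrieve(self, query: str) -> List[str]:
        return list(self.corpus)  # placeholder for the real swr retrieval


class AmrPipeline(BasePipeline):
    def retrieve(self, query: str) -> List[str]:
        return list(self.corpus)  # placeholder for the real amr retrieval


corpus = ["cats purr", "dogs bark loudly", "dogs and cats"]
shared = Reranker()  # created once, e.g. when the backend starts
swr = SwrPipeline(corpus, shared)
amr = AmrPipeline(corpus, shared)
swr.run("dogs bark")
amr.run("dogs bark")  # reuses the same reranker instance
```

On the backend side, this would presumably mean constructing the pipeline objects once at startup and calling their methods per request, instead of calling the current module-level pipeline functions.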
https://github.com/Capstone-S17/DuRAG/blob/5014a117d045dd164189b1f3bfa0d66d8d6da320/src/pipelines/rag_swr.py#L50-L59
https://github.com/Capstone-S17/DuRAG/blob/5014a117d045dd164189b1f3bfa0d66d8d6da320/src/pipelines/rag_amr.py#L42-L78
https://github.com/Capstone-S17/DuRAG/blob/5014a117d045dd164189b1f3bfa0d66d8d6da320/src/pipelines/rag_fusion.py#L52-L60