GPU processing problem - GitHub Issues

@Teddy-XiongGZ Thank you for your work.

I have some questions about processing the entire corpus (MedCorp) on a GPU. It seems that the combined FAISS indexes would not fit on a single GPU, since their total size would exceed 80GB (the memory of the H100 we use). For instance, after indexing Wikipedia, each index is about 85GB.
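To make the memory arithmetic concrete, here is a rough sketch of how many GPUs a sharded index would need at minimum. The function name `min_gpu_shards` and the `reserve_gb` parameter (memory set aside for FAISS scratch space and query batches) are hypothetical and only for illustration; the 85GB and 80GB figures are the ones quoted above.

```python
import math


def min_gpu_shards(index_gb: float, gpu_gb: float, reserve_gb: float = 0.0) -> int:
    """Lower bound on the number of GPUs needed to hold one index.

    index_gb   -- size of the index on disk or in RAM, in GB
    gpu_gb     -- memory of a single GPU, in GB
    reserve_gb -- hypothetical per-GPU headroom kept free for temporary buffers
    """
    usable = gpu_gb - reserve_gb
    if usable <= 0:
        raise ValueError("reserve_gb must be smaller than gpu_gb")
    return math.ceil(index_gb / usable)


# One 85GB Wikipedia index on 80GB H100s needs at least two cards.
print(min_gpu_shards(85, 80))
```

This is only a lower bound: it ignores how the index is actually split and assumes the vectors are stored at the same precision on the GPU as on the CPU.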

Additionally, I couldn't find where the code uses the GPU in files such as medrag.py or utils.py. The requirements.txt also doesn't mention faiss-gpu, so I wonder if I need to modify the code to use the GPU myself.
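For reference, this is roughly the kind of change I have in mind: a minimal sketch, assuming a `faiss-gpu` build is installed and that the index type is one FAISS can copy to GPU (not every index type is supported). The helper name `shard_index_across_gpus` is hypothetical and is not part of MedRAG.

```python
def shard_index_across_gpus(cpu_index, use_float16=False):
    """Return a GPU copy of cpu_index sharded over all visible GPUs.

    Falls back to returning cpu_index unchanged when no GPU is usable.
    Hypothetical helper for illustration, not MedRAG code.
    """
    import faiss  # imported lazily; GPU features need the faiss-gpu build

    # CPU-only builds may not expose get_num_gpus, so look it up defensively.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    if get_num_gpus is None or get_num_gpus() == 0:
        return cpu_index

    co = faiss.GpuMultipleClonerOptions()
    co.shard = True                # split the vectors across GPUs instead of replicating
    co.useFloat16 = use_float16    # optionally store vectors in half precision
    return faiss.index_cpu_to_all_gpus(cpu_index, co=co)
```

With `co.shard = True` each GPU holds only part of the vectors, which is what would let an index larger than one card's memory be searched on several cards; without it, FAISS replicates the full index on every GPU.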

Second, I am trying to use MedRAG as a baseline for our project. However, because of the issue above, retrieval with rrf4 on MedCorp takes an excessive amount of time (more than 1000 hours). Could you share the retrieved evidence of rrf4 on MedCorp for the benchmark datasets (MedQA, MedMCQA, etc.) mentioned in your paper?
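For context, my understanding is that rrf4 fuses the ranked lists from four retrievers with reciprocal rank fusion, so each query has to search every retriever's index. The sketch below is a generic version of that fusion step, not MedRAG's own code; the function name `reciprocal_rank_fusion` and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Higher fused scores come first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc id for a stable result.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Three toy retrievers ranking four documents.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["a", "c", "d"],
    ["b", "a", "d"],
])
print(fused)  # ['a', 'b', 'c', 'd']
```

The fusion itself is cheap; the cost is in producing the per-retriever rankings, which is why precomputed results would help.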

Thank you.

Teddy-XiongGZ / MedRAG

GPU processing problem #10