How long does it take to evaluate on MS-MARCO?

sleeepeer / PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

https://arxiv.org/abs/2402.07867

MIT License


How long does it take to evaluate on MS-MARCO? #7

Open c0ding4ever opened 1 month ago

c0ding4ever commented 1 month ago

Evaluating on MS-MARCO seems to take significantly more time than NQ or HotpotQA. In fact, it just hangs here:

Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:27<00:27, 27.73s/it]

Have the authors (or anyone else running the repo) encountered a similar issue? Can it be resolved by simply waiting longer?

sleeepeer commented 1 month ago

Hi, thanks for pointing it out. How is everything going now? Has it been resolved? Based on the information you provided, I think you are stuck loading the model checkpoint, which is not related to the MS-MARCO dataset.
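One way to check where the process is actually stuck is to have Python dump its stack traces periodically while the slow step runs. The following is a minimal sketch using the standard-library `faulthandler` module. `watch_for_hang` and `slow_step` are hypothetical names introduced for illustration and are not part of the PoisonedRAG code; the interval is an arbitrary choice.

```python
import faulthandler
import sys
import time
from contextlib import contextmanager


@contextmanager
def watch_for_hang(interval_s: float, file=sys.stderr):
    """Dump the stack of every thread each `interval_s` seconds while the block runs."""
    # `file` must expose a real file descriptor (fileno()), e.g. sys.stderr or an open file.
    faulthandler.dump_traceback_later(interval_s, repeat=True, file=file)
    try:
        yield
    finally:
        # Stop the periodic dumps once the wrapped block finishes or raises.
        faulthandler.cancel_dump_traceback_later()


def slow_step(seconds: float) -> str:
    # Hypothetical stand-in for the call that appears to hang (e.g. model loading).
    time.sleep(seconds)
    return "done"


if __name__ == "__main__":
    # Assumption: a short interval for demonstration; a real run might use minutes.
    with watch_for_hang(interval_s=0.5):
        slow_step(1.2)
```

Wrapping the real loading call in place of `slow_step` would show, in each dump, which function the main thread is in, which helps distinguish a checkpoint-loading stall from a dataset-loading one.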

c0ding4ever commented 1 month ago

No, unfortunately the problem persists. I made several attempts, and every time it ran for more than 10 hours on an A6000 and terminated while still loading (as shown above).

sleeepeer commented 1 month ago

That looks strange. Do NQ and HotpotQA work well? Do you only have this issue with MS-MARCO?

c0ding4ever commented 1 month ago

Yes, NQ and HotpotQA both finished running. Only MS-MARCO didn't work.

sleeepeer commented 4 weeks ago

Hi, I reran the MS-MARCO experiments and they ran fine. My device is a single A100 with 80 GB of VRAM and 1 TB of RAM. The issue may come from the larger memory usage of MS-MARCO.
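To test the memory hypothesis on a Linux machine, one could check how much RAM is available before launching the run. This is a rough sketch that parses the `MemAvailable` field of `/proc/meminfo`; `mem_available_gib` is a hypothetical helper, not something the repo provides, and it assumes a Linux host.

```python
import os


def mem_available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The value is reported in KiB (labelled "kB" by the kernel).
            kib = int(line.split()[1])
            return kib / 1024**2
    raise ValueError("MemAvailable not found in meminfo text")


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only; other systems need a different source
    if os.path.exists(path):
        with open(path) as f:
            print(f"Available RAM: {mem_available_gib(f.read()):.1f} GiB")
    else:
        print("No /proc/meminfo on this system")
```

If the available figure drops toward zero while the process appears to hang, swapping or an out-of-memory kill would be a plausible explanation.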

c0ding4ever commented 3 weeks ago

Yes, that might be the case. Thank you for looking into this!

I do not have access to an A100, so I cannot test it. Maybe someone else working with MS-MARCO could verify?

sleeepeer commented 2 weeks ago

Yeah, I think we can leave this issue open to see if someone can help.