PaulLerner / ViQuAE

Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retrieval (Lerner et al., ECIR'24)
https://paullerner.github.io/ViQuAE/

oracle passage #5

Open xiewen354 opened 2 months ago

xiewen354 commented 2 months ago

Hello, I have a question about your experiment in the ECIR paper: where did the oracle passages come from? I didn't see an id field in the passages on Hugging Face, so how do I know which passage is actually needed for each question? Or should the entire Wikidata data be used for QA? That would be too long.

PaulLerner commented 2 months ago

Hi, thank you for your interest!

The reading comprehension results are perhaps best described in my previous paper ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities (Lerner et al., SIGIR’22).

You can reproduce the results by following the instructions at https://paullerner.github.io/ViQuAE/#fine-tuning-on-viquae-viquae with the oracle flag. The oracle annotation is taken from the output of the best IR model, with irrelevant passages filtered out.

PaulLerner commented 1 month ago

Hi, a bit late, but the filtering of the IR model output down to the oracle passages is described here: https://paullerner.github.io/ViQuAE/#find-relevant-passages-in-the-ir-results
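
For readers who want a feel for what "filtering IR output down to relevant passages" can look like, here is a minimal sketch. It is not the repository's implementation: the relevance criterion (a passage counts as relevant if it contains one of the answer strings after simple normalization) and the names `normalize` and `filter_relevant` are assumptions made for illustration. The linked documentation describes the actual procedure.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def filter_relevant(ranked_passages, answers):
    """Return the indices of retrieved passages that contain any answer string.

    ranked_passages: list of (passage_index, passage_text) pairs in IR rank order.
    answers: list of acceptable answer strings for the question.
    The string-match criterion is a hypothetical stand-in for the real one.
    """
    normalized_answers = [normalize(a) for a in answers]
    normalized_answers = [a for a in normalized_answers if a]
    relevant = []
    for index, passage in ranked_passages:
        # Pad with spaces so that only whole-word matches count.
        padded = f" {normalize(passage)} "
        if any(f" {a} " in padded for a in normalized_answers):
            relevant.append(index)
    return relevant


# Toy IR output: (passage index, passage text), best-ranked first.
ranked = [
    (12, "Paris is the capital of France."),
    (7, "Berlin is in Germany."),
    (3, "The Eiffel Tower is in Paris, France."),
]
oracle = filter_relevant(ranked, ["Paris"])
```

In this toy example, `oracle` keeps passages 12 and 3 in their original rank order and drops passage 7. A question with no matching passage in the IR output would end up with an empty list.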