LinWeizheDragon / FLMR

The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
What do pos_item_ids and pos_item_contents mean? #21

Open zzk2021 opened 5 days ago

zzk2021 commented 5 days ago

[image attachment, not preserved]

LinWeizheDragon commented 5 days ago

pos_item_ids is the list of document ids (the documents' indices in the retrieval corpus) that are considered "relevant" to the query. They typically come from the ground-truth annotations of the retrieval dataset. pos_item_contents is simply the content of those documents, pre-extracted for convenience.
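
To make the relationship concrete, here is a minimal sketch, not code from the FLMR repository. The toy corpus, the id format (`"doc_0"` etc.), the `question` field, and the helper `attach_pos_item_contents` are all hypothetical and only illustrate how pos_item_contents could be derived from pos_item_ids by looking each id up in the corpus.

```python
# Toy corpus mapping a document id to its text.
# The ids and texts are invented for illustration only.
corpus = {
    "doc_0": "The Eiffel Tower is in Paris.",
    "doc_1": "Mount Fuji is the tallest mountain in Japan.",
    "doc_2": "The Eiffel Tower was completed in 1889.",
}

# One hypothetical example: the query plus the ids of the documents
# annotated as relevant (the ground-truth positives).
example = {
    "question": "When was the Eiffel Tower completed?",
    "pos_item_ids": ["doc_2"],
}


def attach_pos_item_contents(example, corpus):
    """Return a copy of `example` with the content of each positive document.

    Hypothetical helper: pos_item_contents[i] is the corpus text of
    pos_item_ids[i], so the two lists stay aligned.
    """
    contents = [corpus[item_id] for item_id in example["pos_item_ids"]]
    return {**example, "pos_item_contents": contents}


example = attach_pos_item_contents(example, corpus)
print(example["pos_item_contents"])
```

Under these assumptions the two fields carry the same information in different forms: the ids point into the corpus, and the contents save you the lookup.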