facebookresearch / tart

Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.

Lost instructions #13

Open linzhu1967 opened 9 months ago

linzhu1967 commented 9 months ago

Thanks for sharing the TART code publicly! I would like to split the tart_full training set (st_train_ranker_input.json) by source dataset. My plan is to match each example's instruction to a dataset name, using the instructions listed in this file (https://github.com/facebookresearch/tart/blob/main/BERRI/berri_instructions.tsv).

However, some instructions in the training set do not appear in that file, so it is hard to tell which dataset they belong to. Two examples are "Retrieve passages from Wikipedia to answer the following question" and "Retrieve passages from Wikipedia to answer".
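A minimal sketch of the split described above, in Python. Everything about the data layout here is an assumption rather than something confirmed by the repository: the TSV is assumed to have a header row with a `dataset` column and one or more instruction columns, and each training example is assumed to expose its instruction through a caller-supplied accessor (`get_instruction` is a hypothetical helper argument, as are the function names). The point is only to show exact-match grouping and how the unmatched instructions surface.

```python
import csv
import io
from collections import defaultdict


def load_instruction_map(tsv_text, dataset_col="dataset"):
    """Map each instruction string to a dataset name.

    ASSUMPTION: the TSV has a header row, a column named `dataset_col`,
    and every other non-empty cell in a row is an instruction for that dataset.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        name = row[dataset_col]
        for col, cell in row.items():
            # Skip the dataset column, missing cells, and overflow cells.
            if col == dataset_col or not isinstance(cell, str):
                continue
            if cell.strip():
                mapping[cell.strip()] = name
    return mapping


def split_by_instruction(examples, mapping, get_instruction):
    """Group examples by dataset using exact instruction matches.

    Returns (groups, unmatched): `groups` maps dataset name to examples,
    `unmatched` maps each unknown instruction to the examples that use it.
    """
    groups = defaultdict(list)
    unmatched = defaultdict(list)
    for example in examples:
        instruction = get_instruction(example).strip()
        if instruction in mapping:
            groups[mapping[instruction]].append(example)
        else:
            unmatched[instruction].append(example)
    return dict(groups), dict(unmatched)
```

With this kind of exact matching, any instruction that is missing from the TSV ends up as a key of `unmatched`, which is where the two instructions quoted above would land.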

A similar problem has already been reported (https://github.com/facebookresearch/tart/issues/8#issuecomment-1591062115).

Could you help me resolve this? I would be very grateful!