Hi, thanks for constructing and sharing the KILT corpus. I am using Python to do some pre-processing, and I can't align the passage_id from the anchor field with the elements of the text field. In other words, the passage with passage_id i is usually not the i-th element of the text field.
Is there a rule for mapping the passage_id given in anchor to the corresponding element of the text field?
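To make the mismatch concrete, here is a minimal sketch of the kind of check described above. It rests on assumptions: that each anchor also carries start and end character offsets and the anchor text, and that the index field is named passage_id as in this post. The helper find_misaligned_anchors and the toy record are hypothetical and are not taken from the KILT code or data.

```python
def find_misaligned_anchors(record, id_key="passage_id"):
    """Return the anchors whose indexed text element does not contain the anchor span.

    Assumed (not confirmed) schema: record["text"] is a list of strings and each
    anchor is a dict with id_key, "start", "end" and "text".
    """
    passages = record["text"]
    mismatches = []
    for anchor in record["anchors"]:
        idx = anchor[id_key]
        # An id outside the list cannot be aligned at all.
        if not 0 <= idx < len(passages):
            mismatches.append(anchor)
            continue
        # Compare the character span in the indexed element with the anchor text.
        span = passages[idx][anchor["start"]:anchor["end"]]
        if span != anchor["text"]:
            mismatches.append(anchor)
    return mismatches


# Toy record made up for illustration; it is not real KILT data.
record = {
    "text": ["Title\n", "Alpha links to Beta.\n"],
    "anchors": [
        {"passage_id": 1, "start": 15, "end": 19, "text": "Beta"},
        {"passage_id": 0, "start": 15, "end": 19, "text": "Beta"},
    ],
}

misaligned = find_misaligned_anchors(record)
```

Running a check like this over a page shows which anchors do not line up with the element at their index, assuming the offsets are relative to that element.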
Thanks.