hotpotqa / hotpot

Apache License 2.0

Selecting answer span during preprocessing #17

Closed valsworthen closed 5 years ago

valsworthen commented 5 years ago

Hello,

I have trouble understanding exactly how the preprocessing script selects the answer span from the answer text. From what I understand, the function fix_span loops over all matches of the answer text and tries to select the one that best "sticks" to the boundaries of individual tokens. If a match lines up perfectly with some tokens, the first such occurrence is returned.

Is that right?

Thanks!

qipeng commented 5 years ago

Yes, I think that's correct. One could argue about whether the first match is always the ideal choice, though.
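
For readers following along, the behaviour described in this thread can be sketched roughly as below. This is a minimal illustration, not the repository's actual fix_span: the function names select_answer_span and whitespace_token_spans, the whitespace tokenization, and the distance measure (sum of character offsets to the nearest token begin and end) are all assumptions made for the example.

```python
import re


def whitespace_token_spans(text):
    """Hypothetical tokenizer: (begin, end) character offsets of whitespace-separated tokens."""
    return [m.span() for m in re.finditer(r"\S+", text)]


def select_answer_span(text, token_spans, answer):
    """Snap an occurrence of `answer` in `text` to token boundaries.

    Every occurrence is scored by how far its begin/end offsets are from the
    nearest token begin/end. The lowest score wins, and the loop stops at the
    first occurrence that aligns exactly (score 0).
    Assumes the tokens cover every non-whitespace character of `text`.
    """
    answer = answer.strip()
    begins = [b for b, _ in token_spans]
    ends = [e for _, e in token_spans]
    best_span, best_dist = None, float("inf")

    for match in re.finditer(re.escape(answer), text):
        b, e = match.span()
        # Nearest token begin that starts before the match ends.
        snapped_b = min((x for x in begins if x < e), key=lambda x: abs(x - b))
        # Nearest token end that finishes after the match starts.
        snapped_e = min((x for x in ends if x > b), key=lambda x: abs(x - e))
        dist = abs(snapped_b - b) + abs(snapped_e - e)
        if dist < best_dist:
            best_span, best_dist = (snapped_b, snapped_e), dist
            if dist == 0:
                break  # perfect alignment: keep the first such occurrence

    return best_span  # None if the answer never occurs in the text


text = "catalog cat cat"
spans = whitespace_token_spans(text)
print(select_answer_span(text, spans, "cat"))
```

In this sketch the occurrence of "cat" inside "catalog" is skipped in favour of the first standalone "cat" token, which matches the "first perfect match wins" behaviour discussed above. A later perfect match is never considered, which is the point one could argue about.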