zhiguowang / BiMPM

BiMPM: Bilateral Multi-Perspective Matching for Natural Language Sentences
Apache License 2.0

no match for some QQP IDs #58

Closed prpfialho closed 5 years ago

prpfialho commented 5 years ago

Hi,

The IDs of some examples in your split of Quora do not match the original Quora file. For example, the first line of your test.tsv is:

1 What should I do to avoid sleeping in class ? How do I not sleep in a boring class ? 50018

but in the original Quora file at https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs, id 50018 corresponds to:

50018 35003 17537 What is the cheapest, painless, easiest way to commit suicide? What is the cheapest method to commit suicide? 1

Some IDs do match, such as the one for the first example in train.tsv.

What happened? How can I relate the original Quora data and your partition?
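
One possible workaround, as a minimal sketch only: ignore the ID column and join the two files on question text instead. This assumes the split stores each row as label, question 1, question 2, id (inferred from the test.tsv line above), that the original file has a header with `id`, `question1` and `question2` columns, and that the split text differs from the original only in tokenization and casing. All function names here are hypothetical, and this does not explain why the IDs differ.

```python
import csv
import re


def normalize(text):
    # Lowercase and strip everything that is not a letter or digit, so
    # "class ?" (tokenized) and "class?" (raw) compare equal.
    return re.sub(r"[\W_]+", "", (text or "").lower())


def pair_key(q1, q2):
    # Sort the two questions so the key does not depend on their order.
    return tuple(sorted((normalize(q1), normalize(q2))))


def build_index(original_rows):
    # Map each question-pair key to the list of original ids that share it.
    index = {}
    for row in original_rows:
        key = pair_key(row["question1"], row["question2"])
        index.setdefault(key, []).append(row["id"])
    return index


def match_split(split_rows, index):
    # Yield (id stored in the split, candidate ids from the original file).
    for label, q1, q2, split_id in split_rows:
        yield split_id, index.get(pair_key(q1, q2), [])


def match_files(original_path, split_path):
    # Assumed layouts: the original has a header row, the split has none.
    with open(original_path, newline="", encoding="utf-8") as f:
        index = build_index(csv.DictReader(f, delimiter="\t"))
    with open(split_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = (r for r in reader if len(r) == 4)  # skip malformed lines
        return list(match_split(rows, index))
```

Calling `match_files` with the path to the original Quora file and the path to test.tsv would return one pair per split row. An empty candidate list means no text match was found under this normalization, and more than one candidate means the same question pair appears several times in the original file.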

Best,

Deep1994 commented 4 years ago

Hi, have you solved this problem?