Tswings / QDAMR4QA

Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering

Wrong results for QD_comp.py #2

Open xiu-ze opened 6 months ago

xiu-ze commented 6 months ago

When I ran QD_comp.py, I obtained identical subQ1 and subQ2 results. Below is the example I used along with the results I got.

# Original entry from HotpotQA (hotpot_dev_distractor_v1.json)
  {
    "_id": "5a8b57f25542995d1e6f1371",
    "answer": "yes",
    "question": "Were Scott Derrickson and Ed Wood of the same nationality?",
    "supporting_facts": [
      ["Scott Derrickson",0],
      ["Ed Wood",0]
    ],
    "context": [
     ......
    ],
    "type": "comparison",
    "level": "hard"
  }

And here is my result:

{
    "key": "5a8b57f25542995d1e6f1371",
    "subQ1": [
        "Scott Derbyson and Ed Wood are the same nationality?"
    ],
    "sec_unknown": "comparison",
    "subQ2": [
        "Scott Derbyson and Ed Wood are the same nationality?"
    ],
    "ques": "Were Scott Derrickson and Ed Wood of the same nationality?"
}

I tested the first 100 entries of hotpot_dev_distractor_v1.json, 20 of which are 'comp' (comparison) type questions. For all 20 of these questions, subQ1 and subQ2 come out nearly identical.
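
For anyone who wants to reproduce the count, here is a minimal sketch of how the duplicates could be detected. It assumes the output of QD_comp.py has been loaded as a list of dicts shaped like the result above (`key`, `subQ1`, `subQ2`, `ques`). The function name `find_identical_subquestions` and the second sample entry are hypothetical and only there for illustration; they are not part of the repository.

```python
def find_identical_subquestions(results):
    """Return the keys of entries whose subQ1 and subQ2 lists match.

    Comparison ignores case and surrounding whitespace. Entries with an
    empty subQ1 list are skipped rather than reported as duplicates.
    """
    identical = []
    for entry in results:
        q1 = [q.strip().lower() for q in entry.get("subQ1", [])]
        q2 = [q.strip().lower() for q in entry.get("subQ2", [])]
        if q1 and q1 == q2:
            identical.append(entry["key"])
    return identical


sample_results = [
    # The entry reported in this issue, copied as-is.
    {
        "key": "5a8b57f25542995d1e6f1371",
        "subQ1": ["Scott Derbyson and Ed Wood are the same nationality?"],
        "sec_unknown": "comparison",
        "subQ2": ["Scott Derbyson and Ed Wood are the same nationality?"],
        "ques": "Were Scott Derrickson and Ed Wood of the same nationality?",
    },
    # Hypothetical entry showing what a differing pair would look like.
    {
        "key": "hypothetical-id",
        "subQ1": ["What is the nationality of A?"],
        "sec_unknown": "comparison",
        "subQ2": ["What is the nationality of B?"],
        "ques": "Are A and B of the same nationality?",
    },
]

duplicates = find_identical_subquestions(sample_results)
print(f"{len(duplicates)} of {len(sample_results)} entries have identical sub-questions")
print(duplicates)
```

Running this over the full result file would give an exact count rather than "nearly identical", which may help narrow down where the decomposition goes wrong.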