haitian-sun / GraftNet

BSD 2-Clause "Simplified" License
Why do the fact files of WQSD not include the answer and question entities? #23

Closed: dinani65 closed this issue 3 years ago

dinani65 commented 3 years ago

I have checked the fact_files of some of the questions, and they do not include any triple involving the question entity or the answer entity. I expected the fact_files to include the OracleEntities and Answers for at least some of them. For example, I expected "fb:m.06w2sn5" and "fb:m.0gxnnwq" to exist in "WebQTest-0.nxhd". Could you please explain whether I am missing something?

haitian-sun commented 3 years ago

We did not use the oracle entities from the question to construct the subgraph for each question. We used the ones linked by STAGG instead. The subgraph is generated with Personalized PageRank, so it is likely that some triples containing answers are missing. In fact, we observed an answer recall of ~90%.
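For readers unfamiliar with the technique, here is a minimal sketch of Personalized PageRank over a set of triples, followed by a top-k cut. This is not the GraftNet preprocessing code: the function names, the undirected treatment of edges, the restart probability, and the iteration count are all assumptions made for illustration. It does show why an answer can be missing: an entity that is unreachable from the linked seeds, or that ranks below the cut-off, never enters the subgraph.

```python
from collections import defaultdict


def personalized_pagerank(triples, seeds, restart=0.2, iterations=30):
    """Power-iteration PPR over an undirected view of (subj, rel, obj) triples.

    Hypothetical helper: restart and iterations are illustrative defaults,
    not values taken from the GraftNet paper or code.
    """
    neighbors = defaultdict(set)
    for subj, _rel, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)

    # Only seeds that actually occur in the graph can receive restart mass.
    seeds = [s for s in seeds if s in neighbors]
    if not seeds:
        return {}

    restart_mass = {s: 1.0 / len(seeds) for s in seeds}
    scores = dict(restart_mass)
    for _ in range(iterations):
        spread = defaultdict(float)
        for node, score in scores.items():
            # Each node splits its current score evenly among its neighbors.
            share = score / len(neighbors[node])
            for nb in neighbors[node]:
                spread[nb] += share
        scores = {
            node: (1 - restart) * spread.get(node, 0.0)
            + restart * restart_mass.get(node, 0.0)
            for node in neighbors
        }
    return scores


def top_k_entities(scores, k):
    """Keep the k highest-scoring entities (ties broken by entity id)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity for entity, _ in ranked[:k]]


# Toy graph: "x" and "y" are disconnected from the seed "q".
toy_triples = [
    ("q", "r", "a"),
    ("a", "r", "b"),
    ("b", "r", "c"),
    ("x", "r", "y"),
]
toy_scores = personalized_pagerank(toy_triples, seeds=["q"])
toy_subgraph = top_k_entities(toy_scores, k=2)
```

In this toy graph, "x" and "y" receive no score because no path connects them to the seed, so they would be excluded from the subgraph no matter how large k is.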

dinani65 commented 3 years ago

Thanks for your reply. You have created subgraphs over the fact files, right? Could you please tell me how the fact_files (question-wise subgraphs in files named .nxhd) have been created?

dinani65 commented 3 years ago

I already wrote a simple script to calculate how many fact_files include the answer entities. 4591 files include the answers while the total number of fact_files is 4727.
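A check of this kind could look roughly like the sketch below. The layout of the .nxhd files is not described in this thread, so the sketch assumes one tab-separated triple (subject, predicate, object) per line; that format, the function names, and the toy data are all hypothetical.

```python
def entities_in_fact_file(lines):
    """Collect subject and object ids from tab-separated triple lines.

    Assumed format: "subject<TAB>predicate<TAB>object". Lines that do not
    have exactly three fields are skipped.
    """
    entities = set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            entities.add(parts[0])
            entities.add(parts[2])
    return entities


def answer_recall(fact_files, answers):
    """Count the questions whose fact file mentions at least one answer.

    fact_files: question id -> iterable of triple lines
    answers:    question id -> collection of answer entity ids
    Returns (covered, total).
    """
    covered = 0
    for qid, lines in fact_files.items():
        if entities_in_fact_file(lines) & set(answers.get(qid, ())):
            covered += 1
    return covered, len(fact_files)


# Toy data standing in for real fact files and answer lists.
toy_fact_files = {
    "Q0": ["fb:m.a\tfb:rel.one\tfb:m.b\n"],
    "Q1": ["fb:m.c\tfb:rel.two\tfb:m.d\n"],
}
toy_answers = {"Q0": {"fb:m.b"}, "Q1": {"fb:m.z"}}
covered, total = answer_recall(toy_fact_files, toy_answers)
recall = covered / total
```

With the counts reported above, 4591 / 4727 comes to roughly 97.1%.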

dinani65 commented 3 years ago

Thanks for your reply. You have created subgraphs over the fact files, right? Could you please tell me how the fact_files (question-wise subgraphs in files named .nxhd) have been created?

I would greatly appreciate it if you could answer my question.

haitian-sun commented 3 years ago

This is what we said in our paper.

"We use the entity linking outputs from S-MART and retrieve 500 entities from the neighborhood around the question seeds in Freebase to populate the question subgraphs. We further retrieve the top 50 sentences from Wikipedia with the two-stage process described in §2. The overall recall of answers among the subgraphs is 94.0%."

haitian-sun commented 3 years ago

You may find this page helpful. http://curtis.ml.cmu.edu/kbir/