blmoistawinde / KPCNet

Code for the WWW 2021 paper: Diverse and Specific Clarification Question Generation with Keywords

Questions from a researcher #7

Open githubzmt opened 2 years ago

githubzmt commented 2 years ago

Hello, I found that when reproducing the experimental results, using the real keyword "true" did not achieve the desired experimental effect. My results differ too much from the results in the paper. What could be the reason?

blmoistawinde commented 2 years ago

Hi,

Thanks for your question. I'm not quite sure what you mean by

the real keyword "true" did not achieve the desired experimental effect

Are you using "true" as the keyword for generation?

Can you provide more details on how you conducted your experiment (such as the bash script and files) and what your experimental results are (output file or metrics), so that we can figure out the problem?

githubzmt commented 2 years ago

Hello, thank you very much for your reply. When reproducing the experimental results of your paper, I found that my own results on the two datasets differ too much from the reported ones. KPCNet (truth) 37.38 23.63 19.38 cannot be reproduced on Home & Kitchen. Using all the code and data you provide, I can only reproduce KPCNet (truth) 16.2 15.68 14.85. Using the model you provide, the results are only somewhat higher. After thinking about it for a long time, I still don't understand the specific reason. At present, I can only reach the KPCNet result of 15.30 17.77 16.18. May I ask for your opinion?

blmoistawinde commented 2 years ago

Hi, it seems that you've reproduced the basic KPCNet, but not KPCNet(truth).

To reproduce the "truth" setting, you should remove all of these args from the predict script:

--load_kwd_edge_dir ./data/kwd_edges.npz \
--load_filter_dir ./data/kwd_filter_dict.json \
--cluster_kwd \
--threshold 0.075 \
--sample_top_k 6 \
--kwd_clusters 2 \
--sample_times 2

and add this one: --decode_use_kwd_label

In the code, --decode_use_kwd_label is mutually exclusive with the above args, so if you use them, the true keywords will not be used. With this change you should get results similar to those I reported in the paper.
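
As a rough illustration of the change described above, here is a minimal Python sketch that takes a list of predict-script arguments, drops the keyword-sampling args listed earlier, and appends --decode_use_kwd_label. It is not part of the repository: the helper name to_truth_args and the placeholder --hypothetical_other_arg are assumptions made for the example, and it assumes the flags are passed in the space-separated form shown above (not as --flag=value).

```python
# Flags to drop for the "truth" setting, mapped to the number of values
# that follow each flag on the command line (taken from the list above).
SAMPLING_FLAGS = {
    "--load_kwd_edge_dir": 1,
    "--load_filter_dir": 1,
    "--cluster_kwd": 0,   # boolean switch, takes no value
    "--threshold": 1,
    "--sample_top_k": 1,
    "--kwd_clusters": 1,
    "--sample_times": 1,
}

TRUTH_FLAG = "--decode_use_kwd_label"


def to_truth_args(argv):
    """Return a copy of argv without the sampling flags, plus TRUTH_FLAG.

    Hypothetical helper for illustration only; it does not exist in KPCNet.
    """
    out = []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token in SAMPLING_FLAGS:
            # Skip the flag itself and any values that belong to it.
            i += 1 + SAMPLING_FLAGS[token]
            continue
        out.append(token)
        i += 1
    if TRUTH_FLAG not in out:
        out.append(TRUTH_FLAG)
    return out


if __name__ == "__main__":
    base_args = [
        # Placeholder standing in for whatever other args the script uses.
        "--hypothetical_other_arg", "value",
        "--load_kwd_edge_dir", "./data/kwd_edges.npz",
        "--load_filter_dir", "./data/kwd_filter_dict.json",
        "--cluster_kwd",
        "--threshold", "0.075",
        "--sample_top_k", "6",
        "--kwd_clusters", "2",
        "--sample_times", "2",
    ]
    print(" ".join(to_truth_args(base_args)))
```

All other arguments are passed through unchanged, so the same base argument list can be reused for both the basic KPCNet run and the "truth" run.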

githubzmt commented 2 years ago

Hi, thank you very much for answering my questions.