geek-ai / irgan

IRGAN SIGIR paper experimental code

about QA #19

Closed. facingwaller closed this issue 6 years ago

facingwaller commented 6 years ago

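I have a question.

In the QA task, the negative answers produced by G are a subset of the candidates, selected by a CNN model. Would the results be better if all of the candidates were passed to D for discrimination, at the cost of more time? For example, in the sample code, G uses the CNN to score 100 candidates and picks 5 of them, and D is then trained with a CNN on those 5. Why not pass all 100 to D for training directly? (Is it because sampling takes less time?)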

wnzhang commented 6 years ago

When you are optimizing top-k performance (rather than a global pairwise ranking metric such as AUC), you normally need to focus on the ranking of the top k to 2k items rather than the overall ranking. You therefore sample top-ranked negatives instead of using all of the negative samples.
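
As a rough illustration of the point above, the sketch below draws a handful of negatives from a softmax over generator scores, so that highly ranked candidates are picked far more often than low-ranked ones. It is not the repository's code: the names `softmax` and `sample_negatives`, the temperature value, and the toy scores are all assumptions made for the example.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw generator scores into a sampling distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(candidates, scores, k, temperature=1.0, rng=None):
    """Draw k negatives, favouring candidates the generator ranks highly.

    Hypothetical helper: samples with replacement, weighted by the softmax
    of the generator scores.
    """
    rng = rng or random.Random()
    probs = softmax(scores, temperature)
    return rng.choices(candidates, weights=probs, k=k)


# Toy data (made up): 100 candidate answers, answer_0 scored highest.
candidates = [f"answer_{i}" for i in range(100)]
scores = [5.0 - 0.1 * i for i in range(100)]

# Pick 5 of the 100 candidates to hand to the discriminator.
negatives = sample_negatives(
    candidates, scores, k=5, temperature=0.2, rng=random.Random(0)
)
print(negatives)
```

With a low temperature the distribution concentrates on the top of the ranking, so the discriminator mostly sees the hard negatives that matter for top-k metrics. A high temperature approaches uniform sampling over all 100 candidates.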

facingwaller commented 6 years ago

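Thanks for the answer. Why is it enough to focus only on the negatives ranked in the top k to 2k? Is there a theoretical basis for this? Intuitively, shouldn't training on more negatives give better results? Or is the NN model simply treated as a black box, with this pattern observed from outside the system and no way to verify it?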

wnzhang commented 6 years ago

You can check out this paper to see why :) http://wnzhang.net/papers/lambdarankcf-sigir.pdf

facingwaller commented 6 years ago

Thanks for the answer. I will study the paper and then come back with further questions.