memray / OpenNMT-kpg-release

Keyphrase Generation
MIT License

Confusion of single-word phrase #5

Closed shizhediao closed 4 years ago

shizhediao commented 4 years ago

Hi, thanks for your great work on Deep Keyphrase Generation. I am confused by this statement in Section 4.3:

"In the generation of keyphrases, we find that the model tends to assign higher probabilities for shorter keyphrases, whereas most keyphrases contain more than two words. To resolve this problem, we apply a simple heuristic by preserving only the first single-word phrase and removing the rest."

What does "preserving only the first single-word phrase" mean? Is it a post-processing step? Could you give an example? Thanks so much!

memray commented 4 years ago

It is a post-processing step: all single-word phrases except the first one are thrown away. This heuristic is no longer used. We found that in many cases the models output better results when it is not applied. Hope this answers your question.

Rui
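
For readers who want a concrete example of the heuristic described above, here is a minimal sketch. It is not the code from this repository. It assumes the predictions arrive as a ranked list of strings, that "first" means the first single-word phrase in that ranking, and that words are separated by whitespace. The function name `keep_first_single_word` is hypothetical.

```python
from typing import List


def keep_first_single_word(phrases: List[str]) -> List[str]:
    """Drop every single-word phrase except the first one encountered.

    Assumes `phrases` is already ordered by model score (best first) and
    that words are whitespace-separated. Hypothetical helper, not the
    repository's implementation.
    """
    kept = []
    seen_single_word = False
    for phrase in phrases:
        is_single_word = len(phrase.split()) == 1
        if is_single_word:
            if seen_single_word:
                # A single-word phrase was already kept, so skip this one.
                continue
            seen_single_word = True
        kept.append(phrase)
    return kept


ranked = ["network", "neural network", "learning", "keyphrase generation", "model"]
print(keep_first_single_word(ranked))
# → ['network', 'neural network', 'keyphrase generation']
```

Multi-word phrases are never touched, and the relative order of the surviving phrases is preserved. As noted in the reply, the authors no longer apply this step.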