LIAAD / yake

Single-document unsupervised keyword extraction
https://liaad.github.io/yake

Discarding keywords/-phrases with certain POS tag #47

Closed by lisabecker 3 years ago

lisabecker commented 3 years ago

Dear authors, Firstly, I want to thank you for the great work you're doing!

I wonder what the best practice would be to detect and discard keyphrases that are (or are not) of a specific POS tag when using YAKE.

More specifically, I need to discard names of people and numbers for a project. I could do that after YAKE has extracted them from my corpus, but I assume it would be more efficient not to build or include keyphrases of a specific POS tag in the first place.

Thanks in advance for any hints/ideas!

rncampos commented 3 years ago

Dear Lisa,

Sorry for my late reply. Busy days. Thanks for your kind words and for using YAKE!

If I understood your question correctly, you want to discard names of people even when YAKE detects them as keyphrases. While you could do this directly in YAKE (by forking it and adapting its code), it will probably make more sense to add a post-processing layer that applies a POS filter (using a tagger such as spaCy).

Best,
Ricardo
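For readers landing on this thread, the post-processing approach suggested above could look roughly like the sketch below. It is not part of YAKE. The names `filter_keywords`, `spacy_tagger`, and `extract_and_filter` are hypothetical helpers invented for illustration, and the spaCy model name `en_core_web_sm` is an assumption (it has to be downloaded separately). The filter drops any keyphrase containing a token tagged `NUM` or labelled as a `PERSON` entity. Using the entity label for people is a choice made here, since a plain POS tag such as `PROPN` would also remove places and organisations.

```python
from typing import Callable, FrozenSet, List, Tuple

# One tagged token: (text, coarse POS tag, entity label or "").
TaggedToken = Tuple[str, str, str]
Tagger = Callable[[str], List[TaggedToken]]
Keyword = Tuple[str, float]  # (keyphrase, YAKE score; lower means more relevant)


def filter_keywords(
    keywords: List[Keyword],
    tagger: Tagger,
    banned_pos: FrozenSet[str] = frozenset({"NUM"}),
    banned_ents: FrozenSet[str] = frozenset({"PERSON"}),
) -> List[Keyword]:
    """Keep only keyphrases with no token carrying a banned POS tag or entity label."""
    kept = []
    for phrase, score in keywords:
        tokens = tagger(phrase)
        if any(pos in banned_pos or ent in banned_ents for _, pos, ent in tokens):
            continue  # discard the whole keyphrase
        kept.append((phrase, score))
    return kept


def spacy_tagger(nlp) -> Tagger:
    """Adapt a loaded spaCy pipeline to the Tagger signature used above."""
    def tag(phrase: str) -> List[TaggedToken]:
        return [(tok.text, tok.pos_, tok.ent_type_) for tok in nlp(phrase)]
    return tag


def extract_and_filter(text: str) -> List[Keyword]:
    """Hypothetical end-to-end wiring; requires `yake`, `spacy`, and a spaCy model."""
    import spacy
    import yake

    nlp = spacy.load("en_core_web_sm")  # assumed model name
    extractor = yake.KeywordExtractor(lan="en", n=3, top=20)
    keywords = extractor.extract_keywords(text)  # list of (keyphrase, score)
    return filter_keywords(keywords, spacy_tagger(nlp))
```

One caveat: tagging a short keyphrase in isolation is generally less reliable than tagging it in its original sentence, so results may need spot-checking. Passing the tagger in as a callable keeps the filter independent of any particular NLP library.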

lisabecker commented 3 years ago

Dear Ricardo, many thanks for your reply and suggestions. I indeed went for the latter approach! Best, Lisa