DerwenAI / pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
https://derwen.ai/docs/ptr/
MIT License

Ignore tokens and enrich the lemma graph #63

Closed: Albertobegue closed this issue 1 year ago

Albertobegue commented 4 years ago

Hi everyone!

The project description mentions that enriching the lemma graph would improve TextRank's performance. I saw that showing examples of this is on the project's to-do list, but I was wondering whether it works by simply adding entities to the doc before summarising, or is it more complicated? I am particularly interested in adding hyponymy relations.

And what about ignoring tokens? Your implementation already ignores some tokens depending on their POS tag. Is it possible to ignore tokens specific to our application by tagging them? If so, with what tag?

Thanks in advance for your answers!!

And thank you for this project, it is great!

ceteri commented 3 years ago

Thank you, @Albertobegue. We're working toward these capabilities with the kglab integration.
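
For anyone landing on this thread, here is a rough, dependency-free sketch of the two ideas discussed above: dropping application-specific lemmas before the graph is built, and enriching the graph with extra edges such as hyponym-to-hypernym links. It is not pytextrank's API or its actual implementation. The helper names (`build_lemma_graph`, `add_edges`, `pagerank`), the co-occurrence window, and the toy tokens are all assumptions made up for illustration.

```python
from collections import defaultdict

def build_lemma_graph(tokens, ignore=frozenset(),
                      keep_pos=("NOUN", "PROPN", "ADJ", "VERB"), window=3):
    """Build an undirected co-occurrence graph from (lemma, pos) pairs.

    Hypothetical helper: lemmas are dropped if their POS tag is not in
    `keep_pos` or if they appear in the application-specific `ignore` set.
    """
    graph = defaultdict(lambda: defaultdict(float))
    kept = [lemma for lemma, pos in tokens
            if pos in keep_pos and lemma not in ignore]
    for i, a in enumerate(kept):
        # Link each kept lemma to the next `window` kept lemmas.
        for b in kept[i + 1 : i + 1 + window]:
            if a != b:
                graph[a][b] += 1.0
                graph[b][a] += 1.0
    return graph

def add_edges(graph, pairs, weight=1.0):
    """Enrich the graph with extra relations, e.g. (hyponym, hypernym)."""
    for a, b in pairs:
        graph[a][b] += weight
        graph[b][a] += weight

def pagerank(graph, damping=0.85, iterations=50):
    """Plain weighted PageRank by power iteration over the undirected graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m passes on its rank in proportion to edge weight.
            incoming = sum(rank[m] * w / sum(graph[m].values())
                           for m, w in graph[n].items())
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank

# Toy input: (lemma, POS) pairs, as a tagger might produce them.
tokens = [("the", "DET"), ("dog", "NOUN"), ("chase", "VERB"),
          ("cat", "NOUN"), ("in", "ADP"), ("garden", "NOUN"),
          ("acme", "PROPN")]

graph = build_lemma_graph(tokens, ignore={"acme"})          # drop a custom lemma
add_edges(graph, [("dog", "animal"), ("cat", "animal")])    # hypernym links
ranks = pagerank(graph)

for lemma, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{lemma:10s} {score:.4f}")
```

In this sketch, "ignoring" a token just means it never becomes a node, and "enriching" just means adding edges (and possibly new nodes like `animal`) before ranking. Whether that matches what the kglab integration ends up doing is not something this thread confirms.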