DerwenAI / pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
https://derwen.ai/docs/ptr/
MIT License
2.15k stars, 333 forks

custom Keyword inclusion #79

Closed by Vignesh9395 1 year ago

Vignesh9395 commented 3 years ago

Problem description

I need the generated summary to include specific keywords from the input text.

Steps/code/corpus to reproduce

I need the pipeline component to accept keywords as an input parameter.

nlp.add_pipe(tr.PipelineComponent, name="textrank", last=True, custom_keywords=keywords)

For example,

import spacy
import pytextrank

# example text
text = "Apple is red. Grape is black. Banana is yellow."

# keywords
keywords = ['apple', 'red', 'yellow']

# load a spaCy model, depending on language, scale, etc.
nlp = spacy.load("en_core_web_sm")

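# desired interface only: `summarize` is not defined in this snippet
# output = summarize(text, word_count=9, custom_keywords=keywords)

# add PyTextRank to the spaCy pipeline
# (custom_keywords is the requested parameter)
tr = pytextrank.TextRank()
nlp.add_pipe(tr.PipelineComponent, name="textrank", last=True, custom_keywords=keywords)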

doc = nlp(text)

# examine the top-ranked sentences in the document
for sent in doc._.textrank.summary(limit_phrases=15, limit_sentences=2):
    print(sent)

Expected output

Apple is red. Banana is yellow.

As in the example above, I need a parameter that takes custom keywords, and those keywords must be present in the summarized text, i.e., the sentences containing the keywords should be the top-ranked sentences.

Is there a way to do this, or does the library already provide a function for it?
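
To make the requested behavior concrete, here is a minimal sketch in plain Python that does not use PyTextRank at all. It simply orders sentences by how many of the given keywords they contain. The function name `rank_by_keywords` and its parameters are hypothetical, and the regex-based sentence split is a simplifying assumption that only suits short, clean text like the example.

```python
import re

def rank_by_keywords(text, keywords, limit_sentences=2):
    """Return up to `limit_sentences` sentences, most keyword hits first."""
    # naive sentence split: break after ., ! or ? followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    wanted = {k.lower() for k in keywords}

    def hits(sentence):
        # number of distinct keywords that occur in the sentence
        tokens = set(re.findall(r"\w+", sentence.lower()))
        return len(tokens & wanted)

    # sorted() is stable, so sentences with equal hits keep document order
    ranked = sorted(sentences, key=hits, reverse=True)
    return ranked[:limit_sentences]

text = "Apple is red. Grape is black. Banana is yellow."
keywords = ["apple", "red", "yellow"]
summary = rank_by_keywords(text, keywords)
print(" ".join(summary))
```

This ignores everything TextRank contributes (graph structure, phrase ranks), so it only illustrates the ordering being asked for, not how the library would provide it.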

ceteri commented 3 years ago

Thank you @Vignesh9395, that capability is going in with the upcoming kglab integration.

Vignesh9395 commented 3 years ago

Thank you @ceteri, looking forward to it!

Ankush-Chander commented 3 years ago

Hi @Vignesh9395, can you try your use case with biased TextRank? I think that with an appropriate choice of focus and bias, you should be able to bring such sentences to the top of the summary. Please refer to sample.py for usage.
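
For intuition about why focus and bias can pull keyword sentences upward, here is a rough sketch of the general idea: a personalized PageRank over a word co-occurrence graph, where focus words receive a larger share of the teleport probability. This is not PyTextRank's implementation or API. The function `biased_rank`, its parameters (`focus`, `bias`, `default_bias`, `damping`, `iters`), and the sentence scoring rule are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def biased_rank(sentences, focus, bias=10.0, default_bias=1.0, damping=0.85, iters=50):
    """Score sentences via personalized PageRank on a word co-occurrence graph."""
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    focus_words = set(re.findall(r"\w+", focus.lower()))

    # undirected graph: two words are linked when they share a sentence
    graph = defaultdict(set)
    for tokens in tokenized:
        for a in tokens:
            for b in tokens:
                if a != b:
                    graph[a].add(b)

    # teleport distribution: focus words weigh `bias`, all others `default_bias`
    weights = {w: (bias if w in focus_words else default_bias) for w in graph}
    total = sum(weights.values())
    teleport = {w: x / total for w, x in weights.items()}

    # power iteration, starting from the teleport distribution
    rank = dict(teleport)
    for _ in range(iters):
        rank = {
            w: (1 - damping) * teleport[w]
            + damping * sum(rank[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }

    # hypothetical scoring rule: sum the ranks of a sentence's distinct words
    return [sum(rank.get(w, 0.0) for w in set(tokens)) for tokens in tokenized]

sentences = ["Apple is red.", "Grape is black.", "Banana is yellow."]
scores = biased_rank(sentences, focus="apple red yellow")

# keep the two best sentences, printed in document order
top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:2]
for i in sorted(top):
    print(sentences[i])
```

On this toy input the first and third sentences score highest, because their words hold most of the teleport mass. With an empty focus, every word gets the same weight and the three sentences tie, which is why the choice of focus and bias matters.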

@ceteri