DerwenAI / pytextrank

Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
https://derwen.ai/docs/ptr/
MIT License

Issues with adding pytextrank to spacy pipeline #100

Closed (dbragdon1 closed this issue 3 years ago)

dbragdon1 commented 3 years ago

Hello, this is a great feature, and everything was working smoothly until I upgraded spaCy to 3.0.0 and pytextrank to 2.1.0. I believe the problem has to do with how spaCy changed the way custom pipeline components are constructed.

I'm trying to run the code given in the readme:

import spacy
import pytextrank

nlp = spacy.load('en_core_web_sm')
tr = pytextrank.TextRank()
nlp.add_pipe(tr.PipelineComponent, name = 'textrank', last = True)

This gives me a ValueError:

ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. 

Expected string, but got <bound method TextRank.PipelineComponent of <pytextrank.pytextrank.TextRank object>> (name: 'textrank').

The error message goes on to explain what spaCy expects for each argument, and how to use decorators to register custom pipeline components and factories.

Is there a fix for this? Am I doing something wrong? Any help would be appreciated. Thank you.

ceteri commented 3 years ago

Hi @dbragdon1, many thanks -

Yes, the way spaCy constructs pipelines changed in 3.x, so pytextrank had to become a component factory instead. To run spaCy 3.x you now need to use pytextrank 3.x.

Have you seen issue #99? It has some more details.
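For anyone landing here with the same E966 error, here is a minimal sketch of the string-based registration that spaCy 3.x expects, assuming pytextrank 3.x and the en_core_web_sm model are installed. The helper names build_textrank_pipeline and top_phrases are hypothetical, made up for this illustration. The imports sit inside the function only so the sketch can be defined in an environment where the packages are not installed yet.

```python
def build_textrank_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy 3.x model and append the "textrank" component to it."""
    import spacy
    # Importing pytextrank 3.x registers the "textrank" factory with spaCy;
    # the module itself is not referenced again below.
    import pytextrank  # noqa: F401

    nlp = spacy.load(model_name)
    # spaCy 3.x takes the registered factory name as a string, not a callable
    # such as tr.PipelineComponent.
    nlp.add_pipe("textrank", last=True)
    return nlp


def top_phrases(nlp, text, limit=5):
    """Return (text, rank, count) tuples for the highest-ranked phrases."""
    doc = nlp(text)
    return [(p.text, p.rank, p.count) for p in doc._.phrases[:limit]]
```

Calling top_phrases(build_textrank_pipeline(), some_text) should then give the ranked phrases for some_text. The exact ranks depend on the model and on the pytextrank version.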

dbragdon1 commented 3 years ago

Ah I see. I think I assumed I was on the latest version after upgrading through pip. Thanks for your help. Keep up the good work!

ceteri commented 3 years ago

Thank you kindly!

FWIW, pip install -U pytextrank should install version 3.0.1, although I've encountered lots of oddness lately with both pip and conda, and both seem to be more sensitive to whether the tools themselves are up to date.
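Since an upgrade through pip may not land on the version you expect, a quick check of what is actually installed can help. This is a rough sketch using only the standard library (importlib.metadata, Python 3.8+). The helper names are hypothetical, and the rule it encodes is just the one stated in this thread: spaCy 3.x needs pytextrank 3.x.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Extract the leading integer from a version string such as '3.0.1'."""
    return int(version.split(".")[0])


def needs_pytextrank_upgrade(spacy_version, pytextrank_version):
    """True when spaCy is at 3.x or later but pytextrank is still below 3.x."""
    return major_version(spacy_version) >= 3 and major_version(pytextrank_version) < 3
```

For the combination reported in this issue, needs_pytextrank_upgrade("3.0.0", "2.1.0") returns True. Passing installed_version("spacy") and installed_version("pytextrank") checks the current environment instead, once both are confirmed not to be None.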