JasonKessler / scattertext

Beautiful visualizations of how language differs among document types.
Apache License 2.0

AttributeError: module 'pytextrank' has no attribute 'TextRank' #92

Closed Anthonyive closed 3 years ago

Anthonyive commented 3 years ago

Steps to Reproduce

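import scattertext as st
from scattertext import AssociationCompactor

# convention_df: DataFrame with a 'category' column and a 'parse' column of spaCy Docs
corpus = st.CorpusFromParsedDocuments(
    convention_df,
    category_col='category',
    parsed_col='parse',
    feats_from_spacy_doc=st.PyTextRankPhrases()
).build(
).compact(
    AssociationCompactor(2000, use_non_text_features=True)
)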

Error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)

...

~/.local/share/virtualenvs/.../lib/python3.8/site-packages/scattertext/features/PyTextRankPhrases.py in get_doc_metadata(self, doc)
     32         import pytextrank
     33         phrase_counter = Counter()
---> 34         tr = pytextrank.TextRank()
     35         tr.doc = doc
     36         phrases = tr.calc_textrank()

AttributeError: module 'pytextrank' has no attribute 'TextRank'

Expected behavior

No error.

Environment

Additional context

It looks like pytextrank no longer has a TextRank attribute. Maybe the code in scattertext/features/PyTextRankPhrases.py should stop calling tr = pytextrank.TextRank() and use the spaCy pipeline component instead.

JasonKessler commented 3 years ago

Thanks for the bug report. I ran into some issues using PyTextRank to parse multiple documents with the same spaCy Language instance. Not sure if these persist in PTR v3, but the easiest thing to do would be to explicitly use an earlier version.
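
For anyone who wants to check which side of the change an environment is on, here is a minimal sketch. It assumes the API break coincides with PyTextRank major version 3, as the "PTR v3" wording suggests; the helper names are hypothetical and not part of Scattertext or PyTextRank.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '3.2.1'."""
    return int(version_string.split(".")[0])


def installed_pytextrank_major():
    """Major version of the installed pytextrank, or None if it is not installed."""
    try:
        return major_version(metadata.version("pytextrank"))
    except metadata.PackageNotFoundError:
        return None


major = installed_pytextrank_major()
if major is None:
    print("pytextrank is not installed")
elif major >= 3:
    # Assumption: v3 and later no longer expose pytextrank.TextRank
    print("PyTextRank v3 or later: pin an earlier version or use a Scattertext release that supports v3")
else:
    print("PyTextRank before v3: the pytextrank.TextRank() call should still exist")
```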

JasonKessler commented 3 years ago

I just released v0.1.2 which enables Scattertext to work with PTR v3. The results on the convention data set look a little better using the new version.

hmltn-0 commented 2 years ago

How’s this going? I just got the same error, following the documentation:

https://spacy.io/universe/project/spacy-pytextrank

pytextrank has no attribute “TextRank”. It doesn’t appear in dir(pytextrank):

['BaseTextRank', 'BaseTextRankFactory', 'BiasedTextRank', 'BiasedTextRankFactory', 'Language', 'Lemma', 'MIN_PY_VERSION', 'Paragraph', 'Phrase', 'PositionRank', 'PositionRankFactory', 'Sentence', 'StopWordsLike', 'VectorElem', '_DEFAULT_CONFIG', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', '_check_version', '_create_component_br', '_create_component_pr', '_create_component_tr', '_versify', 'base', 'biasedrank', 'default_scrubber', 'filter_quotes', 'groupby_apply', 'maniacal_scrubber', 'pathlib', 'positionrank', 'split_grafs', 'typing', 'util', 'version']

I guess that spaCy page is just outdated, because the official PTR GitHub repository has different instructions.
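
The dir() listing above suggests a quick way to tell the two API styles apart without reading version numbers. This is only a sketch based on the names visible in this thread: the function name and the "legacy"/"pipeline" labels are made up for illustration.

```python
def pytextrank_api_style(module):
    """Classify a pytextrank-like module by the attributes it exposes.

    'legacy'   -> has a TextRank class (what the traceback in this issue expects)
    'pipeline' -> has BaseTextRankFactory (as in the dir() listing above)
    'unknown'  -> neither attribute is present
    """
    if hasattr(module, "TextRank"):
        return "legacy"
    if hasattr(module, "BaseTextRankFactory"):
        return "pipeline"
    return "unknown"
```

With PyTextRank installed, calling pytextrank_api_style(pytextrank) after import pytextrank should report which style is available in the current environment.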

JasonKessler commented 2 years ago

It looks like the spaCy documentation isn't current with the latest version of PyTextRank. You may want to raise an issue with them.

Scattertext's documentation, however, is current. Please follow that.


Dopamine-TX commented 2 years ago

nlp.add_pipe("textrank")
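
To put that one-liner in context, here is a hedged sketch of how the component is typically registered and used with PyTextRank v3 and spaCy 3. The model name en_core_web_sm and the helper names ensure_textrank and top_phrases are assumptions for illustration; they do not come from Scattertext or from this thread.

```python
def ensure_textrank(nlp):
    """Add the 'textrank' component to a spaCy pipeline once; safe to call repeatedly."""
    if "textrank" not in nlp.pipe_names:
        nlp.add_pipe("textrank")
    return nlp


def top_phrases(text, model="en_core_web_sm", limit=5):
    """Return (phrase text, rank) pairs for the highest-ranked phrases in text."""
    import spacy
    import pytextrank  # noqa: F401  (importing registers the "textrank" factory)

    nlp = ensure_textrank(spacy.load(model))  # model name is an assumption
    doc = nlp(text)
    # doc._.phrases is populated by the textrank component, ordered by rank
    return [(phrase.text, phrase.rank) for phrase in doc._.phrases[:limit]]
```

The imports sit inside top_phrases so that ensure_textrank can be reused with an nlp object that was already loaded elsewhere.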