chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

'divrank' and 'bestcoverage' not working #199

Closed mukund109 closed 6 years ago

mukund109 commented 6 years ago

Current Behavior

The 'rank_nodes_by_divrank' and 'rank_nodes_by_bestcoverage' functions fail at the step commented '# replace node number by node value', where they call the dict constructor incorrectly.

Steps to Reproduce (for bugs)

import textacy
import textacy.keyterms as keyterms

text = """Provide a convenient entry point and interface to one or many documents, with the core processing delegated to spaCy. Stream text, json, csv, spaCy binary"""

doc = textacy.Doc(text, lang='en')
keyterms.key_terms_from_semantic_network(doc, ranking_algo='divrank')

raises:

ValueError: dictionary update sequence element #0 has length 10; 2 is required

at:

File "textacy/keyterms.py", line 717, in rank_nodes_by_divrank
    nodes_list = list(dict(graph.nodes()))


mukund109 commented 6 years ago

It seems this has been reported before, and although a fix was made, the networkx requirement wasn't updated.

I checked, and it works with networkx 2.1.

So I think updating your requirements.txt might fix it, unless other code remains that is only compatible with networkx 1.11.
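
For anyone hitting the same error, here is a rough sketch of why the call behaves differently across versions. It assumes that networkx 1.x returns a plain list from graph.nodes() while 2.x returns a mapping-like view; plain Python containers stand in for both so the snippet runs without networkx, and the node strings are made up for illustration.

```python
# Stand-in for graph.nodes() under networkx 1.x (assumed: a plain list of nodes).
nodes_1x = ["convenient", "entry", "point"]

# dict() treats each list element as a (key, value) pair. The first element is
# a 10-character string, not a pair, so the constructor raises ValueError.
try:
    dict(nodes_1x)
except ValueError as err:
    message = str(err)

print(message)
# dictionary update sequence element #0 has length 10; 2 is required

# Stand-in for graph.nodes() under networkx 2.x (assumed: mapping-like,
# node -> attribute dict). dict() accepts a mapping, so the same call succeeds.
nodes_2x = {"convenient": {}, "entry": {}, "point": {}}
nodes_from_2x = list(dict(nodes_2x))

# Iterating the container directly sidesteps dict() and works for both shapes.
portable_1x = list(nodes_1x)
portable_2x = list(nodes_2x)
```

If the goal is only a list of nodes, iterating the container directly (the last two lines) does not depend on which shape is returned.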

bdewilde commented 6 years ago

Hey @mukund109, thanks for the bug report. I just pushed a fix (I hope) to master. If it doesn't resolve your issue, please reopen and let me know!