chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

GroupVectorizer - NameError: name 'doc_term_matrix' is not defined #215

Closed NudnikShpilkis closed 5 years ago

NudnikShpilkis commented 5 years ago

Fitting a GroupVectorizer raises a NameError.

Expected Behavior

GroupVectorizer.fit() should return the fitted GroupVectorizer instead of raising an error.

Current Behavior

Calling fit() on a GroupVectorizer with docs and groups raises NameError: name 'doc_term_matrix' is not defined

Possible Solution

Change the reference on line 1025 of vectorizers.py from doc_term_matrix to grp_term_matrix
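
To illustrate the kind of slip described above, here is a minimal, self-contained sketch. It is not textacy's code: ToyGroupVectorizer, its buggy flag, and the n_groups attribute are all hypothetical names made up for this example. It only shows how referencing a name that was never bound in the method (doc_term_matrix) raises NameError at call time, and how pointing at the name that is bound (grp_term_matrix) avoids it.

```python
from collections import Counter


class ToyGroupVectorizer:
    """Hypothetical stand-in for illustration; NOT textacy's implementation."""

    def __init__(self, buggy=False):
        # `buggy` is an invented switch that toggles the bad reference.
        self.buggy = buggy
        self.n_groups = None

    def fit(self, tokenized_docs, grps):
        # Accumulate term counts per group rather than per document.
        counts = {}
        for doc, grp in zip(tokenized_docs, grps):
            counts.setdefault(grp, Counter()).update(doc)
        grp_term_matrix = counts  # the only matrix-like name bound here

        if self.buggy:
            # Copy-paste slip: doc_term_matrix was never assigned,
            # so Python raises NameError when this line executes.
            self.n_groups = len(doc_term_matrix)  # noqa: F821
        else:
            self.n_groups = len(grp_term_matrix)
        return self


docs = [["hello", "my"], ["hello", "name"], ["the", "fox"]]
groups = ["intro", "intro", "arb"]

try:
    ToyGroupVectorizer(buggy=True).fit(docs, groups)
    message = ""
except NameError as err:
    message = str(err)

fixed = ToyGroupVectorizer().fit(docs, groups)
```

Because Python resolves names only when a line runs, a mistake like this passes import and construction and surfaces only once fit() is called, which is consistent with the error appearing at fit time in the report.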

Steps to Reproduce (for bugs)

from textacy.vsm import GroupVectorizer

docs = [
    ["hello", "my", "name", "is", "john smith"],
    ["hello", "my", "name", "is", "king", "george"],
    ["the", "lazy", "fox", "jumped", "over", "the", "dog"],
]
groups = ["intro", "intro", "arb"]

GroupVectorizer(tf_type='bm25', apply_idf=True, idf_type='smooth', apply_dl=True, dl_type='linear').fit(docs, groups)

Context

Your Environment

Python 3.5.5, textacy 0.6.2

bdewilde commented 5 years ago

Hi @isaachaberman, thanks for catching this! Pushing a fix (and test) momentarily.