chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

to_bag_of_terms() got an unexpected keyword argument 'ngrams' #335

Closed paulakeen closed 2 years ago

paulakeen commented 3 years ago

I've installed the latest version of textacy, and code that worked in previous versions has stopped working when extracting bags of terms. I keep getting this error:

    bot = doc._.to_bag_of_terms(ngrams=(1,2, 3), entities = True, weighting = "count", as_strings = True)
    TypeError: to_bag_of_terms() got an unexpected keyword argument 'ngrams'

I've tried several times on two different operating systems and got the same behaviour. The code I used to test this is very simple and taken from the documentation:

    import textacy

    text = 'This is the text from a document that would have been extracted previously.'
    en = textacy.load_spacy_lang("en_core_web_sm", disable=("parser",))
    doc = textacy.make_spacy_doc(text, lang=en)
    doc = en(text)
    print(doc._.preview)
    # On the latest release, the next line raises the TypeError shown above.
    bot = doc._.to_bag_of_terms(ngrams=(1, 2, 3), entities=True, weighting="count", as_strings=True)

mzeidhassan commented 3 years ago

Hi @paulakeen, this code seems to be working fine on my end. It seems the quickstart page needs to be updated to match what exists at (https://textacy.readthedocs.io/en/latest/api_reference/lang_doc_corpus.html?highlight=doc._.to_bag_of_terms(ngrams%3D(1%2C2%2C%203)%2C%20entities%20%3D%20True%2C%20weighting#textacy.extensions.to_bag_of_terms)

Hope this helps.

bdewilde commented 3 years ago

Hi @paulakeen, @mzeidhassan is correct: the API for this method changed in the latest release, and I mistakenly failed to update it in the quickstart doc. The API reference linked to above is correct! I'll see about updating the quickstart sometime soon.
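
For anyone hitting the same TypeError after an upgrade, here is a minimal sketch of how one might check which keyword arguments a function accepts and translate old names before calling it. It uses only the standard library. The helper names (`accepted_keywords`, `translate_kwargs`), the stand-in function, and especially the old-to-new mapping (`ngrams` to `ngs`, `entities` to `ents`) are assumptions made for illustration, not something confirmed in this thread; check the API reference for your installed version before relying on them.

```python
import inspect
from typing import Any, Callable, Dict, Mapping, Set

# ASSUMED old-name -> new-name mapping, for illustration only.
# Confirm the real parameter names in the API reference for your version.
RENAMED_KWARGS: Dict[str, str] = {"ngrams": "ngs", "entities": "ents"}


def accepted_keywords(func: Callable[..., Any]) -> Set[str]:
    """Return the names of the parameters that `func` accepts by keyword."""
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    params = inspect.signature(func).parameters.values()
    return {p.name for p in params if p.kind in keyword_kinds}


def translate_kwargs(
    func: Callable[..., Any],
    kwargs: Mapping[str, Any],
    renamed: Mapping[str, str] = RENAMED_KWARGS,
) -> Dict[str, Any]:
    """Rename old keywords, then fail with a readable message if any are unknown."""
    params = inspect.signature(func).parameters.values()
    takes_var_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params)
    allowed = accepted_keywords(func)

    translated: Dict[str, Any] = {}
    for name, value in kwargs.items():
        # Keep names the function already accepts; otherwise try the mapping.
        new_name = name if name in allowed else renamed.get(name, name)
        translated[new_name] = value

    unknown = sorted(set(translated) - allowed)
    if unknown and not takes_var_kwargs:
        raise TypeError(
            f"unsupported keyword(s) {unknown}; accepted: {sorted(allowed)}"
        )
    return translated


# Hypothetical stand-in for a function whose keywords were renamed.
def new_style_bag_of_terms(doc, *, weighting="count", ngs=None, ents=None):
    return {"doc": doc, "weighting": weighting, "ngs": ngs, "ents": ents}


old_call = {"ngrams": (1, 2, 3), "entities": True, "weighting": "count"}
new_call = translate_kwargs(new_style_bag_of_terms, old_call)
result = new_style_bag_of_terms("some doc", **new_call)
```

Passing an old keyword with no entry in the mapping, such as `as_strings`, makes `translate_kwargs` raise a TypeError that lists the accepted names, which is often quicker to read than the original traceback. With textacy installed, one could presumably pass `doc._.to_bag_of_terms` as `func`, though that is untested here.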