chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

always UNICODE errors #282

Closed shahmohamadi closed 4 years ago

shahmohamadi commented 4 years ago

I can't do anything with the textacy.preprocess functions. All of them give me some Unicode-related error, even for simple examples like this one:

raw_text = """ The best programs, are the ones written when the programmer is supposed to be working on something else.Mike bought the book for $50 although in Paris it will cost $30 dollars. Don’t document the problem, fix it.This is from https://twitter.com/codewisdom?lang=en. """

textacy.preprocess.remove_punct(raw_text)

<class 'tuple'>: (<class 'AttributeError'>, AttributeError("module 'textacy.constants' has no attribute 'PUNCT_TRANSLATE_UNICODE'",), <traceback object at 0x7f2957df9248>)

bdewilde commented 4 years ago

Hi @shahmohamadi, which versions of textacy and Python are you using? In the latest version, textacy.preprocess.remove_punct() no longer exists...
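
One quick way to answer the version question is to print both versions from the interpreter that raises the error. This is a minimal sketch using only the standard library (Python 3.8+ for `importlib.metadata`). The helper name `installed_version` is made up for illustration and is not part of textacy.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version and the textacy release found on this environment.
print("Python :", sys.version.split()[0])
print("textacy:", installed_version("textacy"))
```

If the second line prints `None`, textacy is not installed in the environment that this interpreter uses.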

mzeidhassan commented 4 years ago

@shahmohamadi You may want to check this URL: https://chartbeat-labs.github.io/textacy/api_reference/text_processing.html?highlight=remove_punct#textacy.preprocessing.remove.remove_punctuation
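
As a rough illustration of the renamed API, the sketch below tries the import path given in the linked reference (`textacy.preprocessing.remove.remove_punctuation`) and falls back to a standard-library stand-in if that import fails. The fallback `strip_punct` and the table `PUNCT_TABLE` are hypothetical names written for this example. They approximate punctuation removal by replacing Unicode punctuation with spaces, which is an assumption about the intended behavior and not textacy's own code. The function name may also differ in other textacy releases.

```python
import sys
import unicodedata

# Map every Unicode punctuation code point (general category "P*") to a space.
PUNCT_TABLE = dict.fromkeys(
    (
        i
        for i in range(sys.maxunicode + 1)
        if unicodedata.category(chr(i)).startswith("P")
    ),
    " ",
)


def strip_punct(text: str) -> str:
    """Stdlib stand-in (hypothetical): replace Unicode punctuation with spaces."""
    return text.translate(PUNCT_TABLE)


try:
    # Name taken from the linked API reference; it may differ in other releases.
    from textacy.preprocessing.remove import remove_punctuation
except Exception:
    # textacy is missing, or this release names the function differently.
    remove_punctuation = strip_punct

print(remove_punctuation("Don’t document the problem, fix it."))
```

Currency symbols such as `$` fall in the Unicode symbol categories, not the punctuation ones, so the stand-in leaves them untouched.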

shahmohamadi commented 4 years ago

@bdewilde @mzeidhassan Thank you. That fixed it.