roshan-research / hazm

Persian NLP Toolkit
https://www.roshan-ai.ir/hazm/
MIT License

nltk issue may cause error on your library(MACOSX) #291

Closed Amirlashkar closed 1 year ago

Amirlashkar commented 1 year ago

I already had both nltk and hazm installed, and importing either one raises the same error. The traceback below is from importing hazm:

['Credit Risk', 'Supply Risk']
Traceback (most recent call last):
  File "/Users/albk/Documents/Code/LearningProjects/NLP/hazm.py", line 1, in <module>
    from hazm import *
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/hazm/__init__.py", line 20, in <module>
    from hazm.pos_tagger import POSTagger
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/hazm/pos_tagger.py", line 10, in <module>
    from nltk.tag import stanford
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/nltk/__init__.py", line 138, in <module>
    from nltk.text import *
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/nltk/text.py", line 29, in <module>
    from nltk.tokenize import sent_tokenize
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/nltk/tokenize/__init__.py", line 65, in <module>
    from nltk.tokenize.casual import TweetTokenizer, casual_tokenize
  File "/Users/albk/anaconda3/lib/python3.10/site-packages/nltk/tokenize/casual.py", line 215, in <module>
    HANG_RE = regex.compile(r"([^a-zA-Z0-9])\1{3,}")
AttributeError: module 'regex' has no attribute 'compile'

I'm on a Mac, and that seems to be relevant to the error.

Solution: open the "nltk/tokenize/casual.py" file and replace every reference to "regex" with "re", so that nltk uses Python's built-in regular expression module.

sir-kokabi commented 1 year ago

Hi. Which versions of hazm and nltk are you using?

Amirlashkar commented 1 year ago

"Hi. Which versions of hazm and nltk are you using?"

Name: hazm Version: 0.9.3

Name: nltk Version: 3.8.1

sir-kokabi commented 1 year ago

Please share a short code sample (preferably on Colab) that reproduces this error.

imani commented 1 year ago

"Hi. Which versions of hazm and nltk are you using?"

Name: hazm Version: 0.9.3

Name: nltk Version: 3.8.1

Check your regex version. I tested with version 2.5.129 and did not get any error.
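One way to run that check is sketched below. It is only a diagnostic sketch, not something taken from hazm or nltk: the helper name describe_module is hypothetical, and the idea that a local file named regex.py might be shadowing the installed package is an assumption that nobody in this thread has confirmed. The script reports where the module named "regex" is loaded from, which version it reports, and whether it exposes compile().

```python
import importlib


def describe_module(name="regex"):
    """Report where a module is loaded from and whether it exposes compile().

    describe_module is a hypothetical helper written for this thread.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The module is not installed (or not importable) in this environment.
        return {"name": name, "found": False}
    return {
        "name": name,
        "found": True,
        # Assumption: a path inside your own project folder, rather than
        # site-packages, would suggest a local file is shadowing the package.
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_compile": hasattr(module, "compile"),
    }


if __name__ == "__main__":
    for key, value in describe_module("regex").items():
        print(f"{key}: {value}")
```

If has_compile comes back False, comparing the reported file path with the site-packages directory shown in the traceback may help narrow down which module is actually being imported.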