Closed upasana-mittal closed 2 years ago
I'm getting the same error...does anyone know what's wrong?
Reason for the KeyError: pke relies on the nltk library for its language codes, and pke's "langcodes.py" has no language code for 'hinglish'.
Solution: your home directory contains an "nltk_data" folder. Inside nltk_data/corpora/stopwords there is a file named 'hinglish'. Remove that file from the folder and the error should go away.
Where can I find the "nltk_data" folder in Colab?
Check the path that nltk downloads to. The data is normally stored under the /root/ directory. In Colab you can reach the root directory from the file pane on the left by clicking "..." (more options), which is visible beside the sample folder.
You can simply run: !rm /root/nltk_data/corpora/stopwords/hinglish
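For anyone who is not sure where nltk_data ended up, the same removal can be sketched in plain Python. This is only a rough sketch: the helper names `candidate_nltk_dirs` and `remove_stopword_file` are made up for this example, and the list of directories it checks (the home directory, /root, and the NLTK_DATA environment variable) is an assumption, not an exhaustive list of where nltk may store data.

```python
import os
from pathlib import Path


def candidate_nltk_dirs():
    """Return directories where nltk_data commonly lives (assumed, not exhaustive)."""
    dirs = [Path.home() / "nltk_data", Path("/root/nltk_data")]
    # NLTK_DATA may hold several paths separated by os.pathsep.
    env = os.environ.get("NLTK_DATA", "")
    dirs += [Path(p) for p in env.split(os.pathsep) if p]
    return dirs


def remove_stopword_file(language, search_dirs=None):
    """Delete corpora/stopwords/<language> under each directory; return removed paths."""
    if search_dirs is None:
        search_dirs = candidate_nltk_dirs()
    removed = []
    for base in search_dirs:
        target = Path(base) / "corpora" / "stopwords" / language
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed


# Example call (left commented out so nothing is deleted by accident):
# print(remove_stopword_file("hinglish"))
```

An empty list back means no such file was found in the directories checked, so the data is probably stored somewhere else.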
By the way, removing the file did not work for me.
By the way, I did not face the issue with the latest version.
I had the issue because I was installing from a commit hash. Since I switched to installing from the full git repository, it works fine. No more errors.
pip install git+https://github.com/boudinfl/pke.git
As said earlier in the thread, please update to the latest version.
If you are using pke with an unsupported language, please provide custom stopwords using the stoplist argument, as such:
import pke

shadok_stoplist = ['ga', 'zo']
# Obtained via a custom POS tagging tool or manual annotation
preprocessed_document = [
    [('ga', 'DET'), ('bu', 'NOUN'), ('zo', 'AUX'), ('meu', 'ADJ'), ('.', 'PUNCT')]
]
e = pke.unsupervised.MultipartiteRank()
e.load_document(
    preprocessed_document, language='shadok',
    stoplist=shadok_stoplist, normalization=None)
I am getting this error while importing pke:
get_alpha_2 = lambda l: LANGUAGE_CODE_BY_NAME[l]
KeyError: 'hinglish'
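To see why one extra stopwords file is enough to break the import, the failing lookup can be reproduced in isolation. This is a simplified illustration, not pke's actual code: only the names `get_alpha_2` and `LANGUAGE_CODE_BY_NAME` come from the traceback, while the table contents and the helpers `map_strict` and `map_tolerant` are invented for this example.

```python
# Hypothetical, shortened stand-in for the name-to-code table.
LANGUAGE_CODE_BY_NAME = {"english": "en", "french": "fr"}

# Same shape as the line shown in the traceback.
get_alpha_2 = lambda l: LANGUAGE_CODE_BY_NAME[l]


def map_strict(stopword_files):
    """Map every stopwords file name to a code; raises KeyError on an unknown name."""
    return {get_alpha_2(name): name for name in stopword_files}


def map_tolerant(stopword_files):
    """Same mapping, but silently skip names that have no known code."""
    return {
        LANGUAGE_CODE_BY_NAME[name]: name
        for name in stopword_files
        if name in LANGUAGE_CODE_BY_NAME
    }


files = ["english", "french", "hinglish"]
try:
    map_strict(files)
except KeyError as err:
    print("KeyError:", err)

print(map_tolerant(files))
```

The strict version fails as soon as it meets a file name with no entry in the table, which would explain why deleting the 'hinglish' file helps. The tolerant version shows one way such a lookup could skip unknown names instead.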