-
```python
import rutokenizer
#Traceback (most recent call last):
#
# File "", line 1, in
# import rutokenizer
#ModuleNotFoundError: No module named 'rutokenizer'
import rupostagger
…
```
-
Hey, great repository.
I'd like to add Telugu support. If you have a framework I should follow to download the Telugu Wikipedia and train on it, I'd love some instructions to get going.
-
I am using the news dataset from [Kaggle](https://www.kaggle.com/snapcrack/all-the-news).
I am using a Spark NLP pipeline to preprocess the data.
Link to the [notebook](https://colab.research.google.c…
-
Hello,
With the new version, where you added the possibility of building a hierarchy, the number of topics can strangely collapse to only a few topics.
Here is an example. I create a corpus with two types of documents…
-
As I am analyzing a large corpus, I concatenated all existing texts as suggested in the description (i.e. by joining all texts with two line breaks between them) and set the parameter `token…
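
For reference, the concatenation step described above can be sketched roughly as follows. This is only an illustration: `join_documents` and the sample texts are hypothetical names made up for this example, not part of the library in question.

```python
def join_documents(texts):
    """Join documents into one string, separated by two line breaks."""
    # Strip surrounding whitespace so each document boundary is exactly "\n\n".
    return "\n\n".join(t.strip() for t in texts)


# Hypothetical sample documents, for illustration only.
docs = ["First document.", "Second document.\n", "Third document."]
combined = join_documents(docs)
print(combined)
```

Note that this assumes the individual texts contain no blank lines of their own; otherwise a single text would be split into several documents at those points.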
-
`classla.download("sl")` downloads five files totaling 2.1GB. I tried to `zip` the five files and got 1.4GB. To speed up the download, would you kindly prepare a zipped model?
-
## Feature description
The lemmatizer only accepts NOUN, ADJ, and VERB as the part-of-speech parameter. Can you please add ADVERB to it as well? See https://github.com/explosion/spaCy/blob/develop/spacy/lemmatizer.py
-
It seems the word "DEV" triggers an error.
```python
#!/usr/bin/python3
import spacy
from spacy_lefff import LefffLemmatizer, POSTagger
nlp = spacy.load('fr_core_news_md')
pos = POSTagger()
french_lemmati…
```
-
I ran into a problem when I tried to run your command:
python prep-text.py data/sample/month1 data/sample/month2 data/sample/month3 -o data --tfidf --norm
/home/jjc/anaconda3/lib/python3.7/site-packa…
-
I could not find a proper "man page" on configuring other languages. The standard configuration for Omikuji is given as:
```
[omikuji-parabel-en]
name=Omikuji Parabel English
language=e…
```