-
Hi, I'm not sure exactly where things fail, but it seems something is not multiprocessing safe (see minimal reproducible example below). May well be that this is a known limitation, so I just wanted t…
-
spaCy 2.1.1 - 2.1.6
Python 3.6
Ubuntu 18.04.1 LTS
Since spaCy 2.1, memory errors from C/C++ such as "free(): invalid next size (fast)" occur randomly, and no additional information is provided in…
-
#### Description
I am working with a pipeline that combines preprocessing modules such as Count Vectorizer and TF-IDF with a set of algorithms. Although it's working fine with the fo…
-
## How to reproduce the behavior
I am not sure it's a bug, but I just wanted to let you know that the lemma of the following words: 'car', 'carriers', 'scar', (SIM) 'card' is always 'car'. Is it co…
-
I'm attempting to use the various UD_English corpora to test the performance of some open-source lemmatizers but I'm finding the data has a ton of errors in it. I notice this especially with ADJ type…
-
## How to reproduce the behaviour
```
>>> import spacy
>>> nlp = spacy.load('es')
>>> sentence = "sencillo como habrían deseado."
>>> doc = nlp(sentence)
>>> for token in doc:
... print(tok…
-
We should plot training corpus size vs. devset accuracy and see whether the curve already flattens out.
In order to use the 20k miles under the sea corpus, we'd need to make the companion data ourselves…
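
The flattening check mentioned above could be sketched roughly as follows. This is only an illustration: the function names, the tolerance, and the `(size, accuracy)` pairs are hypothetical placeholders, not measurements from this project. It measures the accuracy gain per doubling of the training corpus and treats the curve as flat once the latest gain drops below a threshold.

```python
import math


def marginal_gains(points):
    """Accuracy gain per doubling of corpus size between consecutive points.

    `points` is a list of (training_size, devset_accuracy) pairs.
    """
    pts = sorted(points)
    gains = []
    for (n0, a0), (n1, a1) in zip(pts, pts[1:]):
        doublings = math.log2(n1 / n0)  # how many times the corpus doubled
        gains.append((a1 - a0) / doublings)
    return gains


def has_flattened(points, tol=0.005):
    """True if the most recent gain per doubling is below `tol` (assumed threshold)."""
    gains = marginal_gains(points)
    return bool(gains) and gains[-1] < tol


# Hypothetical, made-up numbers purely to exercise the functions.
curve = [(1000, 0.80), (2000, 0.86), (4000, 0.89), (8000, 0.892)]
print(marginal_gains(curve))
print(has_flattened(curve))
```

If the last gain is still well above the tolerance, adding more training data (for example a new corpus) is more likely to pay off.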
-
```
(gdb) run /home/proycon/work/glem/glem/glem.py -f greek.txt
Starting program: /data2/dev/bin/python3 /home/proycon/work/glem/glem/glem.py -f greek.txt
[Thread debugging using libthread_db enab…
-
I am in the process of evaluating Manticore for migrating from Sphinx, but I am having trouble getting my data indexed. The configuration has been used with Sphinx for years without any issue, so ei…
-
Although the article states that SentiWordNet only has definitions for unigrams, we do not even have that step ready yet. We would need to have this ready in order to move forward using the strategy with b…