-
This task consists of using TreeTagger to normalize the text sent to the Annotator, and therefore also to normalize the content of the dictionary.
This task is divided into 3 specific iss…
-
Re https://github.com/jwzimmer/aboutvsof/issues/1#issuecomment-758165911, there may be a pretty simple way to tell conspiracy theory speech (I am not sure what to even call this... I mean "nonsense"?)…
-
Most of the time the naming is like the following:
-
```
library(udpipe)
library(igraph)
library(ggraph)
library(ggplot2)
plot_annotation
```
-
Each of the different exploratory directions uses a lot of manual regex-based data cleaning.
We should consolidate that code into a utils file and use consistent preprocessing across all our dif…
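
One possible shape for such a utils file is sketched below. It is only an illustration: the function name `clean_text` and the three regex rules are hypothetical placeholders, not the patterns actually used in the existing explorations, which would have to be collected from each of them.

```python
import re

# Hypothetical cleaning rules (assumptions for illustration only);
# the real patterns would be gathered from the individual explorations.
_URL_RE = re.compile(r"https?://\S+")      # drop URLs
_NON_WORD_RE = re.compile(r"[^\w\s']")     # drop punctuation, keep apostrophes
_WHITESPACE_RE = re.compile(r"\s+")        # collapse runs of whitespace


def clean_text(text: str, lowercase: bool = True) -> str:
    """Apply the shared cleaning steps in one fixed order."""
    text = _URL_RE.sub(" ", text)
    text = _NON_WORD_RE.sub(" ", text)
    text = _WHITESPACE_RE.sub(" ", text).strip()
    return text.lower() if lowercase else text
```

Each exploration would then import `clean_text` instead of carrying its own regexes, so a change to a rule is made once and applies everywhere.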
-
When (e.g.) **Strong's numbers** are enabled in module settings, the numbers are closer to the next line than to the line they actually belong to.
This is misleading to some extent.
Enabling **D…
-
We need to train treeler in a way that matches the way Freeling emits the tokens, in terms of tokenization, lemmatization, and POS tagging. Otherwise we will train for something that will never appe…
-
### Problem
Whilst the original language checker is absolutely brilliant, it fails on short ciphertexts and on those with high entropy. An AI solution would be cool, but would be a bit OTT for rigid dat…
-
From what I can tell, the form object in Scrubber is not submitted at all, so it is just running defaults. This is obvious if you uncheck "Make Lowercase". It also explains issue #1011 and issue #1005…
-
## How to reproduce the behaviour
```python
import spacy
from spacy.matcher import Matcher
nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)
pattern = [{"LEMMA": "learn"}]
mat…
```