-
The [n-grams generation script](https://github.com/Tatoeba/Tatodetect/blob/master/tools/generate.py) is executed every week. Lately it has been consuming about 1.5 GB of RAM. While this causes no serious harm, …
-
Post your screenshots and discuss your findings about cac.txt here!
-
A rules file will be a list of rules (ordered?), loose BNF grammar (I need to review my language theory lessons...) (note that right now I don't specify how it's going to be written: XML, JSON, whatev…
-
nsaef updated
6 years ago
-
Stylometric analysis is well understood and shockingly powerful even using only simple features like bigrams and trigrams. I can't find the thread right now, but there are demos on HN where even small sa…
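
To make the bigram/trigram idea concrete, here is a rough sketch of how two texts could be compared by their character n-gram counts. This is not the method from the HN demos mentioned above; the function names (`char_ngrams`, `profile`, `cosine_similarity`) and the choice of cosine similarity are my own assumptions for illustration.

```python
from collections import Counter
import math


def char_ngrams(text, n):
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def profile(text):
    """Hypothetical 'style profile': bigram and trigram counts combined."""
    return char_ngrams(text, 2) + char_ngrams(text, 3)


def cosine_similarity(a, b):
    """Cosine similarity between two count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A score near 1 means the two texts use very similar character sequences, and a score near 0 means they share almost none. Real stylometry would likely need longer samples and some normalization, so treat this only as a starting point.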
-
Post your screenshots and discuss your findings about dc.txt here!
-
**Is your feature request related to a problem? Please describe.**
Certain types of documents, such as scientific publications, are often accompanied by a list of keywords that typically contai…
-
I've been reading your notes and examples about Markov chains. I like them a lot, and they have inspired me to build some things. Thank you for making them public! (And sorry for messing with your issu…
ghost updated
8 years ago