-
This week we met and discussed the good progress Shane is making with topic modeling, which will definitely be a helpful part of our analysis and seems to be not only progressing extremely well but al…
pjc52 updated
2 years ago
-
The original WikiConv paper says that conversations in Chinese were collected. Are these available through ConvoKit?
-
When I attempt to convert the output from mp_corpus() with coded manifestos into a Quanteda object, the quasi-sentences are not split into separate documents in the Quanteda corpus, as described i…
-
I am running a sentiment analysis on a large corpus of tweets in R. VADER successfully returned sentiment scores for all but five tweets, which returned 'ERROR' in the word scores field. Upon inspecti…
-
Hello,
I have been trying out the library and it works nicely. The main problems are related to the sentiment not always being correct and, more importantly, that the language guesser fails pretty …
-
I will add the previous year's corpora for the author masking task. These corpora contain 205 problems in English for the author obfuscation task from 2016.
Besides that, I will add the author verification…
-
In several functions, we have doctests that are effectively info-dumping a large and complex dictionary. This might be fine for internal tests, but we should simplify the doctests for user readability…
-
I was running a local experiment using the fuzzer `aflplusplus` and the benchmarks `curl_curl_fuzzer_http` and `bloaty_fuzz_target`.
I passed `make presubmit` after installing `qtbase-dev5`, mentioned in this…
-
A new Chuvash grammar textbook is being prepared on the basis of a 3M+ word corpus and our morphological analysis. The author is asking for a composite output of modes chv-morph and chv-segment in whi…
-
This is not immediately obvious:
* They could be actual issues
* They could be false reports by the checker
* They could be caused by wrongly configured search paths for the checker
```
[INFO] …