-
Ok - so what I've been working on is a way to check the accuracy of the current rspamd rules, symbols, and plugins against a corpus of hand-sorted spam and non-spam.
This would help when developing ne…
-
The Programming Historian has received the following tutorial on 'Analyzing Documents with Tf-idf' by @mjlavin80 . This lesson is now under review and can be read at:
http://programminghistorian.gi…
-
In reviewing performance profiling I'm finding that get_Elogbeta is taking some time.
I notice that there are 2 calls to this function, one after the other, in LdaModel.do_mstep.
The first call is o…
-
I'm trying to train a CNN model to compare with the fairseq-py default model. The training configuration is as follows:
-gpuid 1
-encoder_type cnn
-decoder_type cnn
-enc_layers 4
-dec_layers…
-
Related to #4.
-
![1](https://user-images.githubusercontent.com/16164105/49235348-39174500-f3fa-11e8-84a7-f3fb6707b3a7.png)
-
I recently used `ocrmypdf` to mass-OCR my PDFs and a bunch of DjVu files I converted to PDF (which strips the original Tesseract OCR so I needed some way to restore it). Worked very nicely, and I like…
-
In https://github.com/quanteda/quanteda/issues/1568#issuecomment-457757320 @koheiw proposes splitting `textstat_readability()` and `textstat_lexdiv()` into a separate package or packages. This creates…
-
What is the used ANNIS version?
3.4.2
What browser and operating system did you use?
Firefox 45, Ubuntu 15.0
What steps will reproduce the problem?
1. Search for a complex query with a lot of domina…
-
I'm not sure where a good place to discuss this would be, but I think some of the OpenML100 tasks are too easy.
Why? I looked at the first *ten* runs on each of the tasks and checked the best accuracy and A…