emaadmanzoor closed this issue 6 years ago
For prediction, that is correct. The decision function differs slightly because it also uses the alpha prior to smooth the scores a bit. The C++ code that does the same thing is in LDA.cpp, line 195.
The differences are highlighted in the following numpy pseudocode:
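# your code
# (assumes x_test is n_topics x n_documents and eta is n_topics x n_classes)
scores = x_test.T.dot(eta)
# smoothed version
x_test = x_test - alpha  # alpha: scalar or n_topics x 1 column; shapes could be wrong but you get the idea
x_test /= x_test.sum(axis=0, keepdims=True)  # normalize each document (column) to sum to 1
scores = x_test.T.dot(eta)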
Since alpha is the same for every document (and, in lda++, for every topic), the predictions won't change, but the scores will be smoothed a bit.
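To compare the two variants side by side, here is a minimal runnable NumPy sketch. It assumes documents are stored as columns (`n_topics x n_documents`) and uses random stand-in data. The helper names `raw_scores` and `smoothed_scores`, the shapes, and the value of `alpha` are illustrative assumptions, not part of lda++.

```python
import numpy as np

def raw_scores(x, eta):
    """Score unnormalized topic features.
    x: (n_topics, n_docs), eta: (n_topics, n_classes) -> (n_docs, n_classes)."""
    return x.T.dot(eta)

def smoothed_scores(x, eta, alpha):
    """Subtract the prior, renormalize each document (column), then score."""
    z = x - alpha
    z = z / z.sum(axis=0, keepdims=True)
    return z.T.dot(eta)

# Hypothetical stand-in data; a real run would use the model's transformed
# test documents and its learned eta instead.
rng = np.random.default_rng(0)
n_topics, n_docs, n_classes = 5, 8, 2
alpha = 0.1
x_test = alpha + rng.random((n_topics, n_docs)) * 10.0  # entries >= alpha
eta = rng.normal(size=(n_topics, n_classes))

plain = raw_scores(x_test, eta)
smooth = smoothed_scores(x_test, eta, alpha)

# Fraction of documents whose predicted class (argmax) is the same in both.
agreement = np.mean(plain.argmax(axis=1) == smooth.argmax(axis=1))
print(plain.shape, smooth.shape)
print("fraction of documents with the same predicted class:", agreement)
```

Printing the agreement is a quick way to check on your own data whether the two versions actually yield the same predictions.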
You could try both. You could also try training from scratch on the transformed training data using logistic regression or an SVM, which might give slightly better performance, since you can use regularization and other techniques to improve on plain logistic regression.
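As a rough illustration of training on the transformed data, the sketch below fits an L2-regularized logistic regression with plain gradient descent in NumPy. The topic proportions and labels are synthetic, and `fit_logistic`, its hyperparameters, and the train/test split are hypothetical choices made for the example. In practice a library classifier (for instance from scikit-learn) would be the more usual route.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(features, labels, l2=0.01, lr=1.0, n_iter=2000):
    """Binary logistic regression with an L2 penalty on the weights,
    fitted by batch gradient descent.
    features: (n_docs, n_features), labels: (n_docs,) with values 0.0 or 1.0."""
    n_docs, n_features = features.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(features.dot(w) + b)
        w -= lr * (features.T.dot(p - labels) / n_docs + l2 * w)
        b -= lr * np.mean(p - labels)
    return w, b

# Synthetic stand-in for transformed documents: one row of topic
# proportions per document. Real data would come from the trained model.
rng = np.random.default_rng(0)
topics = rng.dirichlet(np.ones(3), size=300)
labels = (topics[:, 0] > topics[:, 1]).astype(float)

train_x, test_x = topics[:200], topics[200:]
train_y, test_y = labels[:200], labels[200:]

w, b = fit_logistic(train_x, train_y)
test_pred = (sigmoid(test_x.dot(w) + b) > 0.5).astype(float)
accuracy = np.mean(test_pred == test_y)
print("held-out accuracy:", accuracy)
```

The `l2` argument is the regularization knob mentioned above; tuning it on a validation split is where the extra performance, if any, would come from.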
This works, thanks!
Hi, thank you for the nice code and documentation! I was able to install and train the fslda model on my dataset.
I would like to obtain predictions and construct an ROC curve using the trained model. I believe the following method is correct, and was wondering if you could double-check?