Closed JBGruber closed 4 years ago
Fantastic! It will take me a few days to review this (grading now...) but could I make a request? This really belongs in quanteda.textmodels. We're going to keep the classifiers package for mostly internal usage. The SVM code, for instance, has already been moved to quanteda.textmodels.
Ah, I was wondering about this since many functions have been moved there yet this repo is also still active. I don't mind making a new (identical) PR in quanteda.textmodels.
Thanks @JBGruber that would be great. We're keeping the keras stuff in here since it remains experimental. I'll soon add a note to this effect.
Ok, I copied the functions and tests over to quanteda.textmodels and created a PR there: https://github.com/quanteda/quanteda.textmodels/pull/25
This PR implements a logistic regression classifier for 2 and >2 classes as discussed in #14 and #20. I finally got the tests to work after I had some issues with the changes for quanteda 2.0 the last time I tried. Now the tests work locally but Travis still seems to have the same problems as in #23.
Below I put together a short demo of the functions. Let me know what you think and what needs to be changed.
two classes (binomial)
The coefficients are a dgCMatrix in this case, which I think makes sense given that normally most values will be 0.
Maybe it would make sense to show in an example that the coefficients can be used to show which words are most important for the classifier.
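As a rough sketch of that idea (not the demo from this PR), the sparse coefficients could be turned into a ranked word list as below. It assumes the function ends up as `textmodel_lr()` in quanteda.textmodels, that `coef()` returns a one-column dgCMatrix with features as row names, and that the `data_corpus_moviereviews` corpus with its `sentiment` docvar is available. The trimming threshold is an arbitrary choice to keep fitting fast.

```r
library(quanteda)
library(quanteda.textmodels)

# Assumption: movie review corpus with a two-level `sentiment` docvar
corp <- data_corpus_moviereviews
dfmat <- dfm(tokens(corp, remove_punct = TRUE))
dfmat <- dfm_trim(dfmat, min_termfreq = 10)  # arbitrary threshold

# Fit the binomial model (assumed name and signature: textmodel_lr(x, y))
tmod <- textmodel_lr(dfmat, y = corp$sentiment)

# coef() is assumed to return a sparse dgCMatrix with one column
co <- coef(tmod)
weights <- co[, 1]  # drops to a named numeric vector

# Keep only the features the model actually uses
weights <- weights[names(weights) != "(Intercept)"]
weights <- weights[weights != 0]

# Words pushing hardest towards each of the two classes
head(sort(weights, decreasing = TRUE), 10)
head(sort(weights), 10)
```

Which class the positive weights point to depends on the factor level order of `y`, so it is worth checking `levels(corp$sentiment)` before interpreting the two lists.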
more than two classes (multinomial)
For multinomial classification, coefficients appear side by side: