tarahmarie / dh-trace

MIT License

Store coefficients from svms #45

Closed: jdmartin closed this 1 week ago

jdmartin commented 2 weeks ago

So, this is the very definition of a silly thing. I wouldn't bother merging it unless it really serves a need. It will make the databases larger.

Anyway, one of the things I realized we could save from the SVM runs is the set of coefficients: the tokens and their relative weights within the model. So, I did. I modified explore-svm.py in a really rudimentary way so that they can be poked at. You can also ask it to output a CSV with all of the coefs and values.

Again, super basic and probably useless. Just like me. ;)
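For readers following along, here is a rough sketch of what storing and exporting the coefficients could look like. It is not the code from this PR. The table name `svm_coefficients`, its columns, and the function names are all hypothetical, and it assumes the token/weight pairs have already been pulled out of the fitted model (for a linear scikit-learn SVM that would typically mean pairing the vectorizer's `get_feature_names_out()` with the model's `coef_`).

```python
import csv
import io
import sqlite3


def store_coefficients(conn, model_name, coefs):
    """Save token -> weight pairs for one model run (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS svm_coefficients ("
        "model TEXT, token TEXT, weight REAL, "
        "PRIMARY KEY (model, token))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO svm_coefficients VALUES (?, ?, ?)",
        [(model_name, token, float(weight)) for token, weight in coefs.items()],
    )
    conn.commit()


def top_coefficients(conn, model_name, n=10):
    """Return the n tokens with the largest absolute weight."""
    return conn.execute(
        "SELECT token, weight FROM svm_coefficients WHERE model = ? "
        "ORDER BY ABS(weight) DESC, token LIMIT ?",
        (model_name, n),
    ).fetchall()


def export_csv(conn, model_name, out):
    """Write every stored coefficient for a model to a file-like object."""
    rows = conn.execute(
        "SELECT token, weight FROM svm_coefficients WHERE model = ? "
        "ORDER BY ABS(weight) DESC, token",
        (model_name,),
    ).fetchall()
    writer = csv.writer(out)
    writer.writerow(["token", "weight"])
    writer.writerows(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Made-up weights, purely for illustration.
    store_coefficients(conn, "demo", {"the": 0.1, "whale": -2.5, "sea": 1.25})
    print(top_coefficients(conn, "demo", n=2))
    buffer = io.StringIO()
    export_csv(conn, "demo", buffer)
    print(buffer.getvalue())
```

One row per (model, token) is what makes the databases larger: the table grows with vocabulary size times the number of models stored.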

tarahmarie commented 2 weeks ago

Looks like I have to rerun the begin script to get the new svm-coefficients db table; will report back once that's complete.

tarahmarie commented 2 weeks ago

Very cool to see the coefficients.

[Screenshot 2024-07-03 at 12 39 46 (2)]

tarahmarie commented 2 weeks ago

The coefficients are going to be interesting to poke through.

What I'm trying to figure out is how to get the SVM values associated with the correct text pairs so that I can swap out ngrams and SVMs. Is it possible to start with the DB and just rename everything? I need to get to a point where I have a dataframe of a random set of text pairs from eltec-100 so I can start doing the logistic regression, and that's what the other PR is about. I think you've already generated all the relevant information, and we are going to meet in the middle somewhere in the code.
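One possible way to line the SVM values up with the right text pairs is to join on a shared pair identifier and then draw a reproducible random sample. The sketch below is only an illustration under assumed names: the tables `text_pairs`, `ngram_scores`, and `svm_scores`, their columns, and the function name are all hypothetical and will not match the project's real schema.

```python
import random
import sqlite3

# Hypothetical schema: every table shares a pair_id column.
PAIR_QUERY = (
    "SELECT p.pair_id, p.source_text, p.target_text, "
    "n.ngram_score, s.svm_score "
    "FROM text_pairs AS p "
    "JOIN ngram_scores AS n ON n.pair_id = p.pair_id "
    "JOIN svm_scores AS s ON s.pair_id = p.pair_id "
    "ORDER BY p.pair_id"
)


def sample_pairs_with_scores(conn, n, seed=0):
    """Return up to n random pairs that have both an ngram and an SVM score."""
    rows = conn.execute(PAIR_QUERY).fetchall()
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(rows, min(n, len(rows)))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE text_pairs (pair_id INTEGER PRIMARY KEY, "
        "source_text TEXT, target_text TEXT);"
        "CREATE TABLE ngram_scores (pair_id INTEGER, ngram_score REAL);"
        "CREATE TABLE svm_scores (pair_id INTEGER, svm_score REAL);"
    )
    # Made-up rows, purely for illustration.
    for i in range(1, 6):
        conn.execute(
            "INSERT INTO text_pairs VALUES (?, ?, ?)", (i, f"a{i}", f"b{i}")
        )
        conn.execute("INSERT INTO ngram_scores VALUES (?, ?)", (i, i / 10))
        conn.execute("INSERT INTO svm_scores VALUES (?, ?)", (i, 1 - i / 10))
    for row in sample_pairs_with_scores(conn, n=3, seed=42):
        print(row)
```

The inner joins drop any pair that lacks either score, so each sampled row carries both values side by side. If pandas is available, passing the same query and connection to `pandas.read_sql_query` should give a dataframe directly, which could then be sampled with `DataFrame.sample`.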