This branch substitutes SVMs for n-grams. I've started from the end and I'm working my way backwards. First, I added code to database_ops.py that cheat-adds a table for the SVM data, based on the totals and data in svm.db. Now I'm moving backwards from begin.sh to confirm that the SVM data exists. I'm currently stuck on the totals table: the SVM totals aren't calculated the way the totals for sequence alignments and hapaxes are, so the check in the begin script will be more of an "is this data present in the SVM test set?" kind of query. Once that's done, I'll edit the SQL query to get the SVM data into the hairball graph in the Jupyter notebook, then regenerate the Cytoscape info to test that it works at any allowed weight. F1/weighting calculations are out of scope for this branch; that's the next one.
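A rough sketch of what the "is this data present?" check could look like in Python with sqlite3. The table name `svm_results`, its columns, and the helper `svm_data_present` are placeholders I made up for illustration; the real table that database_ops.py creates may be named and shaped differently, so treat this as a starting point rather than the actual implementation.

```python
import sqlite3


def svm_data_present(conn: sqlite3.Connection, table: str = "svm_results") -> bool:
    """Return True if `table` exists and holds at least one row.

    `svm_results` is a hypothetical table name, not taken from the real schema.
    """
    # Check the schema first so a missing table yields False, not an exception.
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        return False
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    row = conn.execute(f"SELECT 1 FROM {quoted} LIMIT 1").fetchone()
    return row is not None


if __name__ == "__main__":
    # Throwaway in-memory database standing in for svm.db.
    conn = sqlite3.connect(":memory:")
    print(svm_data_present(conn))  # table missing
    # Hypothetical columns, for illustration only.
    conn.execute("CREATE TABLE svm_results (source TEXT, target TEXT, weight REAL)")
    print(svm_data_present(conn))  # table exists but is empty
    conn.execute("INSERT INTO svm_results VALUES ('a', 'b', 0.9)")
    print(svm_data_present(conn))  # table has data
```

Because this only asks whether any row exists, it sidesteps the totals calculation entirely, which is the point: begin.sh only needs a yes/no answer before moving on.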