Run Existing t-CRF model over gathered Tweet data to assess which would be the best to label

scramblingbalam / Alta_Real

The Python back end of a data product that aims to get at the truthiness of Trump's tweets. It uses PyStruct.EdgeFeatureGraphCRF to leverage the tree structure of Twitter replies to gauge veracity by the amount of support or denial that his tweets elicit.

MIT License

Run Existing t-CRF model over gathered Tweet data to assess which would be the best to label #44

Closed scramblingbalam closed 7 years ago

scramblingbalam commented 7 years ago

Run the model over the collected tweet threads to assess which threads are likely to contain the most deny or support tweets, which are the primary interest of this research.

Note: Since the current model does a poor job of predicting the "deny" class, making this useful may require subsetting the data and retraining, or training another ML model. That work isn't included in the estimate.
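A minimal sketch of the ranking step described above, assuming the model's output has already been collected into a plain dict mapping each thread id to its list of predicted labels. The function name `rank_threads`, the dict layout, and the label spellings are assumptions made for illustration, not code from this repo.

```python
from collections import Counter

# Labels of primary interest. The spellings are an assumption and should
# be changed to match whatever label strings the model actually emits.
TARGET_LABELS = frozenset({"supporting", "denying"})


def rank_threads(predictions, target_labels=TARGET_LABELS):
    """Rank threads by how many tweets were predicted as a target label.

    `predictions` is a hypothetical dict: thread id -> list of predicted
    labels, one label per tweet in that thread.
    Returns a list of (thread_id, target_count, thread_size) tuples.
    """
    scored = []
    for thread_id, labels in predictions.items():
        counts = Counter(labels)
        hits = sum(counts[label] for label in target_labels)
        scored.append((thread_id, hits, len(labels)))
    # Most target-label tweets first; ties are broken by thread id so the
    # ordering is deterministic.
    return sorted(scored, key=lambda row: (-row[1], row[0]))
```

Because the current model rarely predicts the "deny" class, a ranking like this would in practice be driven almost entirely by predicted support tweets until the model is retrained.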

scramblingbalam commented 7 years ago

Ran the model with the training data and got:

| | precision | recall | f1 | n |
|---|---|---|---|---|
| supporting | 0.618 | 0.491 | 0.547 | 841 |
| denying | 0.059 | 0.003 | 0.006 | 333 |
| appeal-for-more-information | 0.483 | 0.418 | 0.448 | 330 |
| comment | 0.733 | 0.876 | 0.798 | 2734 |
| Macroevaluation | 0.473 | 0.447 | 0.450 | 4238 |
| Microevaluation | 0.695 | 0.695 | 0.695 | 4238 |
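As a sanity check on the figures above, the macro row can be reproduced as the unweighted mean of the four per-class rows. The sketch below hard-codes the reported numbers; the names `PER_CLASS` and `macro_average` are illustrative and not part of the repo's evaluation code.

```python
# Per-class results copied from the output above:
# label -> (precision, recall, f1, n)
PER_CLASS = {
    "supporting": (0.618, 0.491, 0.547, 841),
    "denying": (0.059, 0.003, 0.006, 333),
    "appeal-for-more-information": (0.483, 0.418, 0.448, 330),
    "comment": (0.733, 0.876, 0.798, 2734),
}


def macro_average(rows):
    """Unweighted mean of precision, recall and f1 across classes."""
    n_classes = len(rows)
    sums = [sum(row[i] for row in rows.values()) for i in range(3)]
    return tuple(round(total / n_classes, 3) for total in sums)


def total_support(rows):
    """Total number of evaluated tweets across all classes."""
    return sum(row[3] for row in rows.values())


macro_precision, macro_recall, macro_f1 = macro_average(PER_CLASS)
print(macro_precision, macro_recall, macro_f1, total_support(PER_CLASS))
```

The macro average weights every class equally, so the near-zero scores for "denying" pull it well below the micro scores, which are dominated by the much larger "comment" class. With exactly one label per tweet, micro precision, recall and f1 all equal accuracy, which is consistent with the three identical 0.695 values.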