twitter / communitynotes

Documentation and source code powering Twitter's Community Notes
https://twitter.github.io/communitynotes
Apache License 2.0
1.42k stars 196 forks

Add tag-consensus harassment/abuse tag note scoring + reputation filtering #162

jbaxter closed 10 months ago

jbaxter commented 10 months ago

- After the first-phase matrix factorization, compute a new matrix factorization that uses the harassment/abuse tag as the label instead of the usual overall helpfulness rating.
- Add a large rater reputation penalty for raters who have rated as helpful any note with an extremely high harassment/abuse score.
- Support BCEWithLogits loss and pos_weight so that the imbalanced binary matrix factorization trains well.
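For readers unfamiliar with the last item, here is a minimal pure-Python sketch of a binary matrix factorization trained with a positively weighted logistic loss. It is not the code from this PR. The loss follows the formula that PyTorch documents for `torch.nn.BCEWithLogitsLoss` with `pos_weight`. Everything else is an assumption made for illustration: the helper names (`bce_with_logits`, `bce_grad`, `fit_tag_mf`), the toy ratings, the one-dimensional factors, the use of the note intercept as the tag score, and the choice of `pos_weight` as the ratio of negatives to positives.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(logit, label, pos_weight=1.0):
    """Binary cross-entropy on a raw logit; positives are scaled by pos_weight."""
    return pos_weight * label * softplus(-logit) + (1.0 - label) * softplus(logit)


def bce_grad(logit, label, pos_weight=1.0):
    """Derivative of bce_with_logits with respect to the logit."""
    s = sigmoid(logit)
    return -pos_weight * label * (1.0 - s) + (1.0 - label) * s


def fit_tag_mf(ratings, n_raters, n_notes, pos_weight, lr=0.1, epochs=300, seed=0):
    """Hypothetical helper: full-batch gradient descent on
    logit = mu + rater_b[u] + note_b[n] + rater_f[u] * note_f[n]."""
    rng = random.Random(seed)
    mu = 0.0
    rater_b = [0.0] * n_raters
    note_b = [0.0] * n_notes
    rater_f = [rng.uniform(-0.1, 0.1) for _ in range(n_raters)]
    note_f = [rng.uniform(-0.1, 0.1) for _ in range(n_notes)]

    def logit(u, n):
        return mu + rater_b[u] + note_b[n] + rater_f[u] * note_f[n]

    def mean_loss():
        total = sum(bce_with_logits(logit(u, n), y, pos_weight) for u, n, y in ratings)
        return total / len(ratings)

    initial = mean_loss()
    for _ in range(epochs):
        g_mu = 0.0
        g_rb = [0.0] * n_raters
        g_nb = [0.0] * n_notes
        g_rf = [0.0] * n_raters
        g_nf = [0.0] * n_notes
        # Accumulate the gradient of the mean loss over all ratings.
        for u, n, y in ratings:
            g = bce_grad(logit(u, n), y, pos_weight) / len(ratings)
            g_mu += g
            g_rb[u] += g
            g_nb[n] += g
            g_rf[u] += g * note_f[n]
            g_nf[n] += g * rater_f[u]
        # Apply one gradient-descent step to every parameter.
        mu -= lr * g_mu
        for u in range(n_raters):
            rater_b[u] -= lr * g_rb[u]
            rater_f[u] -= lr * g_rf[u]
        for n in range(n_notes):
            note_b[n] -= lr * g_nb[n]
            note_f[n] -= lr * g_nf[n]
    return note_b, initial, mean_loss()


# Toy data (made up): (rater, note, label), label 1 = rater applied the tag.
# Only note 0 receives the tag, so positives are the minority class.
ratings = [(u, n, 1.0 if n == 0 else 0.0) for n in range(4) for u in range(3)]
n_pos = sum(1 for _, _, y in ratings if y == 1.0)
pos_weight = (len(ratings) - n_pos) / n_pos  # assumption: negatives / positives

scores, initial_loss, final_loss = fit_tag_mf(ratings, 3, 4, pos_weight)
print("note intercepts:", [round(s, 3) for s in scores])
print("mean loss before/after:", round(initial_loss, 3), round(final_loss, 3))
```

With `pos_weight` above 1, each positive (tagged) rating contributes more to the loss than a negative one, which keeps the rare positives from being swamped when the tag is applied to only a small share of ratings.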