Closed Philipp-Sc closed 1 year ago
Instead of predicting all topics at once (where the predictions sum to 1), predict binary topic pairs, e.g. ["hot", "cold"].
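A minimal sketch of the difference, assuming per-topic raw scores (logits) are already available. The function names, the example topics other than "hot"/"cold", and the logit values are hypothetical and only illustrate the idea; in practice each pair may need its own model call.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_all_at_once(logits_by_topic):
    # All topics compete with each other: probabilities sum to 1 overall.
    topics = list(logits_by_topic)
    probs = softmax([logits_by_topic[t] for t in topics])
    return dict(zip(topics, probs))

def predict_pairs(logits_by_topic, pairs):
    # Each pair is scored on its own: probabilities sum to 1 within a pair.
    result = {}
    for a, b in pairs:
        p_a, p_b = softmax([logits_by_topic[a], logits_by_topic[b]])
        result[(a, b)] = {a: p_a, b: p_b}
    return result

# Hypothetical logits, for illustration only.
logits = {"hot": 2.0, "cold": 0.0, "spam": 1.5, "ham": 1.0}
joint = predict_all_at_once(logits)
paired = predict_pairs(logits, [("hot", "cold"), ("spam", "ham")])
```

With the joint prediction, a strong unrelated topic lowers the score of every other topic; with pairs, "hot" is only compared against "cold".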
The new technique performs better. A potential drawback is that a larger number of topics might increase inference time and make it too slow on CPU-only systems.
Note: This should be a good way to reduce false positives, since the model has not yet seen much ham (or spam) data for governance proposals.
Note: Consider reducing the ham dataset by filtering out some of the rejected proposals that received a high share of votes against, to avoid training on likely spam as ham.
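A rough sketch of such a filter. The field names (`status`, `votes_for`, `votes_against`), the 0.5 threshold, and the sample records are assumptions for illustration, not the project's actual data format.

```python
def filter_ham(proposals, max_against_share=0.5):
    # Drop rejected proposals whose share of "against" votes exceeds the
    # threshold; they are likely spam and should not be labeled as ham.
    kept = []
    for p in proposals:
        total = p["votes_for"] + p["votes_against"]
        against_share = p["votes_against"] / total if total else 0.0
        if p["status"] == "rejected" and against_share > max_against_share:
            continue
        kept.append(p)
    return kept

# Hypothetical records, for illustration only.
sample = [
    {"id": 1, "status": "passed", "votes_for": 90, "votes_against": 10},
    {"id": 2, "status": "rejected", "votes_for": 5, "votes_against": 95},
    {"id": 3, "status": "rejected", "votes_for": 60, "votes_against": 40},
]
ham = filter_ham(sample)
kept_ids = [p["id"] for p in ham]
```

Here proposal 2 is dropped, while proposal 3 stays because it was rejected without a high share of votes against.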