rodekruis/social-media-listening


Measure the quality of automated classification #161

Closed: jmargutt closed this issue 3 months ago

jmargutt commented 4 months ago

Context

We're currently trying to understand which version of SML is fit for scale. In the current version of SML, labels were generated manually. From discussion between product and technical, it is clear that the product would benefit from an automated definition of the output labels. The starting point will be the IFRC CEA Coding Framework (IFRC CF), to be complemented at a later stage with topic modeling.

Tasks

*Since our "test set" consists of messages labeled with our current labels, we can only test the model on the part of the IFRC CF that overlaps with those labels (see the sketch below).
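For illustration, a minimal sketch of how that overlap could be enforced during evaluation, assuming the messages live in a pandas DataFrame; `LABEL_MAP` and all label names below are hypothetical, not the actual framework categories:

```python
import pandas as pd

# Hypothetical mapping from current 510-Ukraine labels to IFRC CF labels;
# only categories present in both frameworks are listed.
LABEL_MAP = {
    "cash_assistance": "cash_and_vouchers",
    "health_services": "health",
    "shelter": "shelter_and_housing",
}

def overlap_test_set(df: pd.DataFrame, label_col: str = "label") -> pd.DataFrame:
    """Keep only messages whose manual label maps to an IFRC CF label,
    and rewrite the label column into IFRC CF terms."""
    subset = df[df[label_col].isin(LABEL_MAP)].copy()
    subset[label_col] = subset[label_col].map(LABEL_MAP)
    return subset
```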

Next Steps

  1. Decide on a standard quality benchmark (e.g. a minimum accuracy) that serves as the "definition of good" for our model + labels; a possible baseline sketch follows this list
  2. Validate the quality metrics using the full IFRC CF
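One common floor for such a benchmark (an assumption on our side, not something decided in this issue) is that the model must beat the majority-class baseline, i.e. always predicting the most frequent label:

```python
from collections import Counter

def majority_baseline_accuracy(labels: list[str]) -> float:
    """Accuracy obtained by always predicting the most common label;
    any useful classifier should clearly exceed this."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)
```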
jmargutt commented 4 months ago

[Figure: classification accuracy vs. number of labeled training examples, 510-Ukraine coding framework]

[Figure: classification accuracy vs. number of labeled training examples, IFRC coding framework]

@ibadyal here's the classification accuracy for both our coding framework (510-Ukraine) and the IFRC one. Accuracy is defined as the percentage of messages classified with the correct label. Results are shown as a function of the number of manually labeled examples the model is allowed to learn from: with this information, we can decide how much manual work we (or the NS) will need to reach a given level of accuracy.
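For reference, a minimal sketch of this kind of learning-curve experiment, using a simple TF-IDF + logistic regression classifier as a stand-in; the actual SML model, data, and function names here are assumptions, not the code behind the plots above:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

def learning_curve(train_texts, train_labels, test_texts, test_labels,
                   sizes=(50, 100, 200, 500, 1000)):
    """Return {n_examples: accuracy}: train on the first n manually labeled
    examples, evaluate on a fixed held-out set, repeat for increasing n."""
    results = {}
    for n in sizes:
        if n > len(train_texts):
            break
        model = make_pipeline(TfidfVectorizer(),
                              LogisticRegression(max_iter=1000))
        model.fit(train_texts[:n], train_labels[:n])
        preds = model.predict(test_texts)
        results[n] = accuracy_score(test_labels, preds)
    return results
```

Plotting the returned dictionary (accuracy against n) gives curves of the same shape as the figures above, which is what lets us estimate the manual labeling effort needed to reach a target accuracy.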

Notes: