bellecarrell closed 5 years ago
prominent clusters: health, arts (photography, music, writing, etc.)
Got exemplar accounts for health and arts, and updated the HTML and the generate-HIT script. Will be getting the sample later.
The 400-sample CSV is in the repo.
Moving kappa to another issue. Closing.
[x] analyze other_text results and add prominent clusters to HIT
[x] make 400-sample HIT CSV
[ ] kappa IAA (inter-annotator agreement)
Another quantity we will want to report is a (better) measure of inter-annotator agreement over all workers, Fleiss' kappa: https://en.wikipedia.org/wiki/Fleiss%27_kappa
This is an implementation I modified in the past: https://gist.github.com/ShinNoNoir/4749548. It is more robust to things like high baseline accuracy. Kappa statistics are nicer than simple % agreement because they control for things like the baseline distribution over classes.
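For reference, here is a minimal sketch of the standard Fleiss' kappa formula in plain Python. It is not the code from the gist above. The names `fleiss_kappa` and `counts` are made up for illustration, and it assumes every item was rated by the same number of workers, which may not hold for our HIT data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of workers who put item i in category j.
    Assumption: every row sums to the same number of workers (>= 2).
    """
    n_items = len(counts)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two ratings per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fell in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]

    # Per-item agreement: fraction of worker pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_item) / n_items      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance

    if p_exp == 1.0:
        # All ratings are in one category; kappa is undefined here.
        return float("nan")
    return (p_bar - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Toy table: 4 items, 3 workers, 2 categories (made-up numbers).
    toy = [[3, 0], [2, 1], [0, 3], [1, 2]]
    print(fleiss_kappa(toy))
```

To use it on the HIT results, the per-worker labels would first need to be pivoted into one row of category counts per account.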
You should look at Cohen's kappa first: https://en.wikipedia.org/wiki/Cohen%27s_kappa. Fleiss' kappa is a generalization to more than 2 annotators.
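A small sketch of Cohen's kappa for two annotators, to show why it differs from raw % agreement. The function name `cohen_kappa` and the `promo` / `other` labels are placeholders I made up, not our actual HIT categories.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items where the two labels match.
    p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_exp = sum(freq_a[lab] * freq_b[lab] for lab in set(freq_a) | set(freq_b)) / (n * n)

    if p_exp == 1.0:
        # Both annotators used a single identical label; kappa is undefined.
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


if __name__ == "__main__":
    # Made-up example with a skewed class distribution: 20 items,
    # the annotators agree on 18 "other" and disagree on the 2 "promo" ones.
    a = ["other"] * 18 + ["promo", "other"]
    b = ["other"] * 18 + ["other", "promo"]
    agreement = sum(x == y for x, y in zip(a, b)) / len(a)
    print("raw agreement:", agreement)
    print("cohen kappa:", cohen_kappa(a, b))
```

In the toy example raw agreement is 90%, but kappa comes out slightly below zero, because two annotators who almost always say "other" would agree that often by chance.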
Yeah, if you could summarize what our recall of promotional users was, and the distribution over categories, that would be good.