Computational-Content-Analysis-2018 / 19-Jan-1-General-purpose-computer-assisted-clustering-and-conceptualization

Grimmer, Justin and Gary King. 2011. “General purpose computer-assisted clustering and conceptualization.” PNAS (Feb. 3).

Checking Quality of Clustering/General Content Analysis #4

TimothyElder commented 6 years ago

In this article, the authors mention using human judges to check the quality of the clustering methods they propose. A sample of documents from within and across clusters is taken, and the judges then evaluate how well the algorithm grouped them by similarity/dissimilarity. Considering the volume of data that can be analyzed with computational methods, using such a technique for quality control is impossible, and if human evaluation is the only means of ensuring that the computation represents vast data with fidelity, that would seem to undermine the epistemic power of these methods. Do computer scientists or computational social scientists have an answer to this worry?
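
For concreteness, the evaluation procedure described above might be sketched roughly as follows. This is only an illustration under assumptions, not the authors' implementation: the function names `sample_pairs` and `cluster_quality`, the framing in terms of document pairs, the 1-to-3 similarity scale, and the example ratings are all hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def sample_pairs(labels, n_pairs, seed=0):
    """Draw up to n_pairs within-cluster and up to n_pairs across-cluster pairs.

    labels[i] is the cluster assigned to document i (hypothetical input format).
    """
    rng = random.Random(seed)
    all_pairs = list(combinations(range(len(labels)), 2))
    within = [(i, j) for i, j in all_pairs if labels[i] == labels[j]]
    across = [(i, j) for i, j in all_pairs if labels[i] != labels[j]]
    return (
        rng.sample(within, min(n_pairs, len(within))),
        rng.sample(across, min(n_pairs, len(across))),
    )


def cluster_quality(within_ratings, across_ratings):
    """Mean judged similarity within clusters minus mean across clusters."""
    return mean(within_ratings) - mean(across_ratings)


# Toy clustering of eight documents into three clusters.
labels = [0, 0, 0, 1, 1, 2, 2, 2]
within, across = sample_pairs(labels, n_pairs=3)

# Made-up judge ratings (1 = unrelated, 3 = closely related), one per sampled pair.
within_ratings = [3, 3, 2]
across_ratings = [1, 1, 2]
print(cluster_quality(within_ratings, across_ratings))
```

A larger positive score would mean the judges found same-cluster documents more similar than different-cluster ones. Enumerating every pair is quadratic in the number of documents, so this form only suits small examples.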

sunnyjooey commented 6 years ago

That's a great point! I imagine that you use human validation (because no other method exists) to assess how "good" the algorithm is on one body of text, and then you apply the algorithm to another body of text for classification. You can then be somewhat confident that the second classification is also "good". Or at least I think that's the gist of it.