Manning, Christopher, Prabhakar Raghavan and Hinrich Schütze. 2008. “Flat Clustering” and “Hierarchical Clustering.” Chapters 16 and 17 from Introduction to Information Retrieval.
To be able to do clustering, we need a distance measure.
In last week's assignment, we were introduced to several distance measures. However, combining those distance measures with the different clustering parameters (for example, the number of clusters for some algorithms) gives us a whole matrix of options.
How do researchers commonly choose among these options? Do they first settle on a distance measure? Or do they generate the full matrix of results and then pick the combination that they feel produces the best clusters?
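To make the "matrix of options" concrete, here is a minimal sketch of the second approach: run every combination of distance measure and number of clusters, then score each result. Everything in it is an assumption for illustration rather than something from the readings or the assignment: the toy points, the three distance functions, the single-link agglomerative clustering, and the use of the mean silhouette coefficient as the score. Silhouette values are computed in each measure's own geometry, so comparing them across distance measures is only a rough guide.

```python
import math
from itertools import product

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def single_link_clusters(points, dist, k):
    """Agglomerative clustering with single linkage, stopped at k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # i < j, so index i is unaffected by the pop
    return clusters

def silhouette(points, clusters, dist):
    """Mean silhouette coefficient (requires k >= 2); singletons score 0."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other members of p's own cluster
            a = sum(dist(points[p], points[q])
                    for q in cluster if q != p) / (len(cluster) - 1)
            # b: mean distance to the nearest other cluster
            b = min(sum(dist(points[p], points[q]) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two visually obvious groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
distances = {"euclidean": euclidean,
             "manhattan": manhattan,
             "cosine": cosine_distance}
cluster_counts = (2, 3)

# The "matrix of options": one score per (distance measure, k) cell.
results = {}
for (name, dist), k in product(distances.items(), cluster_counts):
    clusters = single_link_clusters(points, dist, k)
    results[(name, k)] = silhouette(points, clusters, dist)

for (name, k), score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} k={k}  silhouette={score:.3f}")
```

A score like this only ranks the cells of the matrix. Whether the top-ranked clustering is actually meaningful for the research question still has to be judged by looking at the clusters themselves.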