open-connectome-classes / StatConn-Spring-2015-Info

Clusters based on proximity #75

Open DSP137 opened 9 years ago

DSP137 commented 9 years ago

We mentioned in class that there is a way to create clusters based on distances (proximity of vertices), and k-means seemed to cluster based on distances in the sense that all nodes with the same number of edges were grouped together. For brain networks, we mentioned grouping based on regions of the brain. How would we implement a method (or what command/program would we use) to cluster vertices based on physical proximity, rather than number of connections?

mrjiaruiwang commented 9 years ago

We could add a new "field" to each vertex to represent distance, then cluster by the squared difference. We could either group everything within a "neighborhood" of radius epsilon, or group together vertices that have similar distances to other nodes.

mblohr commented 9 years ago

You could use K-means, EM, or any other clustering algorithm, and the input would be rows of points in space ([x y] for 2D or [x y z] for 3D space) corresponding to each vertex, instead of rows of an adjacency matrix (as in HW1).
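
A minimal sketch of this idea in Python, assuming each vertex comes with a 3D coordinate. The `coords` values and the `kmeans` helper are hypothetical and written with the standard library only, for illustration; a library routine such as scikit-learn's `KMeans` could be run on the same rows.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm on coordinate tuples; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct vertices
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each vertex joins its nearest center in physical space.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: move each center to the mean position of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers stopped moving
        centers = new_centers
    return labels, centers


# Hypothetical [x y z] positions, one row per vertex (not adjacency rows).
coords = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (10.0, 10.0, 10.0), (11.0, 10.0, 10.0), (10.0, 11.0, 10.0),
]
labels, centers = kmeans(coords, k=2)
print(labels)  # vertices that are physically close share a label
```

The only change from clustering adjacency rows is the input: the same algorithm runs on positions, so the number of connections plays no role in the result.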

ajulian3 commented 9 years ago

Would it be helpful in this case to use the DBSCAN algorithm? In my experience, it has been very informative for clustering spatial data.

michaelseung commented 9 years ago

DBSCAN is one of the most commonly used clustering algorithms. It is indeed very useful because it clusters based on point density. The one thing to watch for is that it does not cluster well when density varies widely across the data.
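
To make the density idea concrete, here is a small brute-force sketch of DBSCAN in Python using only the standard library. The `dbscan` helper, the `coords` values, and the `eps`/`min_pts` settings are illustrative assumptions, not something from the course; scikit-learn also provides a `DBSCAN` class. Note that `eps` is a single global radius, which is why regions of very different density are hard to handle with one setting.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Brute-force eps-neighborhoods, O(n^2); each list includes the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Hypothetical 2D vertex positions: two dense groups and one isolated vertex.
coords = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(coords, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike K-means, the number of clusters is not specified in advance, and the isolated vertex is reported as noise rather than forced into a group.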

whock commented 9 years ago

This would definitely work, but in neurological data there's an added complication: sometimes clusters (nuclei, etc.) extend mostly along one dimension. For example, the hippocampus is longer than it is wide, and other regions might be wider than they are long. So a default "field" of uniform shape would miss some of the structural variety of nuclei in the brain. Maybe you could have a toolbox of a few default fields (sphere, cylinder, etc.) and apply them as needed based on what we know about the brain?
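
One way such a non-uniform "field" might be sketched is to rescale each axis before measuring distance, so that an elongated ellipsoidal neighborhood behaves like a sphere. This is only an illustration under strong assumptions: the elongation is aligned with the coordinate axes, the per-axis length scales are known in advance, and the names `scaled_distance` and `in_region` are hypothetical.

```python
import math


def scaled_distance(p, q, scales):
    """Euclidean distance after dividing each axis by its length scale."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(p, q, scales)))


def in_region(point, center, scales):
    """True if point lies inside the axis-aligned ellipsoid with these semi-axes."""
    return scaled_distance(point, center, scales) <= 1.0


# Hypothetical region that is long along x and narrow along y and z.
center = (0.0, 0.0, 0.0)
scales = (5.0, 1.0, 1.0)

along_long_axis = in_region((4.0, 0.0, 0.0), center, scales)
along_short_axis = in_region((0.0, 2.0, 0.0), center, scales)
print(along_long_axis, along_short_axis)  # True False
```

A vertex 4 units away along the long axis falls inside the region, while one only 2 units away along a short axis does not. The same `scaled_distance` could stand in for the plain Euclidean distance in a clustering routine; a sphere is the special case where all scales are equal.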