bmschmidt / pubmed-explorer

Scrollership through 20m pubmed abstracts.

Regional Clustering #40

Open bmschmidt opened 1 year ago

bmschmidt commented 1 year ago

Probably needs to be postponed until after launch, but I spent a little time poking around at building a Delaunay triangulation and minimum spanning tree of 1,000,000 points from the dataset and then walking the tree from some random seeds to build a set of clusters. It works reasonably well, although it could use a step where nearby clusters (i.e., those sharing many short edges on the Delaunay triangulation) are merged with each other.
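For anyone picking this up later, here is a rough sketch of the idea in plain Python. It is not the code I actually ran. It assumes the Delaunay edges are already available as `(i, j)` index pairs (for example, extracted from the `simplices` of `scipy.spatial.Delaunay`), and the function names, the breadth-first walk, and the `max_len` / `min_shared` thresholds are all illustrative assumptions.

```python
import math
from collections import Counter, deque


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def minimum_spanning_tree(points, edges):
    # Kruskal: take edges shortest-first, skipping any that would close a cycle.
    parent = list(range(len(points)))
    tree = []
    by_length = sorted(edges, key=lambda e: math.dist(points[e[0]], points[e[1]]))
    for a, b in by_length:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b))
    return tree


def grow_clusters(n, tree, seeds):
    # Walk the tree breadth-first from all seeds at once. Each point takes the
    # label of whichever seed reaches it first (counted in hops, not distance).
    neighbors = [[] for _ in range(n)]
    for a, b in tree:
        neighbors[a].append(b)
        neighbors[b].append(a)
    labels = [-1] * n
    queue = deque()
    for s in seeds:
        labels[s] = s
        queue.append(s)
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if labels[v] == -1:
                labels[v] = labels[u]
                queue.append(v)
    return labels


def merge_close_clusters(points, edges, labels, max_len, min_shared):
    # Count short triangulation edges that cross between each pair of clusters,
    # then merge every pair sharing at least `min_shared` of them.
    shared = Counter()
    for a, b in edges:
        la, lb = labels[a], labels[b]
        if la != lb and math.dist(points[a], points[b]) <= max_len:
            shared[(min(la, lb), max(la, lb))] += 1
    parent = {label: label for label in set(labels)}
    for (la, lb), count in shared.items():
        if count >= min_shared:
            parent[find(parent, la)] = find(parent, lb)
    return [find(parent, label) for label in labels]
```

Seeds could be drawn with something like `random.sample(range(len(points)), k)`. The merge step is the agglomeration pass mentioned above; how short counts as "short" and how many shared edges count as "many" would need tuning against the real layout.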

One use here would be to generate labels/other characteristics for lobes.

image