Closed kscottz closed 11 years ago
I am familiar with the k-means algorithm and thought of implementing it using the scikit-learn package.
Please check whether I am correct on the following points:
Can you please explain this stuff a bit more? Thank you
So hierarchical clustering is just like k-means, but the algorithm finds k automagically. For color use avgColor, for shape use the seven or so Hu Moments, and for position just use x, y. Basically avgColor, Hu Moments, and position are all feature vectors, so you may want to have an abstraction for which metric you use for clustering. So, for example, let's say you pull out the following positions:
(0, 0) (10, 10)
(10, 100) (0, 100)
(100, 10) (100, 0)
(100, 100) (90, 90)
What I would like to see is the blob featureset broken into four smaller feature sets, grouped by the position. Does this help clarify?
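To make the grouping concrete, here is a minimal sketch of that position-based split using scikit-learn's KMeans on the eight example points (assuming k=4 is known up front, which is exactly what hierarchical clustering would let us avoid):

```python
import numpy as np
from sklearn.cluster import KMeans

# The eight example positions from above.
positions = np.array([
    [0, 0], [10, 10],
    [10, 100], [0, 100],
    [100, 10], [100, 0],
    [100, 100], [90, 90],
])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(positions)

# Break the feature set into four smaller sets, grouped by cluster label.
groups = [positions[kmeans.labels_ == k] for k in range(4)]
for g in groups:
    print(g.tolist())
```

Each of the four groups comes out as a pair of nearby points, which is the "smaller feature sets grouped by position" result described above.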
Yes, I got it. I will solve it for k-means and later try hierarchical clustering.
Do you think the stuff I used for keypoint clustering can be used for this?
I peeked at this stuff the other day. It is pretty good! I still want to do a bit of refactoring (mainly to use the ROI class and do some outlier pruning). The hierarchical clustering is good, but I would like to see a bit more flexibility in how the dendrogram gets cut up. To that end we should allow the user to select the distance metric programmatically.
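For the flexible dendrogram cutting, one way to sketch it is with scipy's `linkage`/`fcluster`, where both the distance metric and the cut threshold are caller-selectable (the metric, linkage method, and threshold here are illustrative assumptions, not settled choices):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same eight example positions as before.
points = np.array([
    [0, 0], [10, 10],
    [10, 100], [0, 100],
    [100, 10], [100, 0],
    [100, 100], [90, 90],
])

# Build the dendrogram; 'metric' could equally be 'cityblock', 'cosine', etc.
Z = linkage(points, method='average', metric='euclidean')

# Cut the tree at a chosen distance rather than a fixed k: points closer
# than t=30 end up in the same cluster, so k falls out automatically.
labels = fcluster(Z, t=30, criterion='distance')
print(labels)
```

Letting the user pass the metric and the cut criterion through to these calls would give the programmatic control described above.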
Create a featureset function for blobs that allows us to cluster blobs based on their position, color, shape, or a custom feature extractor. The method should allow for either hierarchical clustering or k-means. The results should be returned as a list of featuresets.
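A hypothetical sketch of what that method could look like (the names `cluster_blobs` and `get_feature`, and the dict-style blob fields, are assumptions for illustration, not SimpleCV's actual interface):

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

def cluster_blobs(blobs, properties=('position',), method='kmeans',
                  k=3, extractor=None):
    """Cluster blobs by position, color, shape, or a custom extractor,
    and return the result as a list of smaller feature sets."""
    def get_feature(blob, prop):
        if prop == 'position':
            return [blob['x'], blob['y']]
        if prop == 'color':
            return list(blob['avg_color'])      # e.g. avgColor
        if prop == 'shape':
            return list(blob['hu_moments'])     # the seven Hu Moments
        raise ValueError('unknown property: %s' % prop)

    # A caller-supplied extractor overrides the built-in properties.
    if extractor is not None:
        X = np.array([extractor(b) for b in blobs])
    else:
        X = np.array([sum((get_feature(b, p) for p in properties), [])
                      for b in blobs])

    if method == 'kmeans':
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    else:  # hierarchical
        labels = AgglomerativeClustering(n_clusters=k).fit_predict(X)

    # Regroup the blobs as a list of feature sets, one per cluster.
    return [[b for b, lbl in zip(blobs, labels) if lbl == c]
            for c in sorted(set(labels))]
```

Concatenating the selected property vectors keeps the abstraction over "which metric you use for clustering" in one place, and swapping `method` between k-means and hierarchical only changes the estimator, not the grouping logic.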