Closed by cwhitman 6 years ago
So k-means has some interesting limitations: it works best when the clusters are roughly normally distributed in n-dimensional space and lie along fixed axes.
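To make that limitation concrete, here is a small sketch on made-up toy data (not our network data). It compares the k-means objective (sum of squared distances to each cluster mean) for two ways of grouping two long, thin rows of points. The `inertia` helper and the toy `points` are assumptions for illustration only.

```python
def inertia(points, labels):
    """Sum of squared distances from each point to its cluster mean (the k-means objective)."""
    total = 0.0
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        total += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
    return total

# Toy data (made up): two long, thin rows of points, one unit apart.
points = [(float(x), 0.0) for x in range(10)] + [(float(x), 1.0) for x in range(10)]

by_row = [0] * 10 + [1] * 10                      # the grouping a person would pick
by_half = [0 if p[0] < 5 else 1 for p in points]  # left half vs right half

inertia_by_row = inertia(points, by_row)
inertia_by_half = inertia(points, by_half)
print(inertia_by_row, inertia_by_half)
```

The left/right split scores lower on the k-means objective than the row-wise split, so k-means with k=2 would prefer to cut each elongated row in half rather than separate the rows.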
Here's a primer for major clustering methods - I use this slide-deck / book in our data mining class.
Hierarchical clustering or CURE could be of interest, and both are basic enough that they should be straightforward to implement.
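As a rough idea of how little code the basic version needs, here is a minimal sketch of agglomerative (bottom-up) hierarchical clustering with single linkage. It is a naive illustration, not CURE and not tuned for large data; the function name `single_linkage` and the toy points are assumptions.

```python
import math

def single_linkage(points, num_clusters):
    """Naive agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) of the closest pair of clusters so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters

# Toy data (made up): two thin rows of points, three units apart.
row_a = [(float(x), 0.0) for x in range(5)]
row_b = [(float(x), 3.0) for x in range(5)]
result = single_linkage(row_a + row_b, 2)
print(result)
```

Because single linkage chains neighbouring points together, it can follow elongated shapes that a centroid-based method would cut in half. The trade-off is that this naive loop rescans all cluster pairs on every merge, so it only suits small samples.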
We decided to go with recurrent self-organizing maps. There is still a lot of work and tinkering to do with those, but at the very least we have the main algorithm down.
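For anyone new to the idea, here is a minimal sketch of a plain (non-recurrent) self-organizing map with a 1-D chain of units. It is not the recurrent variant mentioned above and not the project's code; the function names, the default parameters, and the toy data are all assumptions for illustration.

```python
import math

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

def train_som(data, num_units=4, epochs=50, lr0=0.5, sigma0=2.0):
    """Train a 1-D chain of units (num_units >= 2) on equal-length numeric tuples."""
    dim = len(data[0])
    lo = [min(p[d] for p in data) for d in range(dim)]
    hi = [max(p[d] for p in data) for d in range(dim)]
    # Deterministic init: spread the units evenly between the data's min and max corners.
    weights = [[lo[d] + (hi[d] - lo[d]) * i / (num_units - 1) for d in range(dim)]
               for i in range(num_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs      # shrinks linearly from 1 towards 0
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.1)  # keep the neighbourhood width positive
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Units close to the winner on the chain are pulled harder towards x.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights

# Toy data (made up): two small groups at opposite ends of a line.
data = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (8.0, 8.0), (9.0, 9.0), (10.0, 10.0)]
weights = train_som(data)
low_unit = best_matching_unit(weights, (0.0, 0.0))
high_unit = best_matching_unit(weights, (10.0, 10.0))
print(low_unit, high_unit)
```

After training, each input is assigned to its best matching unit, so the unit index plays the role of a cluster label. A recurrent SOM would additionally carry state from previous inputs into the matching step, which this sketch leaves out.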
Our current code uses the simple k-means algorithm to classify the network data. A more robust algorithm, such as self-organizing maps or hierarchical clustering, would likely work better. We need to try these algorithms out and see how well they perform, starting with the FTP data we already have.