khunreus / cluster-categorical

Hierarchical clustering on categorical data

This project was originally built to identify customer groups with distinctive behavior. It may also be useful as one approach to analyzing survey results measured on a Likert scale, as well as other types of categorical data.

The process behind the code is described in this blog post.

In brief, I followed the process below:

  1. generating some dummy customer data;
  2. calculating Gower distances for the dissimilarity matrix;
  3. having a look at agglomerative and divisive clustering;
  4. assessing clusters;
  5. visualizing cluster data, with some simple ways to show 6 variables across 7 clusters.
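
Steps 1 to 3 can be illustrated with a minimal Python sketch. It is not the code from this repository: the variable names, category levels, and helper functions below are all hypothetical, every variable is treated as nominal (so the Gower dissimilarity reduces to the share of variables on which two customers differ), and only agglomerative clustering with average linkage is shown. It uses the standard library only, which keeps the logic visible but is slow for large data sets.

```python
import random
from itertools import combinations


def make_dummy_customers(n, seed=0):
    """Generate n hypothetical customers described by categorical variables."""
    rng = random.Random(seed)
    levels = {  # illustrative variables and levels, not the repository's data
        "age_group": ["18-29", "30-44", "45-59", "60+"],
        "channel": ["web", "app", "store"],
        "satisfaction": ["low", "neutral", "high"],
    }
    return [{name: rng.choice(opts) for name, opts in levels.items()}
            for _ in range(n)]


def gower_nominal(a, b):
    """Gower dissimilarity for nominal variables: share of variables that differ."""
    return sum(a[key] != b[key] for key in a) / len(a)


def distance_matrix(rows):
    """Build the symmetric pairwise dissimilarity matrix."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = gower_nominal(rows[i], rows[j])
    return d


def agglomerative(d, k):
    """Average-linkage agglomerative clustering; merge until k clusters remain."""
    clusters = [[i] for i in range(len(d))]

    def avg_link(c1, c2):
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the pair of clusters with the smallest average dissimilarity.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


customers = make_dummy_customers(30)
dist = distance_matrix(customers)
clusters = agglomerative(dist, k=4)
for idx, members in enumerate(clusters):
    print(f"cluster {idx}: {len(members)} customers")
```

For ordinal variables such as Likert items, a fuller Gower implementation would typically use rank information rather than a plain match/mismatch comparison, so treat the nominal shortcut here as a simplification.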