mlampros / ClusterR

Gaussian mixture models, k-means, mini-batch-kmeans and k-medoids clustering
https://mlampros.github.io/ClusterR/

What does the parameter “seed” do in function Cluster_Medoids? #32

Closed A-Pai closed 2 years ago

A-Pai commented 2 years ago

What does the parameter “seed” do in the function Cluster_Medoids? It is a fixed (deterministic) algorithm.

mlampros commented 2 years ago

@A-Pai, the seed parameter allows reproducibility of the results. For instance,


library(ClusterR)
data(dietary_survey_IBS)
# drop the last column (the class labels) and standardize the features
dat = dietary_survey_IBS[, -ncol(dietary_survey_IBS)]
dat = center_scale(dat)
# two runs with the same seed
cm = Cluster_Medoids(dat, clusters = 3, distance_metric = 'euclidean', swap_phase = TRUE, seed = 1)
# str(cm)
cm1 = Cluster_Medoids(dat, clusters = 3, distance_metric = 'euclidean', swap_phase = TRUE, seed = 1)
# str(cm1)
# one run with a different seed
cm2 = Cluster_Medoids(dat, clusters = 3, distance_metric = 'euclidean', swap_phase = TRUE, seed = 2)
# str(cm2)

identical(cm, cm1)
# TRUE
identical(cm, cm2)
# FALSE

The documentation also describes the parameter: "seed: integer value for random number generator (RNG)".
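
As a generic illustration of what seeding a random number generator means, here is a minimal sketch in base R. It does not call ClusterR and says nothing about how Cluster_Medoids uses the seed internally; the helper `draw_with_seed` is a hypothetical name introduced only for this example.

```r
# Base R only: re-seeding the RNG with the same value replays the same
# sequence of pseudo-random draws.
# `draw_with_seed` is a hypothetical helper, not part of ClusterR.
draw_with_seed <- function(seed, n = 5) {
  set.seed(seed)       # fix the RNG state
  sample.int(100, n)   # n draws without replacement from 1..100
}

a  <- draw_with_seed(1)
b  <- draw_with_seed(1)
c2 <- draw_with_seed(2)

identical(a, b)
# TRUE
identical(a, c2)   # a different seed generally gives different draws
```

Any step of an algorithm that relies on random draws becomes repeatable in this way once the seed is fixed.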