QCBSRworkshops / workshop09

Workshop 9 - Multivariate analyses

The section on hierarchical clustering needs improvement #3

Open pedrohbraga opened 3 years ago

pedrohbraga commented 3 years ago

The section on hierarchical clustering needs substantial improvement in both clarity and content.

The explanation of the clustering algorithms in this workshop would improve considerably with a worked example that builds step by step from a distance matrix calculated for a few organisms (e.g., four or five). The first step would show the first clustering together with the first branch length estimation and the first update of the distance matrix; the following steps would proceed in the same way until the final one, where a dendrogram is displayed.

The same worked example could be used to compare other types of linkage.
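To make the suggestion concrete, here is a minimal sketch of such a step-by-step example. It is written in plain Python only so that it is self-contained; the workshop itself would presumably use R. The five organisms `A` to `E`, their distance values, and the function name `agglomerate` are all made up for illustration. At each step it merges the closest pair, records the fusion height, and updates the distance matrix under the chosen linkage.

```python
from itertools import combinations

# Hypothetical pairwise distances among five organisms (illustrative values only).
labels = ["A", "B", "C", "D", "E"]
dist = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0, ("A", "E"): 9.0,
    ("B", "C"): 5.0, ("B", "D"): 9.0, ("B", "E"): 8.0,
    ("C", "D"): 4.0, ("C", "E"): 5.0,
    ("D", "E"): 3.0,
}


def agglomerate(labels, dist, linkage="average"):
    """Return the merges as a list of (cluster_1, cluster_2, fusion_height)."""
    # Each cluster is a frozenset of leaf labels; d maps an unordered pair
    # of clusters to the current distance between them.
    clusters = [frozenset([x]) for x in labels]
    d = {frozenset([frozenset([a]), frozenset([b])]): v
         for (a, b), v in dist.items()}
    merges = []
    while len(clusters) > 1:
        # Step 1: find the closest pair in the current distance matrix.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        merged = a | b
        merges.append((sorted(a), sorted(b), height))
        # Step 2: update the distance from the new cluster to every other one.
        for c in clusters:
            if c in (a, b):
                continue
            dac, dbc = d[frozenset((a, c))], d[frozenset((b, c))]
            if linkage == "single":        # nearest neighbour
                new = min(dac, dbc)
            elif linkage == "complete":    # furthest neighbour
                new = max(dac, dbc)
            elif linkage == "average":     # UPGMA: size-weighted mean
                new = (len(a) * dac + len(b) * dbc) / (len(a) + len(b))
            else:
                raise ValueError(f"unknown linkage: {linkage}")
            d[frozenset((merged, c))] = new
        # Step 3: replace the two merged clusters with the new one.
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


# Print every step for each linkage so the dendrograms can be compared.
for linkage in ("single", "complete", "average"):
    print(linkage)
    for step, (a, b, h) in enumerate(agglomerate(labels, dist, linkage), 1):
        print(f"  step {step}: merge {a} + {b} at height {h:.3f}")
```

With these made-up distances the three linkages merge the organisms in the same order but at different heights, which is the kind of contrast the slides could highlight. How a fusion height translates into a drawn branch length depends on the convention used, so that part is left out of the sketch.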

The methods and the distinctions between them would also be easier to explain if the formulas were included in the slides.
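As a possible starting point, the update rules could be written as follows. The notation is my own assumption, not taken from the current slides: $A$ and $B$ are the two clusters being merged, $C$ is any other cluster, $d(\cdot,\cdot)$ is the current distance between clusters, and $|A|$ is the number of objects in $A$.

$$
\begin{aligned}
\text{Single linkage:} \quad & d(A \cup B, C) = \min\bigl(d(A, C),\, d(B, C)\bigr) \\
\text{Complete linkage:} \quad & d(A \cup B, C) = \max\bigl(d(A, C),\, d(B, C)\bigr) \\
\text{UPGMA:} \quad & d(A \cup B, C) = \frac{|A|\, d(A, C) + |B|\, d(B, C)}{|A| + |B|} \\
\text{Ward:} \quad & d(A \cup B, C)^2 = \frac{(|A| + |C|)\, d(A, C)^2 + (|B| + |C|)\, d(B, C)^2 - |C|\, d(A, B)^2}{|A| + |B| + |C|}
\end{aligned}
$$

The Ward line is the Lance-Williams form of the criterion and is stated on squared distances; the slides would need to say which variant the software used in the workshop actually implements.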

In addition, this workshop only covers single-linkage clustering, complete-linkage clustering and Ward's criterion. The unweighted pair group method with arithmetic mean (UPGMA) is widely used in ecology and evolution and could also be covered in this section.

An explanation of the decision on how many groups to keep should be added to this section.
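One simple heuristic that could be shown here, among several possible ones, is to cut the dendrogram inside the widest gap between successive fusion heights. The sketch below is in plain Python for self-containment; the function name `groups_by_largest_gap` and the fusion heights are hypothetical, and this is not presented as the method the workshop should settle on.

```python
def groups_by_largest_gap(fusion_heights):
    """Suggest (number_of_groups, cut_height) from a list of fusion heights.

    The cut is placed in the middle of the widest gap between two
    successive fusion heights of the dendrogram.
    """
    heights = sorted(fusion_heights)
    if len(heights) < 2:
        raise ValueError("need at least two fusion heights")
    n_objects = len(heights) + 1  # n objects produce n - 1 merges
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)  # index of the widest gap
    cut = (heights[i] + heights[i + 1]) / 2
    # The i + 1 merges below the cut have already happened,
    # so n_objects - (i + 1) groups remain.
    return n_objects - (i + 1), cut


# Hypothetical fusion heights for a dendrogram of five objects.
k, cut = groups_by_largest_gap([2.0, 3.0, 4.5, 8.0])
print(f"keep {k} groups, cutting at height {cut}")
```

The section could contrast a rule like this with other criteria, since different rules can suggest different numbers of groups on the same dendrogram.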

A short explanation of the distance metrics and a few comparisons should also be provided.
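A small comparison along these lines might help. The sketch below, in plain Python with made-up abundance data for three sites and three species, contrasts Euclidean distance with Bray-Curtis dissimilarity. The site vectors and function names are assumptions chosen for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two abundance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum of |differences| over total abundance."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


# Hypothetical abundances of three species at three sites.
site1 = [0, 1, 1]
site2 = [1, 0, 0]  # shares no species with site1
site3 = [0, 4, 4]  # same species as site1, higher abundances

for name, f in (("Euclidean", euclidean), ("Bray-Curtis", bray_curtis)):
    print(name,
          f"site1-site2: {f(site1, site2):.3f}",
          f"site1-site3: {f(site1, site3):.3f}")
```

With these values, Euclidean distance places site1 closer to site2, with which it shares no species, than to site3, while Bray-Curtis ranks the pairs the other way round. A contrast like this could motivate the choice of metric before clustering.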

Finally, an interactive exercise should be added to this section to help participants assimilate this content.

If possible, other visualization methods for the dendrograms could be added, e.g. ggdendro::ggdendrogram() or dendextend.