Visualising haplotype sharing between different cohorts to infer adaptive gene flow between places or taxa is tricky, especially for large numbers of haplotypes.
Using a dendrogram via plot_haplotype_clustering() is doable and gives you a complete view of the haplotype structure, but it's hard to see what's happening when the number of haplotypes goes above ~1000.
Using a network via plot_haplotype_network() is better, but I'm not yet sure how well this scales to larger numbers of haplotypes.
A possible alternative would be to show something simpler, along the lines of the number of haplotypes shared between different predefined cohorts. One way to visualise this would be some kind of arc diagram, analogous to something like:
This could also be efficient, as we only need to find identical haplotypes shared between cohorts, with no need for the full pairwise distance calculation. It can be done by hashing haplotypes, which is roughly O(n) in the number of haplotypes.
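To make the hashing idea concrete, here is a minimal sketch of how the sharing counts might be computed. It is not part of the existing API: the function name `count_shared_haplotypes`, the array layout (variants in rows, haplotypes in columns) and the cohort labels are all assumptions for illustration. Each haplotype is turned into a bytes key, so identical haplotypes land in the same dictionary entry in a single pass.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def count_shared_haplotypes(haplotypes, cohorts):
    """Count distinct haplotypes found in both cohorts of each cohort pair.

    Hypothetical helper, assuming:
    - haplotypes: 2D array of shape (n_variants, n_haplotypes)
    - cohorts: sequence of cohort labels, one per haplotype
    """
    # One contiguous row per haplotype, so each can be used as a bytes key.
    rows = np.ascontiguousarray(np.asarray(haplotypes, dtype=np.int8).T)
    if rows.shape[0] != len(cohorts):
        raise ValueError("need exactly one cohort label per haplotype")

    # Single pass: map each distinct haplotype to the cohorts carrying it.
    carriers = defaultdict(set)
    for row, cohort in zip(rows, cohorts):
        carriers[row.tobytes()].add(cohort)

    # Each distinct haplotype adds one to every pair of cohorts it occurs in.
    shared = defaultdict(int)
    for cohort_set in carriers.values():
        for pair in combinations(sorted(cohort_set), 2):
            shared[pair] += 1
    return dict(shared)
```

The returned dictionary maps sorted cohort pairs to counts, which could serve as the arc weights in an arc diagram. The grouping pass touches each haplotype once; building the key costs time proportional to the number of variants, and the pair-counting step grows with the number of cohorts rather than the number of haplotypes.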