ramachandran-lab / pong

Fast analysis and visualization of latent clusters in population genetic data

Integrating bootstrapping files from ADMIXTURE #4

Closed nehasavant closed 5 years ago

nehasavant commented 6 years ago

Hello!

I am using pong to visualize my ADMIXTURE Q-files, but I'm confused about one part. I used a bootstrapping method while running ADMIXTURE (I included the -B flag with 200 replicates), and for each K the output consisted of one Q file, one P file, one .se file and one .bias file. I don't have separate Q files for each "run" (i.e. I only have a single Q file for each K: k2r1, k3r1, k4r1), while the sample data set shows multiple Q files per K (i.e. k2r1, k2r2, k2r3, and k3r1, k3r2, k3r3, etc.).

Is there a different way you conducted "runs" in ADMIXTURE?

Would appreciate any and all guidance, thanks!

abehr commented 6 years ago

Hi Neha,

Great question! To generate multiple Q-files per K we actually run ADMIXTURE multiple times for each value of K. This is not strictly necessary -- you should still be able to use pong to visualize and analyze your existing data.
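As a rough illustration of the single-run case, here is a minimal Python sketch that writes a pong filemap from one Q file per K. It assumes pong's filemap layout of three tab-separated columns (run ID, K, path to the Q matrix) and ADMIXTURE's usual `<prefix>.<K>.Q` output naming; the `mydata` prefix and the `build_filemap` helper are hypothetical names used only for this example.

```python
import re
from pathlib import Path


def build_filemap(q_paths):
    """Return filemap text with one line per Q file: run ID, K, path (tab-separated)."""
    rows = []
    for path in q_paths:
        # Read K from a file name of the form <prefix>.<K>.Q (assumed naming).
        match = re.search(r"\.(\d+)\.Q$", Path(path).name)
        if match is None:
            raise ValueError(f"cannot read K from {path!r}")
        rows.append((int(match.group(1)), str(path)))
    rows.sort()  # order lines by increasing K
    # With a single run per K, every run ID ends in r1 (k2r1, k3r1, ...).
    return "".join(f"k{k}r1\t{k}\t{path}\n" for k, path in rows)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own Q files.
    print(build_filemap(["mydata.2.Q", "mydata.3.Q", "mydata.4.Q"]), end="")
```

The printed text can be saved to a file and passed to pong as the filemap. The Q-file paths should be written relative to wherever that filemap is saved.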

The reason it can be useful to generate replicate runs per K is to assess the robustness of the clusters inferred at each value of K. Because ADMIXTURE's clustering approach is stochastic, replicate runs that start from different random seeds can produce distinct solutions even when the input data and all other settings are identical. These distinct solutions can result from real biological factors, and we refer to this concept as multimodality (Jakobsson and Rosenberg, 2007; Behr et al., 2016).
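If you do want replicate runs, one possible way to organize them is sketched below in Python. It only builds a plan and does not launch ADMIXTURE. Treat the details as assumptions to check against the ADMIXTURE manual: that the random seed is set with `-s`, and that output is written to the current working directory as `<prefix>.<K>.Q`, so each replicate needs its own directory to avoid overwriting the previous one. The `plan_replicate_runs` helper, the `runs/` layout and the input path are hypothetical.

```python
def plan_replicate_runs(bed_file, k_values, n_runs, base_seed=1):
    """Return (run_id, K, seed, workdir, command) tuples; nothing is executed."""
    plan = []
    seed = base_seed
    for k in k_values:
        for r in range(1, n_runs + 1):
            run_id = f"k{k}r{r}"
            # Each replicate gets a distinct seed (assumed flag: -s).
            command = ["admixture", "-s", str(seed), bed_file, str(k)]
            # One working directory per run keeps the Q files separate.
            plan.append((run_id, k, seed, f"runs/{run_id}", command))
            seed += 1
    return plan


if __name__ == "__main__":
    # Hypothetical input path; use an absolute path so it resolves from any workdir.
    for run_id, k, seed, workdir, command in plan_replicate_runs(
        "/data/mydata.bed", k_values=[2, 3, 4], n_runs=3
    ):
        print(f"(cd {workdir} && {' '.join(command)})")
```

Each run ID (k2r1, k2r2, ...) can then be listed in the pong filemap alongside its K and the path to the Q file produced in that run's directory.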

Hope this helps. Feel free to reach out if you have any other questions!