Hello, I am trying pcadapt on pooled whole-genome data for 23 populations (pools) and about 500k SNPs. I converted a sync file from PoPoolation2 into the pooldata format used by poolfstat, then calculated allele frequencies from it to use as input for a pcadapt pool object. The resulting matrix has 23 rows (pools) and about 500,000 columns (markers).
While the PCA I plotted makes sense (the pools cluster together as I would expect from geography), the plot of log p-values (attached) looks strange. Does pcadapt sort SNPs by significance at any step? I am not sure why my plot of p-value against SNP number looks like this, with all the highly significant SNPs located at x = 0.0.
Thanks for any advice on this.