bcm-uga / pcadapt

Performing highly efficient genome scans for local adaptation with R package pcadapt v4
https://bcm-uga.github.io/pcadapt

SNP significance plot looks strange #73

Closed by NickJeff13 2 years ago

NickJeff13 commented 2 years ago

Hello, I am trying PCAdapt with pooled whole genome data for 23 populations/pools, and about 500k SNPs. I converted a sync file from Popoolation2 into pooldata format for poolfstat, and then calculated allele frequencies from there for input as a pcadapt pool object. This matrix is 23 rows by about 500,000 markers.

While the PCA I plotted makes sense (i.e. the pools cluster together as I think they should based on geography), the log p-values plot (attached) looks strange. Does pcadapt organize SNPs by significance at any step? I am not sure why my p-value vs SNP number plot looks like this, where all the highly significant SNPs are located at x=0.0.

Thanks for any advice on this.

privefl commented 2 years ago

I guess the peak is too high for any other one to be visible. Try adding + ggplot2::scale_y_log10() to the plot.
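
For anyone trying this, here is a minimal sketch of that suggestion. It assumes the plot in question is the Manhattan plot from plot(res, option = "manhattan"), which pcadapt returns as a ggplot object. The pooled example file shipped with pcadapt (pool3pops) is used only as a stand-in for the 23-pool frequency matrix from the question, and the variable names are made up for illustration.

```r
library(pcadapt)
library(ggplot2)

# Assumption: pcadapt's bundled pooled example data stands in for the
# 23 x ~500k allele-frequency matrix described in the question.
pool_path <- system.file("extdata", "pool3pops", package = "pcadapt")
pool_input <- read.pcadapt(pool_path, type = "pool")

# Run pcadapt with its default number of components for pooled data.
res <- pcadapt(pool_input)

# Manhattan plot of -log10(p-values), kept as a ggplot object.
p <- plot(res, option = "manhattan")

# Put the y-axis on a log scale so one very tall peak does not
# flatten everything else against the baseline.
p_log <- p + ggplot2::scale_y_log10()
print(p_log)
```

SNPs with a p-value of exactly 1 have -log10(p) = 0, so on a log axis they may be dropped with a warning. That should not matter here, since they are the least significant points.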