whitlock / OutFLANK

A procedure to find Fst outliers based on an inferred distribution of neutral Fst

questions about results #12

Open HuiqinYi opened 6 years ago

HuiqinYi commented 6 years ago

Dear Michael and Katie,

I am trying to run OutFLANK on my SNP dataset. There are no error messages during the run, but I don't get any outliers in my results. Here is my script; my SNP dataset and plot are attached.

```
> library(OutFLANK)
> o <- read.table("SNP.txt")
> dim(o)
[1] 23085   134
> o <- t(o)
> dim(o)
[1]   134 23085
> p <- read.csv("pop.csv")
> head(p)
  X    pop
1 1 GXHZ01
2 2 GXHZ01
3 3 GXHZ01
4 4 GXHZ01
5 5 GXHZ01
6 6 GXHZ01
> out <- MakeDiploidFSTMat(SNPmat = o,locusNames = seq(1,23085,by=1),popNames = p$pop)
Calculating FSTs, may take a few minutes...
[1] "10000 done of 23085"
[1] "20000 done of 23085"
> OF_out <- OutFLANK(FstDataFrame = out,LeftTrimFraction = 0.05,RightTrimFraction = 0.05,Hmin = 0.1,NumberOfSamples = 14,qthreshold = 0.01)
ERROR: The largest FST in the trimmed set must be < 1. Please use a larger RightTrimFraction.
> OF_out <- OutFLANK(FstDataFrame = out,LeftTrimFraction = 0.05,RightTrimFraction = 0.2,Hmin = 0.1,NumberOfSamples = 14,qthreshold = 0.01)
> OutFLANKResultsPlotter(OF_out,withOutliers = T,NoCorr = T,Hmin = 0.1,binwidth = 0.005,titletext = "outliers")
> outliers <- which(OF_out$results$OutlierFlag=="TURE")
> print(outliers)
integer(0)
```

Many of the FST values calculated by the program are 1, so I wonder whether OutFLANK is unsuitable for highly differentiated populations. Or perhaps there is something wrong with my script or data preparation, but I cannot figure out what it is. I would be really thankful if you could help me with this problem. Thank you in advance; I hope to hear from you.

Huiqin Yih

SNP.txt

HuiqinYi commented 6 years ago

pop.txt Rplot01.pdf

abcosta commented 6 years ago

Hi,

Did you receive an answer to your question? I'm having exactly the same problem. My results are quite similar, and I'm comparing data sets that are potentially different species. So I am also wondering whether there is something wrong with my data or whether it simply means there are no outliers.

rplot_outflank

DrK-Lo commented 6 years ago

Hello, OutFLANK assumes that the FST data are chi-squared distributed. It appears from looking at your FST plot that the data are not chi-squared distributed. Sometimes when studying closely related species there may be admixture, and OutFLANK has been shown to have low performance in this scenario (see https://doi.org/10.1111/1755-0998.12592).
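A separate point that may be worth ruling out in the script from the original post: the outlier lookup compares `OutlierFlag` against the string `"TURE"`. Assuming `OutlierFlag` is a logical (TRUE/FALSE/NA) column, which `str(OF_out$results)` will confirm, that comparison can never match, so `which()` returns `integer(0)` whatever OutFLANK found. A minimal base-R sketch with a made-up flag vector (hypothetical values, not the data from this thread):

```r
# Hypothetical stand-in for OF_out$results$OutlierFlag (assumed to be logical).
flag <- c(FALSE, TRUE, NA, TRUE)

# Comparing with a misspelled string: logicals are coerced to "TRUE"/"FALSE",
# so nothing ever equals "TURE".
wrong <- which(flag == "TURE")

# Indexing on the logical vector directly; which() skips NA entries.
right <- which(flag)

print(wrong)
print(right)
```

If the flags are indeed logical, `which(OF_out$results$OutlierFlag)` is the more direct lookup. This does not change the point about the chi-squared assumption: if the trimmed FST distribution fits poorly, there may still be few or no outliers.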