whitlock / OutFLANK

A procedure to find Fst outliers based on an inferred distribution of neutral Fst

Weird binning in Heterozygosity values #33

Open jayyeam opened 1 year ago

jayyeam commented 1 year ago

Hello!

I am working on a dataset that has 70 samples and ~540,000 SNPs grouped into two different populations. I was working through the tutorial and created a plot of outlier$results$FST against outlier$results$He. However, across my 540,000 SNPs, I only find 131 unique He values. I have followed the tutorial precisely and haven't made any adjustments, so I am curious whether this is to be expected. Here is the code:

outlier <- OutFLANK(FstDataFrame, NumberOfSamples = 70, RightTrimFraction = 0.06, LeftTrimFraction = 0.35, qthreshold = 0.05, Hmin = 0.1)
OutFLANKResultsPlotter(outlier, withOutliers = TRUE, NoCorr = TRUE, Hmin = 0.1, binwidth = 0.005, Zoom = FALSE, RightZoomFraction = 0.05, titletext = NULL)

plot(outlier$results$He, outlier$results$FST, pch=20, col="grey")
points(outlier$results$He[outlier$results$qvalues<0.01], y = outlier$results$FST[outlier$results$qvalues<0.01], pch=21, col="blue")

[image: heterozygosityplot.pdf]

I have also created a Manhattan plot using the qvalues from OutFLANK, and it appears normal. So I am hoping to get some insight into why the heterozygosity looks so strange.

[image: manhattan_qvalues_outflank.pdf]

Thank you! Jay

DrK-Lo commented 1 year ago

Hi Jay,

In this case, "NumberOfSamples" should be 2 because you have 2 populations, but that doesn't affect He. The number of unique He values is, in principle, determined by the number of individuals in your dataset.
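To illustrate why He takes only a limited set of values, here is a minimal sketch. It assumes a biallelic SNP, no missing genotypes, and He computed as 2p(1-p) from the pooled allele frequency; these are simplifying assumptions, not a description of OutFLANK's internal code, and `unique_he_values` is a hypothetical helper name. Under those assumptions, n diploid individuals give 2n allele copies, and because p and 1-p yield the same He, there are n + 1 distinct values.

```r
# Hypothetical helper (not part of OutFLANK): enumerate the distinct
# expected-heterozygosity values possible for a biallelic SNP genotyped
# in n diploid individuals with no missing data.
unique_he_values <- function(n_individuals) {
  n_alleles <- 2 * n_individuals   # allele copies sampled per SNP
  k <- 0:n_alleles                 # possible counts of one allele
  p <- k / n_alleles               # possible allele frequencies
  he <- 2 * p * (1 - p)            # expected heterozygosity (assumed formula)
  # Round before deduplicating so that p and 1 - p, which give the same He,
  # are not kept apart by floating-point noise.
  sort(unique(round(he, 12)))
}

he_70 <- unique_he_values(70)
length(he_70)        # distinct He values possible for 70 individuals
sum(he_70 >= 0.1)    # how many of those would pass an Hmin = 0.1 filter
```

If some genotypes are missing, the number of allele copies differs between SNPs, so additional He values become possible; that could be one reason a real dataset shows somewhat more unique values than this idealized count.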

