Closed by abcosta 3 years ago
Those might be low MAF SNPs?
I have already filtered for MAF, but I think I'll raise the threshold above the recommended value and check whether it helps.
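One way to check the low-MAF idea is to compute the minor allele frequency per SNP and count how many fall below the cutoff. The sketch below uses base R on a simulated 0/1/2 genotype matrix; the matrix `geno`, its dimensions, and the 0.05 cutoff are assumptions for illustration, not the actual data. If I recall the pcadapt documentation correctly, SNPs below its `min.maf` argument get NA p-values, which would fit the "non-finite values" warning.

```r
# Hypothetical genotype matrix: 24 individuals (rows) x 1000 SNPs (columns),
# coded as 0/1/2 copies of the alternate allele. Simulated, not the real data.
set.seed(1)
n_ind <- 24
n_snp <- 1000
alt_freq <- runif(n_snp, min = 0, max = 0.5)
geno <- sapply(alt_freq, function(p) rbinom(n_ind, size = 2, prob = p))

# Alternate allele frequency per SNP, folded to the minor allele frequency.
af <- colMeans(geno) / 2
maf <- pmin(af, 1 - af)

# SNPs below the chosen threshold (0.05 is an assumed cutoff).
maf_threshold <- 0.05
low_maf <- which(maf < maf_threshold)

# Number of SNPs that would be dropped at this threshold.
length(low_maf)
```

If the count on the real data matches the number of rows removed in the warning, low-MAF SNPs are the likely cause.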
It is difficult to answer your question without seeing the data. If you can give us access to your data, I can run the scripts and check what is going on.
Hi,
Thank you! I'm sending you the data file.
Best, Ana
Has this been fixed?
Hi,
I'm currently working with a VCF file containing 24 individuals and 43,114 SNPs (no missing data and filtered for LD). I'm trying to check for outliers in my VCF file using both PCAdapt and Outflank; however, I have been facing issues with both methods. When I run the PCAdapt script, I receive the following message after "plot(res,option="stat.distribution")":
Warning message: Removed 66 rows containing non-finite values (stat_bin)
If I continue with the script, "plot(-log10(res$pvalues))" gives a plot with -log10(res$pvalues) ranging from 0 to 40 (with most of the points below 10), and "outliers <- which(qual < alpha)" returns thousands of loci as outliers.
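For reference, the thresholding step can be sketched in base R with a Benjamini-Hochberg adjustment via `p.adjust`. This is a swapped-in alternative to whatever correction the original script applies, and the p-values, the two NA positions, and `alpha = 0.1` are simulated assumptions rather than the real results.

```r
# Simulated p-values: mostly null (uniform) plus a few very small ones.
# Two entries are set to NA to mimic SNPs without a computed p-value.
set.seed(7)
pvalues <- c(runif(950), rbeta(50, 0.05, 1))
pvalues[c(10, 20)] <- NA

# Benjamini-Hochberg adjustment; NA entries stay NA and are not counted
# among the tests.
padj <- p.adjust(pvalues, method = "BH")

# Loci below the assumed false discovery rate threshold.
# which() silently drops NA comparisons.
alpha <- 0.1
outliers <- which(padj < alpha)
length(outliers)
```

Comparing the number of loci flagged before and after adjustment shows how much of the outlier list depends on the correction and on the choice of `alpha`.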
Interestingly, when I ran Outflank on the same data set, it indicated the presence of 66 outliers (the same number as in the warning message above; could these be the same loci?). But when I ask it to print the outliers, it only lists NAs, so I cannot figure out which loci are the outliers.
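To test whether the two sets of 66 loci coincide, the positions of the non-finite p-values can be pulled out and intersected with the loci flagged by the other method. A minimal base R sketch, where `res` is a mock list standing in for the real result object and `other_flagged` is a hypothetical vector of indices from the second method:

```r
# Mock stand-in for a result object with a `pvalues` vector.
# Three entries are NA on purpose; the real object would come from the analysis.
set.seed(42)
pvalues <- runif(500)
pvalues[c(3, 57, 200)] <- NA
res <- list(pvalues = pvalues)

# Positions of SNPs whose p-value is NA, NaN, or infinite.
na_loci <- which(!is.finite(res$pvalues))

# Hypothetical indices flagged by the second method.
other_flagged <- c(3, 57, 310)

# Loci that appear in both sets.
shared <- intersect(na_loci, other_flagged)
shared
```

If `shared` covers all of the flagged loci on the real data, the "outliers" reported as NA are most likely SNPs for which no statistic could be computed, not true outliers.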
Therefore, I was wondering if you could help me identify the outliers in my data set. I am not sure whether there is a problem with my VCF file or something wrong with the script I am following.
Thank you, Ana