Open peterinnes opened 5 years ago
Did you ever figure this out? I am now running into this issue myself.
No, I haven't figured it out. Kinda gave up on it for the time being. I'll write here otherwise.
I'm also running into this issue. I'll post if I figure it out, and I'd be interested to hear if anyone has come up with a solution!
I'm running OutFLANK and got the same issue. I've tried with the function 'gl.outflank' and with 'WC_FST_Diploids_2Alleles', and both resulted in the same problem. Don't know what else to do...
Hi guys,
I've had this same error. I'm not sure why, but when I filtered my SNP data more stringently it resolved the issue.
Hope this helps! Cheers
I'm facing this issue, too! @nek001 what were the new filtering settings that you applied to your SNP data?
I think I solved it for my dataset by having more stringent missingness filtering.
This matters most when you have uneven numbers of individuals per population, e.g. I have two populations with 10 individuals in one and 28 in the other. If the genotypes at a locus are entirely missing in one population, for example in the one with 10 individuals, it is not possible to calculate FST for that locus and this error is thrown. To solve it, make sure your missingness filter (e.g. bcftools filter -e 'F_MISSING > value') is stringent enough that no SNP remains where the genotypes of every member of any one population are missing. In my case the smaller population makes up 10/38 ≈ 0.26 of the samples, so the allowed missingness has to stay below that.
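A per-population check makes the same point more directly than a global missingness threshold. The sketch below is only an illustration in base R: the toy matrix, the population labels, and the variable names are made up, and it assumes the layout OutFLANK expects (individuals in rows, loci in columns, genotypes coded 0/1/2 with 9 for a missing call).

```r
# Hypothetical toy data: rows are individuals, columns are loci.
# Genotypes are coded 0/1/2, with 9 for a missing call (assumed OutFLANK convention).
pops <- c(rep("A", 3), rep("B", 4))
geno <- cbind(
  snp1 = c(0, 1, 2, 0, 1, 1, 2),
  snp2 = c(9, 9, 9, 0, 1, 2, 2),  # every call in population A is missing
  snp3 = c(0, 9, 1, 2, 9, 1, 0)   # some calls missing, but no population is empty
)

# called[p, l] = number of non-missing genotypes in population p at locus l
called <- rowsum((geno != 9) * 1, group = pops)

# Keep a locus only if every population has at least one called genotype
keep <- colSums(called == 0) == 0
geno_filtered <- geno[, keep, drop = FALSE]

print(called)
print(colnames(geno_filtered))
```

Here snp2 is dropped because population A has no calls at all, while snp3 is kept even though it has missing data. The same `keep` vector would also have to be applied to the locus names before passing anything on to OutFLANK.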
I hope this helps!!
Thanks Le!
I ended up filtering missingness by population too (what parameters exactly, I can't remember).
However, I was just wondering about the implications of this. By filtering out SNPs that are missing in one population but possibly fixed or at high frequency in another, could we be removing SNPs under selection, or would these more likely be SNPs associated with demographic processes?
Hey nek001,
Great question! I think they could certainly be due to either selection or demographic processes, but maybe outflank wouldn't be the best way to distinguish between the two... If you had linkage data, you might be able to detect selective sweeps at a locus that was under strong selection, whereas sites fixed due to demography may be more likely at recombination hotspots.
Has anyone solved this?
Hi paulocecco,
Try removing SNPs that are absent from one population (when comparing two populations). I did manage to get it working by doing that; however, you may need to consider the consequences of this. I actually ended up using a different method for my research in the end.
The PLINK filter is supposed to do that, but I'm going to do it manually. If I may ask nek001, what did you do then?
I just compared my .map files between the populations, and they share all the same markers. So it's not a problem of marker differences. Any help here?
Hey everyone,
I also had the same error when I used gl.outflank.
Starting gl2gi
Processing genlight object with SNP data
Matrix converted.. Prepare genind object...
Completed: gl2gi
Calculating FSTs, may take a few minutes...
[1] "10000 done of 67280"
[1] "20000 done of 67280"
Error in if (s2 == 0) { : missing value where TRUE/FALSE needed
I solved this error by removing loci that are all NA within a population, like @lqch did, but I used the dartR tools to do it. I tried the genind format first, but it didn't work, so I converted the data to genlight and then filtered again.
Here is what I did:
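library(dartR)
gl2 <- gi2gl(vcf.genind)                    # convert the genind object to genlight
gl2 <- gl.filter.allna(gl2, by.pop = TRUE)  # drop loci that are all NA within any population
outflnk <- gl.outflank(gl2, qthreshold = 0.05, plot = FALSE)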
Hope the scripts help those who meet the NA error! Thanks for good suggestions from everyone!
Hi! I found the error. As peterinnes mentioned, the error occurs in that line of code (Fst Diploids.R line 37). I traced it back to the function getFSTs_diploids (Fst Diploids.R line 116). Lines 119 and 120 remove the individuals with no call, and at some loci that leaves a whole population without values. The Sample_Mat argument of WC_FST_Diploids_2Alleles then contains only one population, so n_pops - 1 is zero and the formula for s2 divides by zero. With a single population the numerator of that formula is zero as well, so s2 comes out as 0/0 = NaN and the check if (s2 == 0) cannot be evaluated. This is why more stringent missingness filtering, as suggested by lqch, sometimes resolves the problem. The reliable fix, until the code can handle this scenario, is to filter by population, as recommended by nek001. Thanks to everyone!
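To make the failure concrete, here is a minimal base R sketch of the s2 formula quoted in the original post. The wrapper function `s2_from` is hypothetical, and the definitions of `n_ave` and `p_ave` (mean sample size and sample-size-weighted mean allele frequency) are assumptions made for illustration, not a copy of the OutFLANK source.

```r
# s2 as quoted in the thread, wrapped in a hypothetical helper for illustration.
# n_ave and p_ave are assumed definitions, not taken from the OutFLANK source.
s2_from <- function(sample_sizes, p_freqs) {
  n_pops <- length(sample_sizes)
  n_ave  <- mean(sample_sizes)
  p_ave  <- sum(sample_sizes * p_freqs) / sum(sample_sizes)
  sum(sample_sizes * (p_freqs - p_ave)^2) / ((n_pops - 1) * n_ave)
}

s2_two <- s2_from(c(10, 28), c(0.2, 0.5))  # two populations: a finite number
s2_one <- s2_from(28, 0.5)                 # one population left: 0 / 0

print(s2_two)
print(s2_one)         # NaN
print(is.na(s2_one))  # TRUE, so if (s2_one == 0) has no TRUE/FALSE to test

# Capture the resulting error message without stopping the script
msg <- tryCatch(if (s2_one == 0) "zero" else "nonzero",
                error = function(e) conditionMessage(e))
print(msg)
```

With two populations s2 is an ordinary number. With one population both the numerator and the denominator are zero, and the `if` raises the error reported in this thread.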
Dear Katie and Michael,
I'm hitting an error when using the MakeDiploidFSTMat() function and am hoping you can help me out. The error is:
Calculating FSTs, may take a few minutes...
Error in if (s2 == 0) { : missing value where TRUE/FALSE needed
Based on this, the error seems to be coming from the WC_FST_Diploids_2Alleles() function, specifically the following lines:
s2 = sum(sample_sizes*(p_freqs - p_ave)^2)/((n_pops-1)*n_ave)
if(s2==0){return(0); break}
In my case I think s2 has a value of NA, thus the error "missing value where TRUE/FALSE needed". Do you think this is the case? If so, do you have any idea where this NA is coming from/how to fix this error?
I've attached my data, locinames, and population list below.
Thanks, Peter
OutFLANK_data.txt.gz
OutFLANK_locinames.txt.gz
OutFLANK_Pop_list.txt