Closed Erikacj closed 7 months ago
Thanks @Erikacj . Could you provide an example data set reproducing the error? I am suspicious that you are using dosage data with a function that works with fstat format data. You could also check the closed issues about boot.ppfst
.
Yes, @jgx65, you're correct. I was using dosage data - SNPs. What is the best way to find the population pairwise significance using dosage data? The boot.ppbetas function?
If you can assume your markers are independent, you could convert the dosage format to fstat format (by replacing 0 with 11, 1 with 12 and 2 with 22) and use boot.ppfst
on this new dataset. If you can't assume independence, hierfstat
does not have yet a specific function to carry out the test. I suggest you chop the genomes in blocks you assume are independent, calculate Fst for each of these blocks, and bootstrap them, as we do in Goudet & Weir 2023, Fig S4.
Hi Jerome. Thanks for the help. I was importing the vcf incorrectly before. It was reducing the data and made it look like dosage data. Now that I have multilocus genotypes coded correctly I can retrieve population pairwise Fst using pairwise.WCfst and test for population pairwise significance using boot.ppbetas. I just want to double check that I understand correctly, for some populations the output of boot.ppbetas is all negatives above and below diagonal, does this mean that these comparisons are insignificant?
On Mon, Apr 1, 2024 at 6:33 AM Jerome Goudet @.***> wrote:
If you can assume your markers are independent, you could convert the dosage format to fstat format (by replacing 0 with 11, 1 with 12 and 2 with 22) and use boot.ppfst on this new dataset. If you can't assume independence, hierfstat does not have yet a specific function to carry out the test. I suggest you chop the genomes in blocks you assume are independent, calculate Fst for each of these blocks, and bootstrap them, as we do in Goudet & Weir 2023, Fig S4 https://urldefense.com/v3/__https://journals.plos.org/plosgenetics/article?id=10.1371*journal.pgen.1010871__;Lw!!PvDODwlR4mBZyAb0!RqQ2rnzEOBZyU5tcb636zQ-PbVc7fykRZu4hX7AYWBoig2wfGQMv_ei8sON-um1D8IEkgzvMJNbOke2zseMTip1inw$ .
— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/jgx65/hierfstat/issues/71*issuecomment-2030113288__;Iw!!PvDODwlR4mBZyAb0!RqQ2rnzEOBZyU5tcb636zQ-PbVc7fykRZu4hX7AYWBoig2wfGQMv_ei8sON-um1D8IEkgzvMJNbOke2zseMSiLPqaA$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AC64DDKOSLWZQZG4DIWDJ4LY3GD3XAVCNFSM6AAAAABFNZCRNSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMZQGEYTGMRYHA__;!!PvDODwlR4mBZyAb0!RqQ2rnzEOBZyU5tcb636zQ-PbVc7fykRZu4hX7AYWBoig2wfGQMv_ei8sON-um1D8IEkgzvMJNbOke2zseMrrjuftA$ . You are receiving this because you were mentioned.Message ID: @.***>
If you are willing to assume random mating within populations and the loci are independent. You can also check if the results are consistent with hierfstat::boot.ppfst
(where random mating within populations is not assumed)
Gotcha. Thank you!
On Wed, Apr 3, 2024 at 6:14 AM Jerome Goudet @.***> wrote:
If you are willing to assume random mating within populations and the loci are independent. You can also check if the results are consistent with hierfstat::boot.ppfst (where random mating within populations is not assumed)
— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/jgx65/hierfstat/issues/71*issuecomment-2035028927__;Iw!!PvDODwlR4mBZyAb0!RsuN5KuoGIviavdYqAdy2BqFqJi3I8r0RDxNuuyBx-l7l2DU49nQS_pk9Z-bWzPhAwRqAGC3_I9tD_2gGDkETwp8Vw$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AC64DDNVKJKMPSLDAOVVYA3Y3QTETAVCNFSM6AAAAABFNZCRNSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMZVGAZDQOJSG4__;!!PvDODwlR4mBZyAb0!RsuN5KuoGIviavdYqAdy2BqFqJi3I8r0RDxNuuyBx-l7l2DU49nQS_pk9Z-bWzPhAwRqAGC3_I9tD_2gGDk2rhWWmA$ . You are receiving this because you were mentioned.Message ID: @.***>
I can run pairwise.WCfst on my dataset without issue but when I try to run:
boot <- boot.ppfst(data, nboot=100, quant=c(0.025,0.975))
I get the following error:
Error in apply(x, 2, fun2 <- function(x) sum(x > 0)) : dim(X) must have a positive length In addition: Warning messages: 1: In max(logical(0), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 2: In max(logical(0), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 3: In max(logical(0), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 4: In min(logical(0), na.rm = TRUE) : no non-missing arguments to min; returning Inf 5: In max(logical(0), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 6: In max(logical(0), na.rm = TRUE) : no non-missing arguments to max; returning -Inf 7: In names(data.al)[-c(1:2)] <- names(data)[-1] : number of items to replace is not a multiple of replacement length
This is what my dataset looks like:
'data.frame': 181 obs. of 344 variables: $ Strata: int 51 51 51 51 51 51 51 51 51 51 ... $ 1 : int 2 1 1 1 NA 1 1 2 1 1 ... $ 2 : int 1 2 1 1 NA 1 1 1 1 1 ... $ 3 : int 2 2 2 2 NA 2 2 2 2 2 ... $ 4 : int 1 2 1 1 NA 1 1 1 1 1 ... $ 5 : int 2 2 NA 2 2 1 1 2 1 2 ... $ 6 : int 1 1 NA 1 2 1 1 2 2 2 ... $ 7 : int 1 1 1 1 NA 1 2 1 1 1 ... $ 8 : int 1 1 1 1 NA 1 2 1 1 1 ... $ 9 : int 2 2 2 2 NA 2 2 2 2 2 ... $ 10 : int 2 1 2 2 NA 2 2 2 2 2 ... $ 11 : int 2 1 2 2 NA 2 2 2 2 2 ... $ 12 : int 2 1 2 2 NA 2 2 2 2 2 ... $ 13 : int 1 1 1 1 NA 2 2 1 2 1 ...
What am I doing wrong to not be able to get boot.ppfst to work?