Open PaulaEB opened 1 year ago
Hey @PaulaEB,
Thanks for trying out {updog}!
I haven't gotten around to allowing for multiparent populations yet. Some things you can look into:
If the answer is yes to both, then combining the different populations would not help much. Estimating the parent genotypes and those parameters is the benefit of using a larger sample size.
As for the missing data, if an individual has NA listed, then it should provide NA in the output. If it has 0 listed for the read-depth, then {updog} will impute the genotype from the prior distribution (which is the best you can do if you aren't using information from other SNPs). E.g. consider:
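library(updog)

# Individual 3 has a read-depth of 0, so its genotype is imputed from the prior
refvec <- c(3, 4, 0, 8, 3)
sizevec <- c(10, 10, 0, 10, 10)
fout <- flexdog(refvec = refvec, sizevec = sizevec, ploidy = 4)
fout$geno
# The posterior for individual 3 should fall on the line y = x against the estimated genotype distribution
plot(fout$postmat[3, ], fout$gene_dist)
abline(0, 1)

# Individual 3 is NA in the input, so its genotype is NA in the output
refvec <- c(3, 4, NA, 8, 3)
sizevec <- c(10, 10, NA, 10, 10)
fout <- flexdog(refvec = refvec, sizevec = sizevec, ploidy = 4)
fout$geno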
Best, David
Hello @dcgerard, many thanks for your clarification! I am going back to this data, but I would like to keep the zero-depth (DP = 0) genotypes as missing, since GATK marks missing values as DP=0 (https://gatk.broadinstitute.org/hc/en-us/articles/6012243429531-GenotypeGVCFs-and-the-death-of-the-dot)
Is it possible to change that from within updog, or should I do it in the VCF with another tool?
Thanks again Paula
Hey @PaulaEB,
You can do that in R really easily.
E.g., suppose this is the matrix containing the read-depths:
sizemat <- matrix(c(0, 1, 2, 1,
1, 0, 1, 1,
1, 2, 1, 0), ncol = 4, byrow = TRUE)
Then we can convert those 0's to NA's via:
sizemat[sizemat == 0] <- NA
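If a matrix of reference counts is passed alongside the read-depths, it is probably safest to mask the same entries in both, so that a zero-depth cell is NA everywhere. A minimal sketch in base R, assuming a reference-count matrix called refmat with the same dimensions as sizemat (the name and its values are made up for illustration):

```r
# Read-depth matrix from the example above
sizemat <- matrix(c(0, 1, 2, 1,
                    1, 0, 1, 1,
                    1, 2, 1, 0), ncol = 4, byrow = TRUE)

# Hypothetical reference-count matrix with the same dimensions
refmat <- matrix(c(0, 1, 1, 0,
                   1, 0, 0, 1,
                   0, 2, 1, 0), ncol = 4, byrow = TRUE)

# Record the zero-depth cells once, then mask them in both matrices
zero_depth <- sizemat == 0
sizemat[zero_depth] <- NA
refmat[zero_depth] <- NA

# Number of cells now treated as missing
sum(is.na(sizemat))
```

Recording the zero-depth cells in zero_depth before overwriting anything keeps the two matrices aligned, since the mask is computed from the original depths only.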
Cheers, David
Hello David, Thanks for developing updog! My project goal is to identify QTLs for pest resistance, so we have a multiparent population similar to a NAM population (4 pollen recipients and one pollen donor), which gives us four half-sib families. We are treating each family separately, but I'd like to know your thoughts on whether it's possible to use the whole population for genotype calling.
One last question, about missing data in the geno field: in the multidog$inddf output we don't see any missing data. Is this normal?
Thank you very much! Paula E