Open nirwan1265 opened 11 months ago
Hi nirwan, did you manage to fix it? Best, Rafael
Hi Rafael,
Sorry for the late reply. I was making some changes to the SNP calling protocol. After much filtering with the GATK protocol, using different parameters for the prior distribution and heterozygosity, I was able to get about 30k SNPs on average.
Since our population is backcrossed and selfed for 3-4 generations, we expect 86% homozygous reference, 3% heterozygous and 11% homozygous alternate. We have an average sequencing depth of 0.8 for our population. I ran with autotune on, a sequencing depth of 0.8, and a crossover value of 0.01 (calculated from previous data, about 3 crossovers per chromosome), and this is the result I got. This is one of the samples: GenotypePlot_SID1328.pdf
We only see heterozygous calls, but I think we can assume they are introgressed regions. I have a few questions: 1.) Is the number of alternate calls enough? 2.) Can we take the selfing generations into account in the backcross model? 3.) Can we specify the expected introgression from teosintes so the program can use it as a prior?
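For anyone checking the expected proportions quoted above, here is a minimal sketch of the per-locus arithmetic. It assumes a textbook design that the thread does not spell out in full: an F1 backcrossed to the reference (recurrent) parent, followed by selfing, with no selection and independent Mendelian segregation. The function name `expected_genotype_freqs` is made up for illustration and is not part of RTIGER.

```python
from fractions import Fraction


def expected_genotype_freqs(n_backcross, n_self):
    """Expected (hom_ref, het, hom_alt) frequencies at a single locus.

    Assumptions (not taken from RTIGER): the F1 is fully heterozygous,
    every backcross is to the reference parent, and there is no selection.
    """
    # Each backcross to the reference parent halves the heterozygous fraction.
    het = Fraction(1, 2) ** n_backcross
    hom_ref = 1 - het
    hom_alt = Fraction(0)
    # Each selfing halves the heterozygous fraction again; the half that is
    # lost splits equally between the two homozygous classes.
    for _ in range(n_self):
        hom_ref += het / 4
        hom_alt += het / 4
        het /= 2
    return hom_ref, het, hom_alt


for n_self in (3, 4):
    ref, het, alt = expected_genotype_freqs(2, n_self)
    print(f"BC2S{n_self}: hom_ref={float(ref):.4f} het={float(het):.4f} hom_alt={float(alt):.4f}")
```

Under these assumptions, two backcrosses followed by three selfings give 55/64, 1/32 and 7/64, which round to the 86% / 3% / 11% mentioned above. A fourth selfing halves the heterozygous fraction again.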
Hi Nirwan,
Your plot shows that the reference genotype is in the majority all along your chromosomes, except chromosome 10, which is mostly heterozygous. Chromosomes 7, 8 and 9, however, consist entirely of the reference genotype. Looking at your data set, I wonder why you do not see that 11% of homozygous alternate. From a visual inspection, there appear to be very few positions with homozygous alternate counts at all (the allele frequency ratio rarely reaches -100), while the reference ratio is close to 100 quite often. I can think of two possible problems:
Have you tried the method we suggest in our README file? Could you check these possibilities and get back to me? Could you also tell me which R value RTIGER has estimated?
Regarding your questions 2 and 3: we have not added these options to the model because we only use the data to estimate the crossovers. This means that, given a good density of data points and counts, information on the selfing and introgression shouldn't be necessary. Our EM algorithm should be able to estimate the right underlying genotype without any problems.
Thank you a lot, Rafael
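To put the "good density of data points and counts" condition next to the 0.8x depth reported above, here is a rough back-of-the-envelope sketch. It assumes read counts per site follow a Poisson distribution with the mean depth and that each read at a heterozygous site samples either allele with probability 0.5. Both are simplifying assumptions of this sketch, not a description of RTIGER's model, and `coverage_stats` is a hypothetical helper name.

```python
import math


def coverage_stats(mean_depth):
    """Return (p_covered, p_both_alleles) for one site.

    Assumes Poisson-distributed read depth and unbiased allele sampling;
    these are illustrative assumptions only.
    """
    # Probability that a site receives at least one read.
    p_covered = 1 - math.exp(-mean_depth)
    # At a heterozygous site, ref and alt read counts are independent
    # Poisson(mean_depth / 2); both alleles are seen only if both counts >= 1.
    p_both_alleles = (1 - math.exp(-mean_depth / 2)) ** 2
    return p_covered, p_both_alleles


p_cov, p_both = coverage_stats(0.8)
print(f"P(site covered) = {p_cov:.3f}")
print(f"P(both alleles seen at a het site) = {p_both:.3f}")
```

Under these assumptions, most individual heterozygous sites will show reads from only one allele at this depth, so genotype calls have to come from many neighbouring markers taken together rather than from any single position.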
EDIT: I understand the problem now. I have some selfing in the population, which will cause some level of homozygosity for the alternate allele, and hence this will not work. [Keeping this issue open for anyone else]
Hello, Data description: I have a backcross population of BC2S4 lines. For testing, I am using 3 samples and 3 chromosomes.
I am running the function using nstates=2,
I get the following error:
It did work for nstates=3, however.
Any help is appreciated.