Closed Amelia-uoft closed 7 years ago
What mutation operator do you use?
Sorry. I am not sure what is mutation operator. But my code to simulate the data is as below: python srv.py --gui=batch --selDist=gamma3 --regions=chr1:1..100000 --popFile=s_g3.pop --markerFile=s_g3.map --mutantFile=s_g3.mut
I see, are you using srv
from here? This script by default uses a diallelic mutation model which should yield even mutations across the region. Anyway,
By default a finite sited mutation model is used in which a mutant could be mutated back to wildtype allele. However, what you observed could potentially be explained by the alternative infinite-site model where repeated mutations would be ignored, thus causing lower observed mutation rates at smaller regions.
How do you measure mutation rate? Do you simply count number of mutations in the mut file? In this case the mutations are final result of the evolutionary process, not the mutations that have been introduced to the population during evolution. According to this line of code, if you use option --verbose 2
, you could get a mutations.lst
file with all the mutations.
My 100kb region was simulated 2years ago, 50kb and 300kb were simulated in Nov 2016. Do you think this problem is because of the changes of "srv" that has been made in the last two years?
After obtaining my simulated data, I only keep variants with MAF less than 1% and greater than 0.05%. Also, I coded each locus as 0 or 1, 0 means two major alleles, 1 means at least one minor allele (we call this as one mutation). The mutation rate is calculated based on each gene, which is number of mutations divided by number of sites in the region.
I am not sure whether our observed difference in the mutation rate is related to your model setting or not.
I can only say that in theory the mutation rates should not be correlated with length of regions. If you really would like to resolve this, you can
Use --verbose 2
to re-run the simulation to get the raw mutations from mutations.lst
. This will let us know if the mutations are generated correctly. You could use finite and infinite-site models to see if there is a difference.
If mutations are generated correctly and you still observe the differences that you described. You can try to disable natural selection and see if gene length and/or genetic drift is causing the observation, and then restore natural selection. You might find something that can explain the results.
Thanks for your suggestions!
Hi, I have a question about the mutation rate. I used simuPOP to simulate rare variants on 50kb, 100kb, and 300kb region separately. From our simulation results, we found that the mutation rate (probability to observe minor allele at single locus) increases when the size of the region increases. That is to say, 50kb region has lower mutation rate than 100kb region, and 100kb region has lower mutation rate than 300kb region. I am wondering whether such difference is designed/implied by the simulation model used in simuPOP. Thank you!