Open AlLee-IDM opened 3 years ago
[PLOT hist of oocyst density as drawn from GenEpi]
[PLOT serial intervals] JRR
[PLOT generation times] JRR
[PLOT of geometric mean of gam density by infection age?] JRR
[PLOT heterozygosity by site still at 0.1 rather than 0.5 in importation scenario?] JVR
[PLOT of number of nodes (parasite clones) over time] JVR
Discussion 7.1.2021
REAL MCCOIL assumes uniform distribution by site real samples (high transmission scenario?)
MAF counted across all individuals, not aggregated by sample? Not highly variable?
Not replicating the right co-transmitted or superinfecting diversity for mixed infections
First validating MAF across population, but then seeing difference by sample would highlight where in the model we might be representing a different process. Running the uniform gametocyte apportionment
If COI and MAF are looking like real data but heterozygosity doesn't, that really narrows it down.
Underestimating the polygenomic fraction that they got from barcode
Even with the same positions, the fraction is underestimated because the threshold for het calls is too conservative? Subject to low read depth?
Scenario with “constant” parasite population time
For eff pop size, you should get fixation rate at any point in time, something something calculus (Albert to chew on) :)
Continue literature search, look at Prin of PopGen (AL)
Metrics:
COI by person by month
MAF by population by year
MAF by site by year
Heterozygosity metrics:
Heterozygosity by infection event (e.g. for infections of COI >= 2, what proportion of sites are heterozygous)
Heterozygosity by site/sample* averaged over all samples in each year (0 for all clonal infections, 1 for all het polyclonal infections)
site/sample meaning the average frequency of heterozygous allele at a site within a sample
Known issue with sensitivity of genotype callers to threshold: if ⅘ calls are dominant with a threshold of 0.2 you will get a het call (minor fraction 0.2), whereas with just one more infection (⅚ dominant, minor fraction about 0.17) and the same threshold you would not pick it up
Potential fix: normalize by multiplying by gam density (taking into account the dominant infection from GenEpi in this calculation); this should move these closer to realism
Giving density by genotype in table output from GenEpi for use in Observation model (even just identifier of dominant infection)
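
The threshold sensitivity and the density-weighting idea above can be sketched in a few lines of Python. This is only an illustration, not GenEpi or observation-model code: the function names, the (allele, weight) representation of clones, and the inclusive ">= threshold" comparison are all assumptions made for the example. With a weight of 1 per clone it reproduces the ⅘ vs ⅚ case; with per-clone gametocyte densities as weights it shows how weighting could change the call.

```python
from fractions import Fraction

# Hypothetical representation: each clone in a sample is (allele, weight) at one
# biallelic site, allele in {0, 1}. weight = 1 per clone for the unweighted case,
# or an integer gametocyte density for the weighted case.

def minor_fraction(clones):
    """Return the within-sample minor-allele fraction at one site."""
    total = sum(weight for _, weight in clones)
    alt = sum(weight for allele, weight in clones if allele == 1)
    alt_fraction = Fraction(alt, total)
    return min(alt_fraction, 1 - alt_fraction)

def is_het_call(clones, threshold=Fraction(1, 5)):
    """Assumed convention: call the site heterozygous if minor fraction >= threshold."""
    return minor_fraction(clones) >= threshold

# Unweighted: 4 of 5 clones share the dominant allele, then 5 of 6.
five_clones = [(0, 1)] * 4 + [(1, 1)]
six_clones = [(0, 1)] * 5 + [(1, 1)]
print(is_het_call(five_clones))  # True  (minor fraction 1/5 meets the 0.2 threshold)
print(is_het_call(six_clones))   # False (minor fraction 1/6 falls below it)

# Density-weighted: same six clones, but the minor clone has higher density.
weighted_six = [(0, 100)] * 5 + [(1, 300)]
print(is_het_call(weighted_six))  # True (minor fraction 300/800 = 3/8)
```

Using exact fractions avoids floating-point edge cases right at the threshold, which is where the sensitivity shows up.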
Checklist updates:
Sjjd
Test
Is it easier to compile notes as issues, or as files in one of the repo directories?
Below we test how things look if we copy and paste a google doc into an issue comment: