JoRussell-IDM / magude

magude cotransmission scenarios

Presentation to outside groups #18

Open JoRussell-IDM opened 3 years ago

JoRussell-IDM commented 3 years ago

Three pipelines GenEpi+DTK

  1. Layered modular (CoTransmission): Scenario and scientific question: what is the isolated contribution of stochastic genotype seed replicates to the variance in a genetic feature? This allows for some sensible discussion of how broad changes in initial conditions affect results. True IBD and pairwise IBS, with some discussion of how a certain number of sites under certain conditions of diversity fare poorly in tools like hmmIBD.
  2. FPG alone: True pairwise IBD. What to show before sharing externally? Mapping sims onto reference epi data from Garki, but do these map onto realistic representations/relationships? Is pairwise IBD a function of EIR?

    • 3 dfs (indexed numpy arrays): (1) timestamp, individual id, infection id; (2) genotype values; (3) allele roots
  3. Hybrid: Separate evaluation of neutral sites on a tree of breakpoints. Drug selection sweep for exploring the independence of unlinked neutral sites from drug resistance markers at small effective population sizes in the simulation.
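The difference between true IBD and pairwise IBS can be sketched on the genotype-value and allele-root arrays mentioned above. This is a minimal illustration, not the FPG or GenEpi output format: the array layout (one row per sampled genome, one column per site) and the toy values are assumptions made for the example. Two genomes are IBD at a site when they carry the same allele root, and IBS when they carry the same genotype value.

```python
import numpy as np

def pairwise_fraction_shared(matrix):
    """Fraction of sites at which each pair of rows carries the same value."""
    matrix = np.asarray(matrix)
    # Broadcast rows against each other: result has shape (n, n, n_sites).
    same = matrix[:, None, :] == matrix[None, :, :]
    return same.mean(axis=2)

# Hypothetical toy data: 3 sampled genomes, 4 sites.
# allele_roots: which founder genome each site descends from.
allele_roots = np.array([
    [0, 0, 1, 1],
    [0, 0, 2, 2],
    [3, 3, 3, 3],
])
# genotypes: observed allele at each site (biallelic SNPs here).
genotypes = np.array([
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
])

true_ibd = pairwise_fraction_shared(allele_roots)  # shared ancestry
ibs = pairwise_fraction_shared(genotypes)          # shared state
```

With low diversity among founders, unrelated genomes still match by state at many sites, so IBS overstates relatedness relative to true IBD. That is the regime where a small number of sites is least informative.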

Questions of interest to collabs:

  1. Working towards combining genotype and phenotype with relevance to modeling emergence and spread of drug resistance (of interest to Malaria PST, Senegal at large group)
  2. Highest-level distinction, tree representation versus amplicon representation: IBD in polyclonal infections. Is there a threshold in the value of information contributed by 2 or 2+ infections within polyclonal infections? Senegal is more interested in classification of 2 or 2+; UCSF is more interested in the statistical metric for collapsing complex infections.
JoRussell-IDM commented 3 years ago

Objective: To give an overview of the model structures employed in different GenEpi approaches and the scientific questions each is best suited to answer.

Albert Lee, Jessica Ribado

Glossary:

EMOD: an agent based model of malaria transmission

GenEpi: A model built from tskit to run technical replicates of genetic relatedness under configurable initial diversity on top of externally provided transmission trees with additional functionality to specify a sampling/observational model of the simulated genotypes.

FPG (Full Parasite Genetics): A model built as an extension of EMOD to be able to track parasite genotypes and their recombination within a consistent epidemiological model of malaria transmission with interventions and spatial resolution.

Model approaches.

  1. EMOD+GenEpi Cotransmission model: A layered model that generates EMOD Transmission Reports (human-vector-human connectivity of infection objects over time) which are subsequently used in GenEpi to simulate vector side recombination and strain level population genetic behavior using the transmission record to include the structure provided by co-transmission events and thus constrain human host connectivity, relatedness, and diversity over time.

Layers:

    • stochastic seeds of EMOD (technical replicates of transmission network)
    • stochastic replicates of recombination and oocyst numbers in the vector
    • root initialization of neutral markers (barcode SNPs)
    • observation model (samples per human, sites per parasite genome)

Question for Albert: does GenEpi require subsampling of simulated genotypes only for post-processing of genetic features?

Uses: Most flexible model for testing the independent contribution of modeled processes to variance in genetic signals, e.g. how sensitive is the genetic feature pairwise IBD to stochastic replicates of the crossovers drawn as part of the recombination model?
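One way to quantify the independent contribution of two layers is a nested variance decomposition. The sketch below assumes a balanced design (the same number of recombination replicates per EMOD seed) and uses synthetic numbers in place of real model output; `decompose_variance` and the effect sizes are hypothetical, chosen only to illustrate the law of total variance.

```python
import numpy as np

def decompose_variance(values):
    """Split total variance into between-seed and within-seed parts.

    values: array of shape (n_epi_seeds, n_recomb_replicates) holding one
    genetic summary (e.g. mean pairwise IBD) per simulation.
    """
    values = np.asarray(values, dtype=float)
    between = values.mean(axis=1).var()  # variance of per-seed means
    within = values.var(axis=1).mean()   # mean of per-seed variances
    return between, within

# Synthetic stand-in for model output: 10 EMOD seeds x 50 recombination reps.
rng = np.random.default_rng(0)
seed_effect = rng.normal(0.0, 0.05, size=(10, 1))    # transmission-network layer
recomb_noise = rng.normal(0.0, 0.02, size=(10, 50))  # recombination layer
mean_ibd = 0.3 + seed_effect + recomb_noise

between, within = decompose_variance(mean_ibd)
print(f"between-seed share of variance: {between / (between + within):.2f}")
```

For a balanced design the two parts sum exactly to the total (population) variance, so the ratio reads directly as the share of variance attributable to the transmission-network layer.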

Drawbacks: No relevant representation/connectivity of genotype to phenotype (i.e. immunity, drug resistance, or HRP2/3 deletions), no gametocyte density dependence for onward transmission, no within-model responsiveness to genetic signals (interventions in response to surveillance).

  2. FPG alone model: Everything's baked in (making genetic coinflips alongside epi coinflips). Serialization can help in determining how randomness, immunity, and specific demography and intervention histories would behave in calibrated sites. This depends on the representation of phenotypic markers on our simulated genomes. For var_gene_randomness_types:

ALL_RANDOM: in this case, if the recombination-specific PRNG is orthogonal to the remaining PRNG for DTK-sim (default is whole sim; can be configured for each node, which makes it n_cores-invariant), then in the absence of drug resistance or any phenotypic tracking this model type (if faster than GenEpi) could be a useful basis for testing the independent contribution of transmission replicates, genetic recombination replicates, and the observation model.

FIXED_MSP or FIXED_NEIGHBORHOOD: genetic decisions inherently impact phenotypic action in the modeled infections and cannot be isolated from epidemiological outcomes. However, if computationally efficient, we can run many stochastic replicates of this joint process.
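The idea of an orthogonal recombination PRNG can be illustrated with two independent NumPy streams. This is a toy sketch, not how EMOD/DTK configures its generators: `simulate`, the Poisson bite counts, and the crossover positions are hypothetical stand-ins for the epi and genetic coinflips.

```python
import numpy as np

def simulate(epi_seed, recomb_seed, n_events=20):
    """Draw epi and recombination randomness from separate streams."""
    # Distinct spawn keys keep the streams independent even if the two
    # integer seeds happen to be equal.
    epi_rng = np.random.default_rng(
        np.random.SeedSequence(epi_seed, spawn_key=(0,)))
    recomb_rng = np.random.default_rng(
        np.random.SeedSequence(recomb_seed, spawn_key=(1,)))

    bites = epi_rng.poisson(2.0, size=n_events)                # epi coinflips
    crossovers = recomb_rng.integers(0, 24, size=n_events)     # genetic coinflips
    return bites, crossovers

# Hold the transmission replicate fixed and vary only recombination.
bites_a, cross_a = simulate(epi_seed=1, recomb_seed=10)
bites_b, cross_b = simulate(epi_seed=1, recomb_seed=11)
```

Here `bites_a` and `bites_b` are identical while the crossover draws differ, which is the property needed to replicate recombination on top of a fixed transmission history. Under FIXED_MSP or FIXED_NEIGHBORHOOD this separation would not hold, because genetic draws feed back into the epidemiology.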

Uses: Intervention impact and

Drawbacks: N

  3. FPG hybrid: passing of breakpoints to GenEpi. Provides independence for running efficient replicates, for testing whether there is anything about the neutral marker signals that is independent of signals we might see in drug resistance.

Uses: (Selective sweep example)

Drawbacks: Ensure that we have consistency in the mathematical representation of crossover and hepatocyte modeled processes to be able to interpret the output of interacting model layers.

JoRussell-IDM commented 3 years ago

Done. See presentations here: C:\Users\jorussell\Dropbox (IDM)\Malaria Team Folder\projects\parasite_genetics\Presentations

https://www.dropbox.com/s/z0ql7ayt86bqqyp/Senegal_3_11_2021_DeepDive_v1.pptx?dl=0

https://www.dropbox.com/s/l1sja03sm06grnw/Senegal_2_25_2021_vFINAL.pptx?dl=0

https://www.dropbox.com/s/410ofwadrw7mdzj/Spapflagen_2_18_2021.pptx?dl=0