tskit-dev / msprime

Simulate genealogical trees and genomic sequence data using population genetic models
GNU General Public License v3.0

Variable recombination maps e.g. per population #2017

Closed hyanwong closed 2 years ago

hyanwong commented 2 years ago

At the end of the chat with @davidrasm yesterday he brought up an interesting suggestion which was to be able to change the recombination map as well as the demography during a simulation.

It should be possible to hack this if we want to change the recombination map over time: stop a simulation, change the map, then restart from the roots of the previous simulation. But I think it would require msprime changes to allow different populations to have different maps, which was one idea that @davidrasm put forward.

Even more difficult (and maybe logically meaningless for a backwards simulator?), but also nice, would be to change the map on a lineage-specific basis. I don't really know how to do this, or even quite what it means.

Pinging @benjeffery as he was in on the conversation too

hyanwong commented 2 years ago

(by the way, it may be simplest for @davidrasm to do this in SLiM for the time being, although I don't know if SLiM has a "record_full_arg" type option, which he was using)

grahamgower commented 2 years ago

Conceptually, it's not hard to imagine having two distinct lineages with two distinct recombination maps, and also permitting those maps to change over time (using the approach you suggest). But what happens when those lineages exchange migrants?

hyanwong commented 2 years ago

If it's in populations, then the migrant takes the map of the population it is in. If it is for a "lineage", then I assume we mean a lineage at a particular point in the genome, or something? I haven't thought it through properly, TBH.

grahamgower commented 2 years ago

Yeah, operationally the simplest thing to do is to apply the recipient lineage's map when/after migration occurs. But does this make sense biologically? And are there (tractable) alternatives that make more sense biologically? Or perhaps more importantly, is this simplest behaviour useful to users?

hyanwong commented 2 years ago

You'll have to ask @davidrasm, but he was talking about cases where the recombination map had some sort of geographical correlation. That might be caused by environmental factors (in which case the population approach would be fine), or it might be due to inherited genomic features, such as Prdm9 changes, in which case it's a lineage-specific thing, and maybe not really something that you would expect to do in a neutral simulator (I think? That's what I was alluding to above).

FWIW, his study organisms were plant pathogens (e.g. fungi).

jeromekelleher commented 2 years ago

I think there's an existing issue discussing this; it would be good to link the two together if we can.

hyanwong commented 2 years ago

Ah, I didn't realise, sorry. I think this is https://github.com/tskit-dev/msprime/issues/1095. Is it worth closing this issue and moving the conversation there?

jeromekelleher commented 2 years ago

Best to close one or the other, I don't mind which.

hyanwong commented 2 years ago

Closing this one.