tskit-dev / msprime

Simulate genealogical trees and genomic sequence data using population genetic models
GNU General Public License v3.0
173 stars 85 forks source link

SMC prime model no longer working with record_full_arg option in simulate #795

Closed ismael-mendoza closed 5 years ago

ismael-mendoza commented 5 years ago

Hello,

For an application I have been simulating tree sequences like:

ts = msprime.simulate(sample_size=3, model='smc_prime', Ne = 2, recombination_rate=1.0, random_seed=42, record_full_arg=True)

I recently updated to the most recent version of msprime (0.7.2) and now get a Library error when running the above command:

LibraryError: Current simulation configuration is not supported.

My application involved finding the hidden transitions (a transition that results in the current tree not changing) and pruning location of the trees in the smc prime model (this was possible before using the additional information when using the record_full_arg ). So with this new update I have some questions:

(1) Why was the record_full_arg option made incompatible with the smc_prime model? (2) Is there an easier/better way for me to get the hidden transition and pruning information in the smc_prime model now ?

Thanks in advance !

jeromekelleher commented 5 years ago

(1) Why was the record_full_arg option made incompatible with the smc_prime model?

I realised that there were some tricky corner cases with the SMC models in the full-ARG case, so the output of pre 0.7.2 might not have been exactly correct. It seemed easiest to just rule it out, since I didn't think anyone would be using this particular combination of features. Looks like I was wrong! I don't think it would be super hard to get this functionality in, but honestly I don't know when I could do it as I'm pretty snowed under at the moment.

(2) Is there an easier/better way for me to get the hidden transition and pruning information in the smc_prime model now ?

Possibly - can you explain a bit more what you're doing and what the desired output is?

ismael-mendoza commented 5 years ago

Looks like I was wrong! I don't think it would be super hard to get this functionality in, but honestly I don't know when I could do it as I'm pretty snowed under at the moment.

Thanks @jeromekelleher, it would be nice to have this functionality whenever you can work on it.

Possibly - can you explain a bit more what you're doing and what the desired output is?

Yes, so one of the desired outputs is to obtain the hidden transitions in a SMC' simulation, which would appear as identical adjacent trees. I basically just need to know, for each tree outputted by msprime, the number of identical adjacent trees that got squashed together. This is what I was trying to do here in this previous issue: https://github.com/tskit-dev/msprime/issues/691 and which was resolved when you added the record_full_arg option.

My second desired output was also featured in another previous issue https://github.com/tskit-dev/msprime/issues/709. I call it the pruning location, which is the location in the tree where a branch is pruned and then reconnected somewhere else in the tree to form a new transition. But in terms of the ARG, as @JereKoskela has pointed out, it would be the location where a branch splits due to a recombination event. Please see the example in the issue above and his related comment:

If we measure time from the leaves towards the root, the first recombination involves branch 1 splitting into branches 4 and 5.

Please let me know if that makes sense and thanks in advance!

jeromekelleher commented 5 years ago

I've just checked, and in fact there's no problem with SMC for the full ARG --- I was overgeneralising from other models. Sorry about this @ismael2395, and thanks very much for the bug report!

I've fixed this over in #796. I'll push out a quick bugfix release once it's merged, as it might be some time before the next feature release is pushed out.

jeromekelleher commented 5 years ago

0.7.3 has just been released @ismael2395 - you should be able to upgrade and use your code as before.

ismael-mendoza commented 5 years ago

Thanks @jeromekelleher this is great :) Just making sure, the problem should now be fixed for both the smc and smc_prime model right? Thanks!

jeromekelleher commented 5 years ago

Yep, smc' will work fine as well.