MaCS error with population splitting/joining

gchen98 / macs

Automatically exported from code.google.com/p/macs

16 stars 6 forks source link

MaCS error with population splitting/joining #31

Open mpcox opened 9 years ago

mpcox commented 9 years ago

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

gchen98 commented 9 years ago

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

abwolf commented 9 years ago

I've been having a related issue.

Here is an example command I am trying to run: macs ... -I 3 N1 N2 N3 0 -ej 0.25 2 1 -es 0.5 1 .25 -ej 0.75 3 1 -ej 0.9 4 1

I get a warning message: "INPUT: At time 0.9: WARNING: The pop IDs used in pop join is greater than the number specified in -I. You must have a split event before this join event. Population 4 will merge with 1"

Is it safe it ignore this warning, given I have specified an earlier split? I still seem to be getting some trees.

gchen98 commented 9 years ago

You can ignore those warnings.

On 07/01/2015 06:56 PM, abwolf wrote:

I've been having a related issue.

Here is an example command I am trying to run: macs ... -I 3 N1 N2 N3 0 -ej 0.25 2 1 -es 0.5 1 .25 -ej 0.75 3 1 -ej 0.9 4 1

I get 2 warning messages: 1) "INPUT: At time 0.9: WARNING: The pop IDs used in pop join is greater than the number specified in -I. You must have a split event before this join event. Population 4 will merge with 1"

Is it safe it ignore this warning, given I have specified an earlier split? I still seem to be getting some trees.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117874729.

mpcox commented 9 years ago

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

gchen98 commented 9 years ago

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117936190.

mpcox commented 9 years ago

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 notifications@github.com wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117936190.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579.

mpcox commented 9 years ago

Hi Gary,

Here is a real example.

We want to study the admixture history of Indonesia. We first have a split between Asians and Melanesians 50,000 years ago, then these populations merge 5,000 years ago. Following this merger event, 8 communities form on a particular island of interest.

There are two possible scenarios here. First, that there was a single admixture event, followed by the splitting of the 8 communities. We can model this in MaCS without any errors. However, the second scenario is that the 8 communities had independent origins -- in other words, there were 8 independent admixture events. (In practice, 8 split event, followed by 16 (i.e., 8 x 2) merger events back to the Asian and Melanesian parental populations). These simulations always fail.

With your new code, we can get the test example in my earlier email running. (Or rather, it doesn't run with a sample size of 60 per population, but if you raise this to 1200, then it does work).

However, with our 8 community model above, the simulations always fail. We can't raise the migration rate (because it is actually low and this is the parameter we're testing). And the simulations still fail when we model 6000 individuals per population, which is larger than the populations actually are in real life.

Obviously for 8 communities, the command lines are fairly complex, so I've written a wrapper script to generate the MaCS command line from more easily understood values like split times in generations (attached). The only parameter that can really be changed is -s, which in this example says that each population is of size 60. An example would be:

sumba_admix.pl -s 60 -l 1e8 -mu 1e-9 -r 1e-8 -nAn 10500 -nAs 2050 -nMe 800 -nAd 1435 -pAd 0.74 -pv 0.05 -tAd 163 -tAn 2000 -m 0.01 -mv 0 -a SNP_Ascertainment_16Sep13.txt -o test.out -p

which creates the MaCS command line:

macs 480 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 8 60 60 60 60 60 60 60 60 0 -n 1 0.0170833333333333 -n 2 0.0170833333333333 -n 3 0.0170833333333333 -n 4 0.0170833333333333 -n 5 0.0170833333333333 -n 6 0.0170833333333333 -n 7 0.0170833333333333 -n 8 0.0170833333333333 -m 1 2 420 -m 1 3 420 -m 1 4 420 -m 1 5 420 -m 1 6 420 -m 1 7 420 -m 1 8 420 -m 2 1 420 -m 2 3 420 -m 2 4 420 -m 2 5 420 -m 2 6 420 -m 2 7 420 -m 2 8 420 -m 3 1 420 -m 3 2 420 -m 3 4 420 -m 3 5 420 -m 3 6 420 -m 3 7 420 -m 3 8 420 -m 4 1 420 -m 4 2 420 -m 4 3 420 -m 4 5 420 -m 4 6 420 -m 4 7 420 -m 4 8 420 -m 5 1 420 -m 5 2 420 -m 5 3 420 -m 5 4 420 -m 5 6 420 -m 5 7 420 -m 5 8 420 -m 6 1 420 -m 6 2 420 -m 6 3 420 -m 6 4 420 -m 6 5 420 -m 6 7 420 -m 6 8 420 -m 7 1 420 -m 7 2 420 -m 7 3 420 -m 7 4 420 -m 7 5 420 -m 7 6 420 -m 7 8 420 -m 8 1 420 -m 8 2 420 -m 8 3 420 -m 8 4 420 -m 8 5 420 -m 8 6 420 -m 8 7 420 -es 0.00388095238095238 1 0.739455638305862 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 9 0.0761904761904762 -es 0.00388125238095238 2 0.690161967562172 -ej 0.00388135238095238 2 1 -ej 0.00388145238095238 10 9 -es 0.00388155238095238 3 0.774954197383373 -ej 0.00388165238095238 3 1 -ej 0.00388175238095238 11 9 -es 0.00388185238095238 4 0.718771746879169 -ej 0.00388195238095238 4 1 -ej 0.00388205238095238 12 9 -es 0.00388215238095238 5 0.712322541242298 -ej 0.00388225238095238 5 1 -ej 0.00388235238095238 13 9 -es 0.00388245238095238 6 0.769246033824616 -ej 0.00388255238095238 6 1 -ej 0.00388265238095238 14 9 -es 0.00388275238095238 7 0.765559314239749 -ej 0.00388285238095238 7 1 -ej 0.00388295238095238 15 9 -es 0.00388305238095238 8 0.750101814469615 -ej 0.00388315238095238 8 1 -ej 0.00388325238095238 16 9 -eM 0.00388335238095238 0 -ej 0.0476190476190476 9 1 -en 0.0476191476190476 9 0 -en 0.0476192476190476 1 1 2>trees.txt | msformatter > test.out

I guess I'm wondering whether it is actually realistic to model a scenario this complex with MaCS. In many respects, it's not that complex, and it's certainly possible under the standard coalescent. However, perhaps the sequential Markovian coalescent just isn't robust enough? Or alternately, do you think there is something that could be changed in MaCS to make it more robust to this problem of 'running out of chromosomes'?

-Murray

On 3/07/2015, at 4:33 am, gchen98 notifications@github.com wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117936190.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579.

gchen98 commented 9 years ago

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 notifications@github.com wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117936190.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864.

mpcox commented 9 years ago

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen notifications@github.com wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 notifications@github.com wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 notifications@github.com wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117832965.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-117936190.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

mpcox commented 9 years ago

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events
MaCS also works fine with one split event
However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub <https://github.com/gchen98/macs/issues/31 https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub <https://github.com/gchen98/macs/issues/31#issuecomment-117832965 https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub <https://github.com/gchen98/macs/issues/31#issuecomment-117936190 https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub <https://github.com/gchen98/macs/issues/31#issuecomment-118087579 https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub <https://github.com/gchen98/macs/issues/31#issuecomment-118198864 https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

gchen98 commented 9 years ago

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117832965 <https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117936190 <https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579 <https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864 <https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

mpcox commented 9 years ago

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117832965 <https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117936190 <https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579 <https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864 <https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

gchen98 commented 9 years ago

Murray I've made a bunch of changes and pushed them onto Git. I ensured that pop split events happened at the right time (they were happening a little out of sequence). It looks like your latter 8 pop system with 60 haps per pop seems to work on my machine.

Some tips:

Right after the birth of a new population (i.e. -es event), you might want to put in a -eM or related migration change event at some epsilon (to maintain ordering) after the -es event to make sure the population members move about. For example if -es is at time 0.0001 you can make -eM time 0.0001000001. By default the entry in the migration matrix for the new pop is of the value of the last entry of the -I flag which may or may not be your intent.

Also, adding the -d debugging flag can help you see what's going in case of crashes.

Gary

On 07/04/2015 08:03 PM, Murray Cox wrote:

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117832965 <https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117936190 <https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579 <https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864 <https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118573147.

mpcox commented 9 years ago

Hi Gary,

This is fantastic. Thanks again for looking into this! I really appreciate your help on this.

I've done some very quick tests this afternoon, and can confirm that the new code does seem to fix the problem. I'm hoping to run a much more extensive sweep of tests tomorrow. I'll let you know what I find.

Also, thanks for the tip on migration. In this case, the default behavior happens to be what I'm after, but it's much better practice to set the migration rate explicitly. (I was lucky as I actually didn't realize what it was doing). I'll change my code to be more explicit on this point now.

More shortly...

-Murray

On 6/07/2015, at 11:27 am, Gary K. Chen notifications@github.com wrote:

Murray I've made a bunch of changes and pushed them onto Git. I ensured that pop split events happened at the right time (they were happening a little out of sequence). It looks like your latter 8 pop system with 60 haps per pop seems to work on my machine.

Some tips:

Right after the birth of a new population (i.e. -es event), you might want to put in a -eM or related migration change event at some epsilon (to maintain ordering) after the -es event to make sure the population members move about. For example if -es is at time 0.0001 you can make -eM time 0.0001000001. By default the entry in the migration matrix for the new pop is of the value of the last entry of the -I flag which may or may not be your intent.

Also, adding the -d debugging flag can help you see what's going in case of crashes.

Gary

On 07/04/2015 08:03 PM, Murray Cox wrote:

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117832965 <https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117936190 <https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579 <https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864 <https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118573147.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118676470.

mpcox commented 9 years ago

Hi Gary,

Just a quick follow up to say that far more extensive testing today seems to show that you've fixed the problem.

I'm now running some long parameter sweeps, which will provide another check, but all the results I've seen so far suggest that MaCS is now working as expected.

Thanks again for jumping on to this so quickly. I really appreciate your help! And as I mentioned earlier, I expect that my group will be using MaCS for some time to come, so we're perfectly happy to act as testers as we're running our analyses.

Thanks again!

-Murray

On 6/07/2015, at 11:27 am, Gary K. Chen notifications@github.com wrote:

Murray I've made a bunch of changes and pushed them onto Git. I ensured that pop split events happened at the right time (they were happening a little out of sequence). It looks like your latter 8 pop system with 60 haps per pop seems to work on my machine.

Some tips:

Right after the birth of a new population (i.e. -es event), you might want to put in a -eM or related migration change event at some epsilon (to maintain ordering) after the -es event to make sure the population members move about. For example if -es is at time 0.0001 you can make -eM time 0.0001000001. By default the entry in the migration matrix for the new pop is of the value of the last entry of the -I flag which may or may not be your intent.

Also, adding the -d debugging flag can help you see what's going in case of crashes.

Gary

On 07/04/2015 08:03 PM, Murray Cox wrote:

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117832965 <https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

https://github.com/gchen98/macs/issues/31#issuecomment-117936190 <https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118087579 <https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118198864 <https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118573147.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118676470.

gchen98 commented 9 years ago

No problem. MaCS is such a scary beast of complexity. It was truly a labor of love from my postdoc days, before grants and children could distract me. I am cautious about making changes these days. Hopefully, these patches didn't open up any new bugs.

Thanks for spotting this.

On 07/06/2015 07:35 PM, Murray Cox wrote:

Hi Gary,

Just a quick follow up to say that far more extensive testing today seems to show that you've fixed the problem.

I'm now running some long parameter sweeps, which will provide another check, but all the results I've seen so far suggest that MaCS is now working as expected.

Thanks again for jumping on to this so quickly. I really appreciate your help! And as I mentioned earlier, I expect that my group will be using MaCS for some time to come, so we're perfectly happy to act as testers as we're running our analyses.

Thanks again!

-Murray

On 6/07/2015, at 11:27 am, Gary K. Chen notifications@github.com wrote:

Murray I've made a bunch of changes and pushed them onto Git. I ensured that pop split events happened at the right time (they were happening a little out of sequence). It looks like your latter 8 pop system with 60 haps per pop seems to work on my machine.

Some tips:

Right after the birth of a new population (i.e. -es event), you might want to put in a -eM or related migration change event at some epsilon (to maintain ordering) after the -es event to make sure the population members move about. For example if -es is at time 0.0001 you can make -eM time 0.0001000001. By default the entry in the migration matrix for the new pop is of the value of the last entry of the -I flag which may or may not be your intent.

Also, adding the -d debugging flag can help you see what's going in case of crashes.

Gary

On 07/04/2015 08:03 PM, Murray Cox wrote:

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-117832965

https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-117936190

https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-118087579

https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-118198864

https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118573147.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118676470.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-119051954.

mpcox commented 9 years ago

Ah, grants and children. I totally understand!

The complexity of MaCS is the main reason I contacted you, rather than delving into the code myself. I doubt I would have found the source of this problem. MaCS really is a lovely program though. Inference from whole genome data seems a little slow to take off, but whole genome simulators are going to become ever more important going forward. MaCS is much more flexible than the other ones I've tried and it has become a bit of a workhorse in my group.

Thanks again for your help with everything. Really appreciate it!

-Murray

On 7/07/2015, at 3:13 pm, Gary K. Chen notifications@github.com wrote:

No problem. MaCS is such a scary beast of complexity. It was truly a labor of love from my postdoc days, before grants and children could distract me. I am cautious about making changes these days. Hopefully, these patches didn't open up any new bugs.

Thanks for spotting this.

On 07/06/2015 07:35 PM, Murray Cox wrote:

Hi Gary,

Just a quick follow up to say that far more extensive testing today seems to show that you've fixed the problem.

I'm now running some long parameter sweeps, which will provide another check, but all the results I've seen so far suggest that MaCS is now working as expected.

Thanks again for jumping on to this so quickly. I really appreciate your help! And as I mentioned earlier, I expect that my group will be using MaCS for some time to come, so we're perfectly happy to act as testers as we're running our analyses.

Thanks again!

-Murray

On 6/07/2015, at 11:27 am, Gary K. Chen notifications@github.com wrote:

Murray I've made a bunch of changes and pushed them onto Git. I ensured that pop split events happened at the right time (they were happening a little out of sequence). It looks like your latter 8 pop system with 60 haps per pop seems to work on my machine.

Some tips:

Right after the birth of a new population (i.e. -es event), you might want to put in a -eM or related migration change event at some epsilon (to maintain ordering) after the -es event to make sure the population members move about. For example if -es is at time 0.0001 you can make -eM time 0.0001000001. By default the entry in the migration matrix for the new pop is of the value of the last entry of the -I flag which may or may not be your intent.

Also, adding the -d debugging flag can help you see what's going in case of crashes.

Gary

On 07/04/2015 08:03 PM, Murray Cox wrote:

Hi Gary,

Hey, that's fantastic!

I now have a large set of test cases that I can run any modified code on. Just let me know how I can help.

-Murray

On 5/07/2015, at 2:07 pm, Gary K. Chen notifications@github.com wrote:

I think I've isolated the bug and will work on it when I get some more time, stay tuned.

On 07/04/2015 02:15 PM, Murray Cox wrote:

Hi Gary,

A further thought.

If I had to generalize the results of all my testing, I think I would say this:

MaCS works fine with any number of join events

MaCS also works fine with one split event

However, MaCS throws the 'edge' error whenever I have 2 or more split events

This latter point can be fixed with high s and m, but only for the very simplest models.

I don't know if this is helpful, but perhaps it might give you some thoughts on where to start looking for the underlying issue?

-Murray

Hi Gary,

Yes, that's right. Even the combination of raising sample size and migration fails.

I can get the original test example (in the email chain below) to work if I raise the sample size.

However, with the 8 population model, MaCS still dies with the 'out of edges' error, even if I raise the sample size in magnitudes (60 -> 600 -> 6000 per population). The same holds true if I raise the migration rate in magnitudes (420 -> 4200). It also still fails if these values are both raised in the same simulation (e.g., 6000 with 4200). To put this in perspective, 4200 equates to ~10% of people moving per generation, so Nm >> 1. At these levels, the 8 populations should actually be acting as one big panmictic population.

For fun, I reduced the real 8 population problem down to just 2 populations. (In other settings, pairwise models have been very powerful in population genetics). Since I could get the test example working (and it contains just two populations), I hoped I could get this real 2 population model working. The wrapper script to make the MaCS command lines is attached.

However, this still fails with the same edge error. In the end, I was trying a sample size of 20,000 per population (far bigger than the actual size) and a migration rate of 4200 (10% of people moving per generation, again far bigger than reality). This gave the edges error too:

./macs 40000 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -F SNP_Ascertainment_16Sep13.txt 0 -I 2 20000 20000 0 -n 1 0.0171428571428571 -n 2 0.0171428571428571 -m 1 2 4200 -m 2 1 4200 -es 0.00388095238095238 1 0.706506157269033 -en 0.00388105238095238 1 0.195238095238095 -en 0.00388115238095238 3 0.0761904761904762 -es 0.00388125238095238 2 0.73709411917515 -ej 0.00388145238095238 2 1 -ej 0.00388155238095238 4 3 -eM 0.00388185238095238 0 -ej 0.0476190476190476 3 1 -en 0.0476192476190476 1 1

I stripped out all the extraneous commands (like the -F flag) and played around with the -h flag, just in case this mattered. However, there was no difference in the outcome.

I hope this is helpful for figuring out what might be going on. Although I have a reasonable theoretical understanding of the SMC, I don't know the details of the implementation (even after looking at your code). As another potential source of error, the command lines get complex fast, but I've checked them over by hand several times and they do seem to be right. If you notice anything wrong though, definitely let me know...!

-Murray

On 5/07/2015, at 6:27 am, Gary K. Chen <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

So what you're saying is that with the original model you sent me, you can increase the sample size and things work, but with the alternative population model you can't even increase the sample size? Have you tried just for debugging, increasing both sample size and migration rate?

Gary

On 07/02/2015 05:12 PM, Murray Cox wrote:

Hi Gary,

Congratulations on the new job! I didn't realize you were leaving USC.

Definitely happy to help with this. I've had a number of students move on recently, so as fair warning, the experience level of my group has dipped quite a bit. However, we are actively using MaCS, which is a lovely piece of software. We can definitely keep identifying (what we think are) bugs, and if possible, I'll always try to point out where I think they may be creeping in.

One general question: can you possibly explain (or point me towards an explanation of) the 'running out of chromosomes' issue? We've been doing a lot of simulations lately that study small populations (i.e., Ne of perhaps hundreds to low thousands) typically with split/merge histories. We're increasingly finding that our models in this space fail, but we can't really increase migration or the number of sampled chromosomes without deviating from the actual biology of the system. Is this potentially a solvable problem, or do you think that some models just can't be simulated within a sequential Markovian coalescent framework?

I'll send you a follow up email with a real example to clarify...

-Murray

On 3/07/2015, at 4:33 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

Just for your info, yesterday was a fortunate day for you in terms of fixing the bug. It was the first day of my new job in industry, so I'm still getting acclimated and had a little time on my hands. My time will be more limited than before moving forward. As this is open source, it would be great if you or someone on your team can help share the burden and spot bugs. This could expedite turnaround.

Gary

On 7/2/15 12:01 AM, Murray Cox wrote:

Hi Gary,

Thanks for getting on to this so quickly!

I'm at a two-day meeting and so haven't been able to test your code fix, but I'll put this right up my priority list. I'm really keen to get this working!

More feedback shortly...

-Murray

On 2/07/2015, at 9:43 am, gchen98 <notifications@github.com mailto:notifications@github.com> wrote:

Hi Murray,

I've pushed a patch onto Git. Do you want to do another Git pull on the latest source code? This should fix the bug with a bad memory allocation on the migration matrices.

Let me know how this works out. Also, I notice that sometimes with different simulation seeds it works. This is because you have small number of sampled chromosomes, and we run out of chromosomes to pick as we go up the ARG. If you don't want to increase the migration rates, you can maintain your demographic model by scaling up the number in -I and the total haplotypes.

Gary

On 6/30/15 7:30 PM, Murray Cox wrote:

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes. Graphbuilder destructor Simulator caught exception with message: Data structure integrity error. Simulator destructor: Configuration destructor Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31 <https://github.com/gchen98/macs/issues/31>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-117832965

https://github.com/gchen98/macs/issues/31#issuecomment-117832965>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-117936190

https://github.com/gchen98/macs/issues/31#issuecomment-117936190>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-118087579

https://github.com/gchen98/macs/issues/31#issuecomment-118087579>.

— Reply to this email directly or view it on GitHub

<https://github.com/gchen98/macs/issues/31#issuecomment-118198864

https://github.com/gchen98/macs/issues/31#issuecomment-118198864>.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118540211.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118555983.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118570350.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118573147.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-118676470.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-119051954.

— Reply to this email directly or view it on GitHub https://github.com/gchen98/macs/issues/31#issuecomment-119057857.