gchen98 / macs

Automatically exported from code.google.com/p/macs

memory crash when modeling admixture #1

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
From:

Katie Cunningham
Gutengroup
University of Arizona

Dear Dr. Chen,

I am trying to simulate the following model for chromosome-sized lengths in 
MaCS:

    |    |    t = 0.7
   / /|  |
  | | | 1|
  |1| |  |
  | | |  \
  | | |   \   t = 0.07
  | | | |\ \
  | | |.| |.|
  | | |5| |5|
   \ \| | | |
    --, | | | f = <admixture>%, t = 0.0035
      | | | |
      | | | |
       B   Y

I have written what I believe are the corresponding core MaCS arguments:
-I 2 8 18 -n 1 0.5 -n 2 0.5 -es 0.0035 1 <admixture> -en 0.0035 3 1.0 -ej 0.07 2 1 -en 0.07 1 1.0 -ej 0.7 3 1 -en 0.7 1 1.0

However, MaCS crashes with a memory error for the following input parameters:
macs 26 <length> -i <iterations> -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 -n 2 0.5 -es 0.0035 1 <admixture> -en 0.0035 3 1.0 -ej 0.07 2 1 -en 0.07 1 1.0 -ej 0.7 3 1 -en 0.7 1 1.0

The crash appears to depend on the randomness in the simulation, and its 
likelihood varies with the parameter values: it is more likely for longer 
lengths, more iterations, and lower admixture proportions.

As an example, the crash happens very often for length = 1e7, admixture = 0.3, 
and iterations = 1.
macs 26 1e7 -i 1 -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 -n 2 0.5 -es 0.0035 1 0.3 -en 0.0035 3 1.0 -ej 0.07 2 1 -en 0.07 1 1.0 -ej 0.7 3 1 -en 0.7 1 1.0

An example of the most typical error message is attached, obtained on an 
Intel Core i7 (2.80 GHz, 8 GB RAM) running Ubuntu.

Do you know of any way around this error, to obtain simulations of the model 
above?

In addition, I'd like to bring to your attention the fact that msformatter 
prints the last token on the command line twice when it writes the first line 
of the output file.

Thank you,
Katie Cunningham
Gutengroup
University of Arizona

Here is a command line that crashes, with seed 1. Admixture proportion
is 0.1.

macs 26 1e7 -i 1 -s 1 -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 -n 2 0.5 -es 0.0035 1 0.1 -en 0.0035 3 1.0 -ej 0.07 2 1 -en 0.07 1 1.0 -ej 0.7 3 1 -en 0.7 1 1.0
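
For anyone trying to reproduce this over several seeds or admixture proportions, the argument list can be assembled in a small script. The following is only a sketch: the function name build_macs_args and its parameters are made up for illustration, and the flags and values are copied verbatim from the command lines in this report. It only builds the arguments; it does not run macs.

```python
def build_macs_args(admixture, length="1e7", iterations=1, seed=None, n1=8, n2=18):
    """Assemble the macs argument list from this report (hypothetical helper)."""
    # Total sample size is the sum of the two population samples given to -I.
    args = ["macs", str(n1 + n2), str(length), "-i", str(iterations)]
    if seed is not None:
        args += ["-s", str(seed)]
    args += [
        "-t", "0.0008", "-r", "0.0004", "-h", "1e5",
        "-I", "2", str(n1), str(n2),
        "-n", "1", "0.5", "-n", "2", "0.5",
        # Admixture event, followed by the size of the new population 3.
        "-es", "0.0035", "1", str(admixture),
        "-en", "0.0035", "3", "1.0",
        # Joins further back in time, as in the model diagram.
        "-ej", "0.07", "2", "1", "-en", "0.07", "1", "1.0",
        "-ej", "0.7", "3", "1", "-en", "0.7", "1", "1.0",
    ]
    return args


if __name__ == "__main__":
    # Prints the seed-1 command with admixture proportion 0.1.
    print(" ".join(build_macs_args(0.1, seed=1)))
```

The sample sizes are parameters so that the same model can be tried with larger samples without retyping the whole command.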

Original issue reported on code.google.com by gche...@gmail.com on 6 Jul 2012 at 6:53

GoogleCodeExporter commented 9 years ago
Katie,

After some tedious debugging, I think I know what is happening.  You are 
working with some pretty small sample sizes (8 and 18).  As you might know 
already, the ARG is initialized at the very left of the region based on the 
topology you drew.  The admixture event at 0.0035 with proportion 0.1 is 
simulated as a migration event, with perhaps as few as one chromosome 
migrating.  As the algorithm moves from left to right, pruning off edges, at 
some point that emigrant is killed off, leaving no one in "population 3".  
Hence, the program rather ungracefully aborts.

As an experiment, I increased your sample sizes tenfold (i.e. 260 in total, 
split as 80 and 180) and it completed just fine.  I don't know if this is a 
possibility for you.  Otherwise, I think the algorithm is just too vulnerable 
to such small sample sizes.
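
To get a rough feel for why small samples are fragile here, one can treat each sampled lineage in population 1 as being assigned independently to the new population at the -es event with some probability q. This is only a back-of-the-envelope sketch, not what MaCS does internally: the function names are made up, lineages that coalesce before t = 0.0035 are ignored, and whether q corresponds to the -es proportion or to its complement depends on the MaCS convention, which is not checked here.

```python
from math import comb


def prob_k_assigned(n, q, k):
    """Binomial probability that exactly k of n lineages land in the new population.

    Assumes each lineage is assigned independently with probability q
    (an illustrative assumption, not taken from the MaCS source).
    """
    return comb(n, k) * q**k * (1 - q) ** (n - k)


def prob_at_most(n, q, k_max):
    """Probability that at most k_max of n lineages land in the new population."""
    return sum(prob_k_assigned(n, q, k) for k in range(k_max + 1))


if __name__ == "__main__":
    q = 0.1  # assumed per-lineage probability of entering the new population
    for n in (8, 80):  # original and tenfold sample size for population 1
        print(f"n={n}: P(none)={prob_k_assigned(n, q, 0):.4f}, "
              f"P(at most one)={prob_at_most(n, q, 1):.4f}")
```

Under these assumptions, with n = 8 and q = 0.1 the chance that no lineage enters the new population is about 0.43 and the chance of exactly one is about 0.38, while with n = 80 the chance of at most one falls below 1%.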

Gary 

Original comment by gche...@gmail.com on 6 Jul 2012 at 7:00