gchen98 / macs

Automatically exported from code.google.com/p/macs
16 stars 6 forks source link

is there anything wrong with -ej/es options? #12

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
./macs 2 100000 -t 0.001 -r 0.001 -es 0.01 1 0.05 -ej 0.1 2 1

What is the expected output? What do you see instead?
COMMAND:    ./macs 2 100000 -t 0.001 -r 0.001 -es 0.01 1 0.05 -ej 0.1 2 1
INPUT: Sample size is now 2
INPUT: Seq length is now 100000
INPUT: Scaled mutation rate is now 100
INPUT: Scaled recombination rate is now 100
INPUT: At time 0.01: Population 1 splits at proportion 0.05
INPUT: At time 0.1: WARNING: The pop IDs used in pop join is greater than the 
number specified in -I.  You must have a split event before this join event.
Population 2 will merge with 1
SEED:    1361513204
Debugging: 0

Graph Builder begin
Time for build prior tree: 0 seconds
Tree:0,pos:0,len:1.60643,TMRCA:0.803213,ARG:,len:1.60643,TMRCA:0.803213
Source pop and total pops are 1,1 in POP JOIN event at history 1. It is 
recommended that you increase the migration rates and/or number of sampled 
chromosomes.
Graphbuilder destructor
Simulator caught exception with message:
Invalid data structure
Simulator destructor:
Configuration destructor
Program is complete

What version of the product are you using? On what operating system?
latest version, macs-0.5b.tar.gz

Please provide any additional information below.
I have a population, first split and then join,
the program seems to have trouble doing this?

Original issue reported on code.google.com by weiwei.z...@gmail.com on 22 Feb 2013 at 2:25

GoogleCodeExporter commented 9 years ago
You are sampling far too few chromosomes for the split and joins to work. The 
splits require some proportion to migrate, but not an all or nothing deal.  Try 
increasing that value from 2.

Original comment by gche...@gmail.com on 22 Feb 2013 at 2:29

GoogleCodeExporter commented 9 years ago
Thank you for getting back to me.
Does that mean, population 2(newly created population) has to have
at least one ancestral lineage?

If I have the proportion to be low(-es t id proportion, if proportion is low),
I will always have a high probability that pop2 will have zero ancestral 
lineages.

*Also, this proportion seems to be differnt from ms?
in ms, this proportion is the proportion that stays in the original pop(i)
rather than n+i (defined here in macs)

Thank you!

Original comment by weiwei.z...@gmail.com on 22 Feb 2013 at 2:36

GoogleCodeExporter commented 9 years ago
Yes to your first and second question.  In practice we set high numbers for 
sampled chromosomes so that there will still be some chromosomes left for the 
low proportion lineage. 

Original comment by gche...@gmail.com on 22 Feb 2013 at 2:39

GoogleCodeExporter commented 9 years ago
Is it possible that I can get around this constraint?
If my sample size is not large and minor proportion is small, this "error" is 
very likely.

Are there smart ways to get around this?

Thank you!

Original comment by weiwei.z...@gmail.com on 22 Feb 2013 at 2:44

GoogleCodeExporter commented 9 years ago
Not really.  This is just the way the algorithm is constrained.  It is 
optimized for larger sample sizes.  You should just use ms, which is the 
reference program, and is already very fast for small samples. 

Original comment by gche...@gmail.com on 22 Feb 2013 at 2:53

GoogleCodeExporter commented 9 years ago
I don't have a deep intuition on the algorithm.

My guess is you will have a flag for each lineage, this flag
tells the grouping/population designation. e.g. lineages
with the same flag are allowed to coalesce. 
this -ej/-es will change/increase this flag some how. Maybe 
my guess is too wild. 

macs has a lot good features.  I want to get large segments,
but smaller sample sizes.

* so, the -ej/-es require both populations to have non-zero lineages?
or just the one, that is splitting/moving?

For example, 
a) if -es, then this population that is splitting, has to have >=1 lineages
b) if -ej time x y, then do we need to have #lineage(x) and # lineage(y) >=1
or we just need # lineage(x) >=1?

## on a very curious question,
If say, you set the -es time pop, time to be very ancient,
then the sample already coalescent into a single lineage,
this will trick the program to complain also?

Original comment by weiwei.z...@gmail.com on 22 Feb 2013 at 3:07

GoogleCodeExporter commented 9 years ago
the join happens further back in time than the split.  you will just need an 
extant chromosome from *both* populations to execute the join. feel free to 
peruse the function traverseEvents in algorithm.cpp if you want to understand 
it better.

Original comment by gche...@gmail.com on 22 Feb 2013 at 11:34

GoogleCodeExporter commented 9 years ago
really appreciate the explainations. 
I really wish this can be accomondated... ...

Thanks again!

Original comment by weiwei.z...@gmail.com on 23 Feb 2013 at 6:49