LohseLab / agemo

Laplace transformed coalescence time distributions
http://agemo.readthedocs.io
GNU General Public License v3.0
4 stars 3 forks source link

Branch types for unphased diploids under panmictic model #24

Open DerekSetter opened 10 months ago

DerekSetter commented 10 months ago

As I understand it, the branch types possible for unphased diploid individuals should be the same regardless of the underlying set of events in the model. I don't get the result that I expected for a single-pop model, though. Some example code and output is below. It looks like the btc.labels_dict is correct, but the binary_representation is odd.

Isolation model -- results as expected

Code

sample_configuration = [(), ('a', 'a'), ('b', 'b')] btc = agemo.BranchTypeCounter(sample_configuration,rooted=True,phased=False) print( btc.labels, btc.labels_dict, btc.binary_representation, sep='\n' )

Output

['a', 'b', 'aa', 'ab', 'bb', 'aab', 'abb'] {'a': 0, 'abb': 6, 'b': 1, 'aab': 5, 'aa': 2, 'bb': 4, 'ab': 3} [[1 0 0 0] [0 0 1 0] [1 1 0 0] [1 0 1 0] [0 0 1 1] [1 1 1 0] [1 0 1 1]]

Panmictic model, unphased, rooted

Code:

sample_configuration = [('a','a','b','b')] btc = agemo.BranchTypeCounter(sample_configuration,rooted=True,phased=False) print( btc.labels, btc.labels_dict, btc.binary_representation, sep='\n' )

Output:

['a', 'b', 'aa', 'ab', 'bb', 'aab', 'abb'] {'a': 0, 'abb': 6, 'b': 1, 'aab': 5, 'aa': 2, 'bb': 4, 'ab': 3} [[1 0 0] [1 1 0] [1 1 1]]

Panmictic model, unphased, unrooted

Code:

sample_configuration = [('a','a','b','b')] btc = agemo.BranchTypeCounter(sample_configuration,rooted=False,phased=False) print( btc.labels, btc.labels_dict, btc.binary_representation, sep='\n' )

Output:

['a', 'b', 'aa', 'ab'] {'a': 0, 'abb': 0, 'b': 1, 'aab': 1, 'aa': 2, 'bb': 2, 'ab': 3} [[1 0 0] [1 1 0]]

GertjanBisschop commented 10 months ago

Right, a first issue I can see from this is the fact that sample configurations should be specified in a different way. Think of the msprime way to specify the sample configurations. I have assumed that unphased samples would only contain a single letter per population. For the following sample_configuration = ['a', 'a', 'a', 'a'] We then get this output (both unphased):

unrooted
['a', 'aa']
{'a': 0, 'aaa': 0, 'aa': 1}
[[1 0 0]
 [1 1 0]]
rooted
['a', 'aa', 'aaa']
{'a': 0, 'aaa': 2, 'aa': 1}
[[1 0 0]
 [1 1 0]
 [1 1 1]]