popgenmethods / SINGER

Sampling and inference of genealogies with recombination
MIT License

Correspondence between nodes and individuals #16

Open TymekPieszko opened 5 months ago

TymekPieszko commented 5 months ago

What is the correspondence between sample nodes and individuals? The output includes a vcfname_nodes_step.txt file with node times:

0
0
0
0
1515.1530588057719
10095.284900597178
13448.820346282408
...

For some analyses, it would be helpful if a vcfname_individuals.txt file was produced as well, for example:

ind        node
0        0
0        1
1        2
1        3

YunDeng98 commented 5 months ago

Hi @TymekPieszko, the node order is the same as the sample order in the VCF: the first haplotype of the first individual is node 0, the second haplotype of the first individual is node 1, and so on. I will modify the program to produce the correspondence file shortly.
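
Until such a file is produced, the mapping described above can be rebuilt by hand. The following is only a minimal sketch, not part of SINGER: it assumes every individual in the VCF is diploid and phased, and the function names `node_to_individual` and `format_table` are hypothetical helpers made up for illustration.

```python
def node_to_individual(num_sample_nodes, ploidy=2):
    """Return (individual, node) pairs for sample nodes 0..num_sample_nodes-1.

    Assumption: nodes follow VCF sample order, with the haplotypes of each
    individual stored consecutively, so individual = node // ploidy.
    """
    if num_sample_nodes % ploidy != 0:
        raise ValueError("number of sample nodes is not a multiple of ploidy")
    return [(node // ploidy, node) for node in range(num_sample_nodes)]


def format_table(pairs):
    """Format the pairs as a tab-separated table with an 'ind node' header."""
    lines = ["ind\tnode"]
    lines += [f"{ind}\t{node}" for ind, node in pairs]
    return "\n".join(lines)


# Four sample nodes (the four zero-time entries in the nodes file above)
# correspond to two diploid individuals.
pairs = node_to_individual(4)
print(format_table(pairs))
```

For the four sample nodes in the example above, this gives the same individual/node pairs as the requested file. If the data are not diploid, `ploidy` would need to change accordingly, and the sketch does not handle mixed ploidy.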

TymekPieszko commented 5 months ago

Thank you @YunDeng98!! Yes, this would be useful. It would also help if convert_to_tskit recorded this in the nodes table.

YunDeng98 commented 5 months ago

No problem. For tskit, I can try adding the individual ID to the metadata of the node object. I am in the revision process now, which will involve changing some parts of the node (mainly utility stuff rather than math), so I can definitely take these features in.