stschiff / msmc2

GNU General Public License v3.0

confusion with running cross-population analysis #32

Closed Aannaw closed 2 years ago

Aannaw commented 3 years ago

Hello Professor, I have two species, with four samples (eight haplotypes) selected from each, and I want to estimate coalescence rates using msmc2. I am confused by this description in the README.md: "you should generate a combined input file with eight haplotypes (see msmc and msmc-tools repositories), and then start three runs". When I run generate_multihetsep.py, do I need to use the single-sample VCFs and mask files of all eight samples (sixteen haplotypes) from the two species?

stschiff commented 3 years ago

Yes, that is correct. You can also use multi-sample VCFs for all eight samples, with individual masks. I recommend that you try to understand how the multihetsep format (the input format for MSMC) works and what information it contains. It can be generated in multiple ways; generate_multihetsep.py is just a small script for one specific workflow.
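
As a rough illustration of what the multihetsep format contains, here is a minimal Python sketch. It is not part of msmc2 or msmc-tools. It assumes the usual four whitespace-separated columns (chromosome, position, number of called sites since the previous listed position, and a string with one allele per haplotype, optionally with comma-separated phasing alternatives). The function names, the example line, and the assignment of haplotypes 0-7 to the first species and 8-15 to the second are all hypothetical. The `i-j` pair list follows the form I understand the msmc2 `-I` option to accept for the cross-population run; please check the README before relying on it.

```python
from itertools import product


def parse_multihetsep_line(line):
    """Split one multihetsep line into its four whitespace-separated columns."""
    chrom, pos, n_called, alleles = line.split()
    # The allele column may list several comma-separated phasing alternatives.
    patterns = alleles.split(",")
    n_haplotypes = len(patterns[0])
    if any(len(p) != n_haplotypes for p in patterns):
        raise ValueError("allele patterns differ in length")
    return {
        "chrom": chrom,
        "pos": int(pos),
        "n_called": int(n_called),
        "patterns": patterns,
        "n_haplotypes": n_haplotypes,
    }


def cross_pairs(haps_a, haps_b):
    """Build a comma-separated list of 'i-j' pairs, one haplotype from each group."""
    return ",".join(f"{i}-{j}" for i, j in product(haps_a, haps_b))


# Hypothetical line: 4 diploid samples per species -> 16 alleles in the last column.
record = parse_multihetsep_line("chr1\t58432\t63\tAAAAAAAATTTTTTTT")
print(record["n_haplotypes"])

species_a = range(0, 8)   # assumed: haplotypes 0-7 come from the first species
species_b = range(8, 16)  # assumed: haplotypes 8-15 come from the second species
pairs = cross_pairs(species_a, species_b)
print(pairs.split(",")[:3])
```

The haplotype order in the allele string follows the order in which the samples were supplied when the file was generated, so the index ranges above only make sense if the four samples of one species were listed before those of the other.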