edgardomortiz / vcf2phylip

Convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis
GNU General Public License v3.0

output for phased data #29

Open erikenbody opened 3 years ago

erikenbody commented 3 years ago

Hi there,

Thank you for making this super useful tool! Issue #23 was a really helpful improvement for utilizing heterozygous sites. I was wondering whether it would be feasible to add an option for an output file with two sequences per diploid individual, one per haplotype? E.g.

Ind1_A ATGCAA
Ind1_B GTACCG

This would provide a reasonable alternative to discarding het sites or selecting them randomly when the data is phased confidently.
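To make the request concrete, here is a minimal Python sketch of how such output could be built from phased genotypes. It is an illustration only, not vcf2phylip code: the function names `phased_haplotypes` and `to_phylip` and the `_A`/`_B` suffixes are hypothetical, and it assumes an uncompressed, tab-delimited VCF with diploid samples, SNP-only filtering, and `|` as the phase separator. Unphased or missing calls are written as `N` here, which is just one possible choice.

```python
def phased_haplotypes(vcf_lines):
    """Build two sequences per sample (suffix _A and _B) from phased SNP calls.

    Hypothetical helper: assumes diploid samples and tab-delimited VCF lines.
    """
    samples, haps = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            haps = {f"{s}_{h}": [] for s in samples for h in "AB"}
            continue
        alleles = [fields[3]] + fields[4].split(",")
        # Keep SNPs only: every allele must be a single A, C, G, or T.
        if any(len(a) != 1 or a not in "ACGT" for a in alleles):
            continue
        gt_idx = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            parts = call.split(":")[gt_idx].split("|")
            if len(parts) != 2:
                parts = [".", "."]  # unphased or non-diploid call: treat as missing
            for h, p in zip("AB", parts):
                haps[f"{sample}_{h}"].append(alleles[int(p)] if p.isdigit() else "N")
    return {name: "".join(seq) for name, seq in haps.items()}


def to_phylip(seqs):
    """Format a {name: sequence} dict as simple sequential PHYLIP text."""
    n_sites = len(next(iter(seqs.values()), ""))
    rows = [f"{len(seqs)} {n_sites}"]
    rows += [f"{name} {seq}" for name, seq in seqs.items()]
    return "\n".join(rows)


# Tiny made-up example: two samples, three phased SNPs.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "Ind1", "Ind2"]
records = [
    ["chr1", "10", ".", "A", "G", ".", "PASS", ".", "GT", "0|1", "1|1"],
    ["chr1", "20", ".", "T", "C", ".", "PASS", ".", "GT", "0|0", "0|1"],
    ["chr1", "30", ".", "G", "A,C", ".", "PASS", ".", "GT", "1|2", ".|."],
]
example_vcf = ["##fileformat=VCFv4.2", "\t".join(header)] + ["\t".join(r) for r in records]

seqs = phased_haplotypes(example_vcf)
print(to_phylip(seqs))
```

With the made-up input above, `Ind1` yields `Ind1_A ATA` and `Ind1_B GTC`, so the heterozygous sites are kept on the haplotype they were phased to instead of being dropped or picked at random.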

Thank you! Erik

edgardomortiz commented 3 years ago

I have recently been working with phased genotypes. I will give it a try, but I am really busy at the moment, so I will probably work on it over the weekend.

Edgardo

matthewglasenapp commented 3 years ago

Was this added as a feature? This would be very useful to me as well.

sofiatorreggiani commented 2 years ago

I would be very interested in this as well. Thank you! Sofia

bbandriola commented 1 year ago

Hi there, is there any update on the phasing option?

chrchang commented 1 year ago

plink2 (https://github.com/chrchang/plink-ng ; will post precompiled binaries to https://www.cog-genomics.org/plink/2.0/ after I test for a few more hours) now supports this for diploid data. Sample usage:

plink2 --vcf [vcf filename, could be gzipped] --geno [max missing-call rate] --snps-only --export phylip-phased used-sites --out [output filename prefix]

Replace "phylip-phased" with "phylip" for regular phylip output.