Hello,
Sorry for this (I guess) basic question, but I could not find the answer in either the README.md file or the paper (Page et al. 2016).
I am trying to convert FASTA alignments into SNP-only VCF files for downstream analyses. Some alignments are for nuclear markers, and I work on a polyploid organism, so I sometimes have more than 2 haplotypes for a given individual, but all are properly phased.
My FASTA input is formatted as follows:
I used a basic command:
snp-sites -v -o out.vcf in.fas
I did get a .vcf file, but in it each allele seems to be coded as a homozygous individual: instead of the 0/0/1 or even 0/1 genotypes I expected, I only see 0, 1 and 2 (like haploid calls).
How can I get an output that takes phasing information and heterozygosity into account? Is there an option in snp-sites that I missed? Or do I have to adapt my input, and if so, how? (For example, by losing the phasing information through merging the alleles, so that there is only 1 sequence per individual but with ambiguities? Is that mandatory?)
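To illustrate the kind of output being asked for, here is a minimal Python sketch that post-processes a haploid VCF by merging the columns belonging to one individual into a single phased genotype (VCF uses `|` between phased alleles). It is not a snp-sites feature, only a possible workaround under assumptions: the function name `merge_haploid_columns` and the sample naming scheme `individual_haplotype` (e.g. `ind1_h1`) are hypothetical, and each sample column is assumed to hold only a GT value.

```python
def merge_haploid_columns(vcf_lines, sep="_"):
    """Merge haploid sample columns into one phased genotype per individual.

    Assumption (hypothetical naming): sample names look like
    '<individual><sep><haplotype>', e.g. 'ind1_h1', 'ind1_h2'.
    Assumption: each sample column contains only the GT value.
    """
    out = []
    groups = None  # individual name -> list of column indices
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are passed through unchanged.
            out.append(line)
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Group sample columns (index 9 onwards) by individual prefix.
            groups = {}
            for idx in range(9, len(fields)):
                individual = fields[idx].rsplit(sep, 1)[0]
                groups.setdefault(individual, []).append(idx)
            out.append("\t".join(fields[:9] + list(groups)))
            continue
        if groups is None:
            raise ValueError("data line found before the #CHROM header")
        # Join the haploid calls of each individual with the phased separator.
        genotypes = ["|".join(fields[i] for i in cols) for cols in groups.values()]
        out.append("\t".join(fields[:9] + genotypes))
    return out


# Made-up example: a triploid and a diploid individual at one site.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
    "\tind1_h1\tind1_h2\tind1_h3\tind2_h1\tind2_h2",
    "1\t12\t.\tA\tC\t.\t.\t.\tGT\t0\t0\t1\t1\t1",
]
for row in merge_haploid_columns(example):
    print(row)
```

With the made-up example, the three columns of `ind1` collapse into `0|0|1` and the two columns of `ind2` into `1|1`. Whether downstream tools accept mixed ploidy within one file is a separate question that this sketch does not address.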
Thank you for any answer.