Closed silvewheat closed 4 years ago
Hi Yudong,
To convert to Phylip format, you should be able to use just the variant sites in your VCF file. Also, I think keeping both alleles for the diploids would be better, as long as you aren't trying to run the individual-level tests. Because HyDe assumes independence among sites, it shouldn't matter which allele you assign to each of the two haploid individuals you make from a diploid in the VCF file. If you do want to run the test on individuals, though, then I think using ambiguity codes for heterozygous sites and coding each individual as a single consensus sequence would be best.
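As a rough illustration of the two encodings described above, here is a minimal Python sketch. It is not part of HyDe; the function names `consensus_base` and `haploid_bases` are hypothetical, and it assumes biallelic SNPs with genotypes given as VCF-style `GT` strings such as `0/1` or `1|1`.

```python
# Hypothetical helpers (not part of HyDe). Assumes biallelic SNPs only.

# IUPAC ambiguity codes for the six possible heterozygous base pairs.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def _alleles(ref, alt, gt):
    """Return the two called bases for a diploid GT string, or None if missing."""
    if "." in gt:
        return None
    bases = [ref, alt]
    first, second = gt.replace("|", "/").split("/")
    return bases[int(first)], bases[int(second)]

def consensus_base(ref, alt, gt):
    """One character per individual: heterozygotes become an ambiguity code."""
    called = _alleles(ref, alt, gt)
    if called is None:
        return "N"
    a, b = called
    return a if a == b else IUPAC[frozenset((a, b))]

def haploid_bases(ref, alt, gt):
    """Two characters per individual: one for each allele copy."""
    called = _alleles(ref, alt, gt)
    return ("N", "N") if called is None else called
```

For a site with REF `A` and ALT `G`, a `0/1` genotype gives `R` in the consensus encoding and the pair `A`, `G` in the two-haploid encoding. Multiallelic sites and indels would need extra handling that this sketch leaves out.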
Hopefully this helps, but let me know if you have any other questions.
Paul
Hi Paul,
Thanks for your answers. I'll try your suggestions.
Best, Yudong
There are always missing genotypes in VCF files. How should these sites be handled? Can HyDe handle "N"?
Hi @zsdxgl -- HyDe can handle missing data ('N'). It does this by trying to integrate over the uncertainty that missing data creates, assigning 0.25 to each of the four possible nucleotides.
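To make the idea concrete, here is a small Python sketch of what spreading an 'N' evenly over the four nucleotides could look like for a single alignment column. This only illustrates the principle and is not HyDe's actual implementation; the names `base_weights` and `pattern_weights` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

def base_weights(base):
    """Weight for each nucleotide a taxon could carry at this site."""
    if base == "N":
        # Missing data: every nucleotide is treated as equally likely.
        return {n: 0.25 for n in NUCLEOTIDES}
    return {base: 1.0}

def pattern_weights(column):
    """Weight of every fully resolved site pattern consistent with a column."""
    per_taxon = [base_weights(b) for b in column]
    weights = {}
    for combo in product(*(w.items() for w in per_taxon)):
        pattern = "".join(n for n, _ in combo)
        weight = 1.0
        for _, p in combo:
            weight *= p
        weights[pattern] = weight
    return weights
```

Under this sketch, the column `ANGG` contributes 0.25 to each of the patterns `AAGG`, `ACGG`, `AGGG`, and `ATGG` rather than a full count to any single one.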
Hello,
HyDe takes sequence data in Phylip format as input, but I only have a VCF file, and I'm not sure what the best way to convert a VCF to sequence data is. Specifically, I have the following questions:
Best, Yudong