palomo11 closed 4 years ago
Hello, the header check fails for a VCF line if something is not right in its REF, ALT, or GT fields. Sites with missing data are VCF lines that have missing genotypes, i.e. lines that contain entries such as "./." or ".". You can use one of the missing-data strategies implemented in RAiSD to include such sites in the analysis.
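To get a rough idea of how many lines in a VCF would fall into the "missing data" category described above, something like the following sketch can help. It is not RAiSD's own code and only approximates the idea: the function name `count_missing_sites` is hypothetical, and it assumes a tab-separated VCF with a FORMAT column that contains GT. A site is counted as missing if any sample has "." as an allele (so both "./." and a haploid "." match).

```python
def count_missing_sites(vcf_lines):
    """Return (total_sites, sites_with_missing_genotypes).

    Hypothetical helper, not part of RAiSD. Assumes tab-separated VCF
    records with FORMAT in column 9 and samples from column 10 onwards.
    """
    total = 0
    missing = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no sample columns, nothing to inspect
        total += 1
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue  # assumption: records without GT are not counted as missing
        gt_index = fmt_keys.index("GT")
        for sample in fields[9:]:
            parts = sample.split(":")
            gt = parts[gt_index] if gt_index < len(parts) else "."
            # Treat phased and unphased separators alike
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles:
                missing += 1
                break  # one missing genotype is enough to flag the site
    return total, missing


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2",
        "chr1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0\t1",
        "chr1\t20\t.\tC\tT\t.\tPASS\t.\tGT\t.\t1",
        "chr1\t30\t.\tG\tA\t.\tPASS\t.\tGT:DP\t0/1:5\t./.:0",
    ]
    total, missing = count_missing_sites(example)
    print(f"{missing} of {total} sites have at least one missing genotype")
```

For a real file, the same function can be called on an open file handle, e.g. `with open("Genome1.vcf") as fh: count_missing_sites(fh)`. If the count is close to the number of discarded sites, missing genotypes are the likely cause.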
Hi,
I have analysed 12 bacterial populations. I was able to get the µ statistics and the Manhattan plot, but when I look at the sites and SNPs retained, I can see that most of the sites are discarded. See a couple of examples below:
Command: RAiSD -n Genome1_D -I Genome1.vcf -f -y 1 -P -D
Another example:
Command: RAiSD -n Genome13_D -I Genome13.vcf -f -y 1 -P -D
Do you know why most of the sites are discarded? Why might the "header" check fail, and what exactly does "sites with missing data" mean?
Thanks in advance.