BGT missing SNPs

I believe the difference arises because "bgt import" imports ONLY entries whose FILTER field is "PASS".

I have seen that XXX.bgt.bcf lacks all the variants that have VQSRTrancheXXX in the FILTER field of the original VCF. When I count only the PASS variants, the bgt data has slightly more entries (due to the splitting of multiallelic variants into atomic ones).
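For anyone who wants to repeat the count, here is a minimal sketch in plain Python (standard library only, not part of bgt). It tallies the FILTER values of a VCF and also sums the ALT alleles of the PASS records. The function names are my own, and the assumption that each ALT allele of a multiallelic record becomes one atomic entry is only an approximation of what the import actually does, so treat the second number as a rough expectation rather than an exact target.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def summarize_filters(lines):
    """Count records per FILTER value and ALT alleles among PASS records.

    Returns (filters, pass_alt_alleles), where pass_alt_alleles is the number
    of entries one would expect if every ALT allele of a PASS record were
    split into its own atomic record (an assumption, see the text above).
    """
    filters = Counter()
    pass_alt_alleles = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and the header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        alt, flt = fields[4], fields[6]
        filters[flt] += 1
        if flt == "PASS":
            pass_alt_alleles += len(alt.split(","))
    return filters, pass_alt_alleles
```

To use it, something like `with open_vcf("XXX.vcf.gz") as fh: filters, n = summarize_filters(fh)` gives the per-FILTER record counts in `filters` and the expected number of atomic PASS entries in `n`, which can then be compared with the number of records in XXX.bgt.bcf.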

So THERE SHOULD BE A HUGE WARNING in the import manual that ONLY PASS variants are imported. Remember that the FILTER and INFO (and maybe ID?) VCF fields are cleared, so after import there is no way to distinguish valid from invalid variants; importing only PASS variants therefore seems logical.

I know that, technically, both the rs number (usually placed in ID) and the FILTER information (PASS, etc.) could be placed into the variant annotation fmf file as extra tags. However, that wouldn't save space, and at the moment the included javascript does not carry them over. On the other hand, it would be nice if the VCF output were more standard-conformant and kept these meaningful fields (stored in the bgt bcf and queryable, just as you can query by region, etc.).

Zoltan

lh3 / bgt

BGT missing SNPs #12