bioinformatics-centre / BayesTyper

A method for variant graph genotyping based on exact alignment of k-mers

Feature Request for prior vcf with bayesTyperTools combine #25

Open jjfarrell opened 4 years ago

jjfarrell commented 4 years ago

In the prior vcf file, there are a lot of details on the origin of the calls. However, they are not retained when the file is combined with another vcf: all of these variants end up with ACO=prior rather than dbSNP150, GDK, etc.

chr1    10019   .       TA      T       .       .       ACO=dbSNP150
chr1    10055   .       T       TA      .       .       ACO=dbSNP150
chr1    10128   .       A       AC      .       .       ACO=dbSNP150
chr1    10144   .       TA      T       .       .       ACO=dbSNP150
chr1    10146   .       AC      A       .       .       ACO=GDK:dbSNP150
chr1    10149   .       CCT     C       .       .       ACO=GDK
chr1    10165   .       A       AC      .       .       ACO=dbSNP150
chr1    10177   .       A       AC      .       .       ACO=1000g:dbSNP150
chr1    10218   .       AC      A       .       .       ACO=GDK
chr1    10228   .       TAACCCCTAACCCTAACCCTAAACCCTA    TACCCCTAACCCTAACCCTAAACCCTA,T   .       .       ACO=dbSNP150,dbSNP150
chr1    10230   .       AC      A       .       .       ACO=dbSNP150
chr1    10235   .       T       TA      .       .       ACO=1000g:dbSNP150
chr1    10249   .       AAC     A       .       .       ACO=dbSNP150
chr1    10254   .       TA      T       .       .       ACO=dbSNP150
chr1    10328   .       AACCCCTAACCCTAACCCTAACCCT       A       .       .       ACO=dbSNP150
chr1    10329   .       AC      A       .       .       ACO=dbSNP150
chr1    10352   .       T       TA      .       .       ACO=1000g:dbSNP150
chr1    10371   .       ACCCTAACCCTAACCCTAAC    A       .       .       ACO=GDK
chr1    10383   .       A       AC      .       .       ACO=dbSNP150
chr1    10389   .       AC      A       .       .       ACO=dbSNP150
chr1    10433   .       A       AC      .       .       ACO=dbSNP150
chr1    10439   .       AC      A       .       .       ACO=dbSNP150
chr1    10458   .       A       AC      .       .       ACO=dbSNP150
chr1    10616   .       CCGCCGTTGCAAAGGCGCGCCG  C       .       .       ACO=1000g:dbSNP150
chr1    10642   .       G       A       .       .       ACO=dbSNP150
chr1    10891   .       CA      C       .       .       ACO=dbSNP150
chr1    11008   .       C       G       .       .       ACO=dbSNP150
chr1    11012   .       C       G       .       .       ACO=dbSNP150
chr1    11063   .       T       G       .       .       ACO=dbSNP150
chr1    11666   .       TAACAGG T       .       .       ACO=GDK
chr1    12938   .       GCAAA   G       .       .       ACO=dbSNP150

Everything gets set to prior rather than the original origin. It would be nice if that information were retained when the vcf tag is prior.

chr1    10019   .       TA      T       .       .       ACO=prior
chr1    10055   .       T       TA      .       .       ACO=prior
chr1    10128   .       A       AC      .       .       ACO=prior
chr1    10144   .       TA      T       .       .       ACO=prior
chr1    10146   .       AC      A       .       .       ACO=prior
chr1    10149   .       CCT     C       .       .       ACO=prior
chr1    10165   .       A       AC      .       .       ACO=prior
chr1    10177   .       A       AC      .       .       ACO=prior
chr1    10218   .       AC      A       .       .       ACO=prior
chr1    10228   .       TAACCCCTAACCCTAACCCTAAACCCTA    TACCCCTAACCCTAACCCTAAACCCTA,T   .       .       ACO=prior,prior
chr1    10230   .       AC      A       .       .       ACO=prior
chr1    10235   .       T       TA      .       .       ACO=prior
chr1    10249   .       AAC     A       .       .       ACO=prior
chr1    10254   .       TA      T       .       .       ACO=prior
chr1    10328   .       AACCCCTAACCCTAACCCTAACCCT       A       .       .       ACO=prior
chr1    10329   .       AC      A       .       .       ACO=prior
chr1    10352   .       T       TA      .       .       ACO=prior
chr1    10371   .       ACCCTAACCCTAACCCTAAC    A       .       .       ACO=prior
chr1    10383   .       A       AC      .       .       ACO=prior
chr1    10389   .       AC      A       .       .       ACO=prior
chr1    10433   .       A       AC      .       .       ACO=prior
chr1    10439   .       AC      A       .       .       ACO=prior
chr1    10458   .       A       AC      .       .       ACO=prior
chr1    10464   .       A       AC      .       .       ACO=adsp5k_gatk
chr1    10616   .       CCGCCGTTGCAAAGGCGCGCCG  C       .       .       ACO=adsp5k_gatk:prior
chr1    10642   .       G       A       .       .       ACO=prior
chr1    10744   .       A       AC      .       .       ACO=adsp5k_gatk
chr1    10815   .       T       TC      .       .       ACO=adsp5k_gatk
chr1    10891   .       CA      C       .       .       ACO=prior
chr1    11008   .       C       G       .       .       ACO=prior
chr1    11012   .       C       G       .       .       ACO=prior
chr1    11063   .       T       G       .       .       ACO=prior
chr1    11666   .       TAACAGG T       .       .       ACO=prior
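
As a stopgap, the original labels could probably be put back in a post-processing step. The following is only a minimal sketch and not part of BayesTyper: the function names (`parse_aco`, `load_prior_origins`, `restore_line`) are hypothetical. It assumes tab-delimited VCF records, and that a prior variant keeps the same CHROM, POS, REF and ALT in the combined file, which holds in the excerpt above but may not hold wherever combine merges or re-represents alleles. It also assumes ACO holds one comma-separated entry per alternative allele, each a colon-separated list of origins, as in the records shown.

```python
def parse_aco(info):
    """Return the per-allele ACO entries from an INFO string, or None if absent."""
    for entry in info.split(";"):
        if entry.startswith("ACO="):
            return entry[len("ACO="):].split(",")
    return None


def load_prior_origins(lines):
    """Map (chrom, pos, ref, alt) to the ACO labels found in the prior vcf."""
    origins = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 8:
            continue  # header or malformed line
        chrom, pos, ref, alts = fields[0], fields[1], fields[3], fields[4].split(",")
        aco = parse_aco(fields[7])
        if aco is None or len(aco) != len(alts):
            continue  # no usable per-allele origin information
        for alt, labels in zip(alts, aco):
            origins[(chrom, pos, ref, alt)] = labels
    return origins


def restore_line(line, origins, placeholder="prior"):
    """Replace the placeholder label in a combined record with the original labels."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(fields) < 8:
        return line.rstrip("\n")
    chrom, pos, ref, alts = fields[0], fields[1], fields[3], fields[4].split(",")
    info = fields[7].split(";")
    for i, entry in enumerate(info):
        if not entry.startswith("ACO="):
            continue
        per_allele = entry[len("ACO="):].split(",")
        if len(per_allele) != len(alts):
            continue  # unexpected layout, leave the record untouched
        restored = []
        for alt, labels in zip(alts, per_allele):
            original = origins.get((chrom, pos, ref, alt))
            tokens = []
            for token in labels.split(":"):
                if token == placeholder and original is not None:
                    tokens.extend(original.split(":"))
                else:
                    tokens.append(token)
            # dict.fromkeys drops duplicate labels while keeping their order
            restored.append(":".join(dict.fromkeys(tokens)))
        info[i] = "ACO=" + ",".join(restored)
    fields[7] = ";".join(info)
    return "\t".join(fields)


if __name__ == "__main__":
    prior = ["chr1\t10616\t.\tCCGCCGTTGCAAAGGCGCGCCG\tC\t.\t.\tACO=1000g:dbSNP150"]
    combined = "chr1\t10616\t.\tCCGCCGTTGCAAAGGCGCGCCG\tC\t.\t.\tACO=adsp5k_gatk:prior"
    print(restore_line(combined, load_prior_origins(prior)))
```

Records whose alleles are not found in the prior vcf are left unchanged, so variants that come only from another call set (for example ACO=adsp5k_gatk) pass through as they are.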

jonassibbesen commented 4 years ago

Thank you for the suggestion. This is a great idea. I unfortunately do not have time to work much on BayesTyper currently, but will keep this in mind for when I do.