knausb / vcfR

Tools to work with variant call format files

vcfR reading INFO/DS field #174

Open meixilin opened 3 years ago

meixilin commented 3 years ago

Hi,

Thank you so much for making this amazing software possible! I think there might be a small bug in how the INFO field is read when a gene annotation value containing CDS is confused with the DS key.

For example, a dummy INFO field containing AN=100;SIFTINFO=T|TRANSCRIPT|GENE|NA|CDS|SYNONYMOUS|S/S|60|0.52|3.44|840|novel|TOLERATED;SOR=1.199;VariantType=SNP would be classified as DS==TRUE, even though the DS field is not actually present in the derived vcfR object.

Thank you for looking into it again!

knausb commented 3 years ago

Hi @meixilin, thanks for your post. But I'm afraid I don't understand your issue well enough to address it. Let's see if we can create a reproducible example. That way I may be able to reproduce the issue and hopefully address it. Thanks! Brian

library(vcfR)
#> 
#>    *****       ***   vcfR   ***       *****
#>    This is vcfR 1.12.0 
#>      browseVignettes('vcfR') # Documentation
#>      citation('vcfR') # Citation
#>    *****       *****      *****       *****
data("vcfR_test")
vcfR_test@fix[,"INFO"][1] <- "AN=100;SIFTINFO=T|TRANSCRIPT|GENE|NA|CDS|SYNONYMOUS|S/S|60|0.52|3.44|840|novel|TOLERATED;SOR=1.199;VariantType=SNP"
getINFO(vcfR_test)
#> [1] "AN=100;SIFTINFO=T|TRANSCRIPT|GENE|NA|CDS|SYNONYMOUS|S/S|60|0.52|3.44|840|novel|TOLERATED;SOR=1.199;VariantType=SNP"
#> [2] "NS=3;DP=11;AF=0.017"                                                                                               
#> [3] "NS=2;DP=10;AF=0.333,0.667;AA=T;DB"                                                                                 
#> [4] "NS=3;DP=13;AA=T"                                                                                                   
#> [5] "NS=3;DP=9;AA=G"

Created on 2020-11-10 by the reprex package (v0.3.0)
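
To narrow down the report, here is a base-R sketch of how a DS key could be detected by mistake in the INFO string above. It is not vcfR's code, and whether vcfR does anything like this internally is an assumption. The helpers has_key_naive() and has_key_exact() are hypothetical names used only for illustration. The first does a plain substring search, so the "CDS" inside the SIFTINFO value counts as a hit for "DS". The second splits on ";" and compares only the key names.

```r
# Example INFO string from the report above.
info <- "AN=100;SIFTINFO=T|TRANSCRIPT|GENE|NA|CDS|SYNONYMOUS|S/S|60|0.52|3.44|840|novel|TOLERATED;SOR=1.199;VariantType=SNP"

# Hypothetical naive check: plain substring search over the whole INFO string.
# "CDS" inside the SIFTINFO value contains "DS", so this reports a match.
has_key_naive <- function(info, key) {
  grepl(key, info, fixed = TRUE)
}

# Hypothetical stricter check: split the string on ";" and compare only the
# key names (the text before the first "="; flags such as "DB" have no "=").
has_key_exact <- function(info, key) {
  fields <- strsplit(info, ";", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", fields)
  key %in% keys
}

has_key_naive(info, "DS")  # TRUE: false positive caused by "CDS"
has_key_exact(info, "DS")  # FALSE: no field is keyed "DS"
has_key_exact(info, "AN")  # TRUE: "AN" is a real key
```

If the behaviour in the report does come from substring matching, comparing whole key names as in has_key_exact() would be one way to avoid it.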