Closed jjfarrell closed 1 year ago
Looking back at the documentation and code, it looks like the Graphtyper AAScore was originally intended only for SNPs and INDELs, and AAScore was therefore intentionally dropped for SVs in recent versions. The training of the logistic regression model used for AAScore appears to be based on the GIAB SNVs and INDELs. Did I get that right?
To help with SV QC, could an AAScore also be developed for SVs, using the GIAB SVs as a training set? Maybe consider a separate model for each variant type (SNV, DEL, DUP, INS) to increase the accuracy of the AAScore. For example, the GATK pipeline runs two VQSR models, trained separately on SNVs and small INDELs, with different QC metrics.
When running Graphtyper 2.7, the INFO field output included the AAScore field along with the other QC fields. However, when we ran the newer 2.7.4, AAScore was missing from the INFO field of the records even though it is still described in the VCF header. Any suggestions?
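To quantify the mismatch between header and records, something like the following could be used. It is only a minimal sketch using the Python standard library: the function name `info_key_usage` and its parameters are made up for illustration and are not part of Graphtyper or bcftools. It assumes a standard tab-separated VCF (plain or gzip/bgzip compressed) with INFO in column 8.

```python
import gzip
import re
from pathlib import Path


def info_key_usage(vcf_path, key="AAScore", max_records=None):
    """Return (declared_in_header, records_seen, records_with_key).

    Hypothetical helper: checks whether an INFO key is declared in the
    VCF header and how many data records actually carry it.
    """
    path = Path(vcf_path)
    # bgzip output is valid multi-member gzip, so gzip.open can read it.
    opener = gzip.open if path.suffix == ".gz" else open
    header_pattern = re.compile(r"^##INFO=<ID=" + re.escape(key) + r"[,>]")

    declared = False
    seen = 0
    with_key = 0
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                if header_pattern.match(line):
                    declared = True
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # skip lines without an INFO column
            seen += 1
            # INFO entries are separated by ';' and are KEY=VALUE or bare flags.
            keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
            if key in keys:
                with_key += 1
            if max_records is not None and seen >= max_records:
                break
    return declared, seen, with_key


# Example call (file name taken from the commands in this post):
# declared, seen, with_key = info_key_usage(
#     "adsp_13710_pcrfree.graphtyper.chr1.vcf.gz", max_records=1000)
```

A result where `declared` is true but `with_key` is 0 would match what we see with 2.7.4: the key is described in the header but absent from the records.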
zgrep Vers adsp_13710_pcrfree.graphtyper.chr1.vcf.gz
graphtyperVersion=2.7.4
bcftools_concatVersion=1.10.2+htslib-1.10.2
[farrell@scc-wr3 workarea.ont_manta.adsp_13710_pcrfree]$ zcat adsp_13710_pcrfree.graphtyper.chr1.vcf.gz|grep -v ^#|head|cut -f1-8