awilfert / PSAP-pipeline


Incomplete description of gene names by ANNOVAR #6

Open dconrad opened 7 years ago

dconrad commented 7 years ago

The current PSAP pipeline has been developed around ANNOVAR variant annotation. A few issues have arisen from ANNOVAR conventions, and we are currently exploring alternative variant annotation packages (e.g. VEP). One column of annotation, beginning with "AAChange", contains a list of all possible amino acid changes resulting from a variant; a second column, "Gene.wgEncodeGencodeBasicV19", provides a gene ID or list of gene IDs. In some cases there is a mismatch between the gene IDs used in "AAChange" and the gene IDs listed in "Gene.wg...": specifically, a gene ID used in "AAChange" is missing from the gene IDs in "Gene.wg...". This is problematic because we use the "Gene.wg..." column to determine which gene to use for PSAP lookup.
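To make the mismatch concrete, here is a rough sketch of how affected rows could be flagged. It is not part of the PSAP pipeline. It assumes that each "AAChange" entry is comma-separated with the gene ID as the first colon-delimited field, and that the "Gene.wg..." value separates multiple gene IDs with ";" or ","; both are assumptions about the ANNOVAR output and should be checked against real files. The function names and the example values (GENEA, GENEB, ENST0001, ENST0002) are hypothetical.

```python
import re


def aachange_genes(aachange):
    """Collect the gene IDs named in an AAChange-style string.

    Assumes entries look like GENE:TRANSCRIPT:exonN:c.X:p.X and are
    separated by commas (an assumption about the ANNOVAR output).
    """
    genes = set()
    for entry in aachange.split(","):
        entry = entry.strip()
        # Entries without a colon (e.g. "." or other placeholders)
        # carry no gene ID, so skip them.
        if ":" not in entry:
            continue
        genes.add(entry.split(":", 1)[0])
    return genes


def listed_genes(gene_field):
    """Split the gene column on ';' or ',' (assumed separators)."""
    parts = (p.strip() for p in re.split(r"[;,]", gene_field))
    return {p for p in parts if p and p != "."}


def missing_genes(aachange, gene_field):
    """Gene IDs used in AAChange but absent from the gene column."""
    return aachange_genes(aachange) - listed_genes(gene_field)


# Hypothetical row: AAChange names two genes, the gene column only one.
example_aachange = (
    "GENEA:ENST0001:exon2:c.100A>G:p.T34A,"
    "GENEB:ENST0002:exon3:c.50A>G:p.K17R"
)
example_gene = "GENEA"
print(missing_genes(example_aachange, example_gene))
```

Any row for which `missing_genes` returns a non-empty set is one where the gene chosen for PSAP lookup may not match the gene carrying the amino acid change.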