Open thomasyu888 opened 3 years ago
Taking a quick look at this, I reach similar initial conclusions as in #32.
Seems to be the same problem as #32, regarding the reference allele.
Is the variant in the intermediate file labeled as ONP, or is the annotated record coming back from Genome Nexus as ONP?
Similar to my recent comment in #32, these two cases include Reference_Allele inputs that are discrepant from the UCSC Browser results when querying these positions (believed to represent the latest/final version of the hg19/GRCh37 genome assembly, and to be consistent with the VEP cache version in use in Genome Nexus):
Chromosome Start_Position Reference_Allele UCSC_browser_hg19 Tumor_Seq_Allele2 Tumor_Sample_Barcode
17 7578456 TGGCGCG GCGGACG TGGCAAG SAGE-1
17 7578397 GAA TGG TCC SAGE-1
Because the reported Reference_Allele does not match the allele in the reference genome assembly in use by VEP, we are confirming that these cases should have been marked with a failure to annotate, perhaps with additional information that the cause was a mismatch in the Reference_Allele column.
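For illustration only, here is a rough Python sketch of the kind of pre-annotation check described above: compare each record's Reference_Allele against the expected hg19 sequence and flag mismatches. This is not how annotation-tools or Genome Nexus actually does it. The function name, the `Expected_Reference`, `Annotation_Status` and `Failure_Reason` fields, and the in-memory lookup are all hypothetical. The expected sequences are simply copied from the UCSC_browser_hg19 column in the table; a real check would read them from a reference FASTA.

```python
from typing import Dict, List, Tuple

# Expected hg19 sequences keyed by (chromosome, start position).
# Values are copied from the UCSC_browser_hg19 column in the table above;
# a real check would fetch them from a reference genome instead.
UCSC_HG19: Dict[Tuple[str, int], str] = {
    ("17", 7578456): "GCGGACG",
    ("17", 7578397): "TGG",
}

# The two input records from the table above.
RECORDS: List[dict] = [
    {"Chromosome": "17", "Start_Position": 7578456,
     "Reference_Allele": "TGGCGCG", "Tumor_Seq_Allele2": "TGGCAAG",
     "Tumor_Sample_Barcode": "SAGE-1"},
    {"Chromosome": "17", "Start_Position": 7578397,
     "Reference_Allele": "GAA", "Tumor_Seq_Allele2": "TCC",
     "Tumor_Sample_Barcode": "SAGE-1"},
]


def find_reference_mismatches(records: List[dict],
                              reference: Dict[Tuple[str, int], str]) -> List[dict]:
    """Return copies of the records whose Reference_Allele differs from the reference.

    Records with no entry in `reference` are skipped, since nothing can be
    concluded about them. The added fields are hypothetical names.
    """
    mismatches = []
    for rec in records:
        expected = reference.get((rec["Chromosome"], rec["Start_Position"]))
        if expected is None:
            continue  # no reference sequence available for this position
        if rec["Reference_Allele"].upper() != expected.upper():
            mismatches.append({
                **rec,
                "Expected_Reference": expected,
                "Annotation_Status": "FAILED",
                "Failure_Reason": "Reference_Allele mismatch",
            })
    return mismatches


if __name__ == "__main__":
    for bad in find_reference_mismatches(RECORDS, UCSC_HG19):
        print(bad["Chromosome"], bad["Start_Position"],
              bad["Reference_Allele"], "!=", bad["Expected_Reference"])
```

Both records in the table would be flagged by a check like this, which matches the expectation that they should be reported as failures to annotate rather than returned as ONP.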
Input: input.txt
Intermediate files:
annotation-tools
Intermediate files: I must add .txt at the end or GitHub won't allow me to upload these. My understanding is that input.txt.temp.annotated.txt is the output from Genome Nexus. But because annotation-tools allows us to include a directory with a list of MAFs or VCFs, it annotates each one of those files separately. processed.txt is all of these merged.
input.txt.temp.annotated.txt input.txt.temp.txt
processed: processed.txt