genome-nexus / genome-nexus-annotation-pipeline

Library and tool for annotating MAF files using Genome Nexus Webserver API
MIT License

Some complex variants with common prefix - suffix fail the annotation #259

Open rmadupuri opened 10 months ago

rmadupuri commented 10 months ago

These variants need some pre-processing before they are passed to GN, such as removing the common prefix and suffix and adjusting the coordinates accordingly. It's a bit tricky in some cases.

| Chr | Start Pos | End Pos | Ref Allele | Tumor Allele |
| --- | --- | --- | --- | --- |
| 5 | 112175377 | 112175378 | CAAAAG | CAAAAAT |
| 17 | 7578231 | 7578232 | CAAAT | CAAAAAT |
| 17 | 7578439 | 7578440 | TTGT | TGCTTGG |
| 5 | 112175650 | 112175651 | TAAAAAT | TAAAAAAT |
| 20 | 31022441 | 31022442 | AGGGGGGGGT | AGGGGGGGGGT |
| 22 | 41574678 | 41574679 | GCCCCCCCA | GCCCCCCCCA |
| 2 | 48033651 | 48033652 | AAATT | AAATAATT |
| 17 | 11924243 | 11924244 | GGCGGCAGCGGCAGCGGCAC | GGCGGCAGCGGCAGCGGCAGCGGCAC |
| 6 | 157100116 | 157100117 | AGGCGGCGGCGGCGGCGGCT | AGGCGGCGGCGGCGGCGGCGGCT |
| 6 | 157099165 | 157099166 | GTCCTCCTCCTCCTCCTCCTCCG | GTCCTCCTCCTCCTCCTCCTCCTCCG |
| 1 | 27023007 | 27023008 | AGGCGGCGGCGGCGGCA | AGGCGGCGGCGGCGGCGGCGGCA |
| 7 | 55248999 | 55249000 | TGGCCAGCGTGGAC | TGGCCAGCGTGGCCAGCGTGGAT |
| 7 | 55242460 | 55242461 | ATCAAGGAATTAAGAGAAGCA | ATCAAAGGAATTAAGAGAAGCA |
| 7 | 55249015 | 55249016 | CCCC | CCCAACCCCT |
| 7 | 151874147 | 151874148 | CTTTTTTTTTG | CTTTTTTTTTTG |
| 17 | 7577091 | 7577092 | GCCG | GCCCCG |
| 16 | 67660601 | 67660602 | GATT | GATATT |
| 7 | 55248999 | 55249000 | TGGCCAGCGTGGAC | TGGCCAGCCGTGGAC |
| 19 | 1221313 | 1221314 | GCCCCCCG | GCCCCCCTG |
| 17 | 7578467 | 7578475 | GACGCGGGT | GACGCGGG |
| 7 | 55242465 | 55242482 | CAAGGAATTAAGAGAAGC | CAA |
| 9 | 1393967398 | 1393967410 | GTCCTCGCCGAGG | GCCCTC |
| X | 66766357 | 66766356 | GGCGGCGGCGGCGGCGGCGGC | - |
| 17 | 7577541 | 7577540 | TTCATGCCGCCCATGCAGGAA | - |
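
For reference, the pre-processing described above (strip the shared prefix and suffix, then shift the coordinates) could look roughly like the sketch below. It is not the pipeline's actual code: `trim_common_affixes` is a hypothetical helper, and it assumes `start` is the 1-based position of the first base of the reference allele, which the issue does not state and which may not match the Start Pos values listed above. It also assumes the usual MAF conventions of `-` for an empty allele and flanking coordinates for insertions. The prefix is trimmed first; trimming the suffix first would place an indel inside a repeat at its leftmost position instead.

```python
def trim_common_affixes(start, ref, alt):
    """Strip bases shared by ref and alt, returning (start, end, ref, alt).

    Assumption: `start` is the 1-based position of the first base of `ref`.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to annotate")

    # Count the leading bases shared by both alleles and drop them.
    prefix = 0
    while prefix < min(len(ref), len(alt)) and ref[prefix] == alt[prefix]:
        prefix += 1
    ref, alt = ref[prefix:], alt[prefix:]
    start += prefix  # the variant now begins `prefix` bases further right

    # Count the trailing bases shared by what is left and drop them.
    suffix = 0
    while suffix < min(len(ref), len(alt)) and ref[-1 - suffix] == alt[-1 - suffix]:
        suffix += 1
    if suffix:
        ref, alt = ref[:-suffix], alt[:-suffix]

    if not ref:
        # Pure insertion: report the two bases flanking the inserted sequence.
        return start - 1, start, "-", alt

    # Deletion or substitution: coordinates span the remaining reference bases.
    return start, start + len(ref) - 1, ref, alt or "-"


# Alleles taken from the table; the start position 100 is made up.
print(trim_common_affixes(100, "GCCG", "GCCCCG"))        # (102, 103, '-', 'CC')
print(trim_common_affixes(100, "GACGCGGGT", "GACGCGGG")) # (108, 108, 'T', '-')
print(trim_common_affixes(100, "CAAAAG", "CAAAAAT"))     # (105, 105, 'G', 'AT')
```

Rows that already use `-` as the tumor allele have nothing to trim and would pass through with coordinates derived from the reference length alone.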