susannasiebert closed 5 years ago
For `HGVS.p` and `HGVS.c`, we can probably use `variant.hgvs_expressions` instead of `variant.aliases`. It's a minor help, but just pointing out that we have that much already.
We should definitely stick to build 37 (GRCh37/hg19) if we need to pick only one.
After going over https://github.com/googlegenomics/gcp-variant-transforms/blob/master/docs/variant_annotation.md and http://snpeff.sourceforge.net/VCFannotationformat_v1.0.pdf, this PR reorders and renames some CSQ fields. The two documents seem to contradict each other in some places (e.g., using `Gene Name` vs `SYMBOL`). In those instances I decided to go with the Variant Annotation/VEP field name, since that is what BigQuery seems to recommend.

`HGVS.c` and `HGVS.p` are another set of predefined fields we could consider adding to our annotations. We could extract them from variant aliases. This would be pretty straightforward, I believe, but we would need to decide how to handle multiple variant aliases that match each type (e.g. for build 37 vs build 38), since we can only pick one.
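To make the "we can only pick one" question concrete, here is a minimal sketch of one possible selection rule. It assumes the expressions arrive as a plain list of strings shaped like `<accession>:c.<change>` or `<accession>:p.<change>`. The function name `pick_hgvs`, the `preferred_prefixes` parameter, and its default ordering are all hypothetical and not part of this PR or of any existing API. Coding and protein expressions do not encode the genome build directly, so this sketch ranks candidates by accession prefix instead and falls back to input order.

```python
import re

# Matches "<accession>:<kind>.<change>" for coding (c) and protein (p)
# expressions only, e.g. "NM_004333.4:c.1799T>A". Genomic (g.) expressions
# are deliberately ignored here.
_HGVS_RE = re.compile(r"^(?P<accession>[^:\s]+):(?P<kind>[cp])\.(?P<change>\S+)$")


def pick_hgvs(expressions, kind, preferred_prefixes=("ENST", "NM_", "NP_")):
    """Return a single expression of the given kind ('c' or 'p'), or None.

    Hypothetical rule: prefer accessions whose prefix appears earliest in
    preferred_prefixes; ties keep the order of the input list.
    """
    candidates = []
    for expr in expressions:
        expr = expr.strip()
        match = _HGVS_RE.match(expr)
        if match and match.group("kind") == kind:
            candidates.append(expr)
    if not candidates:
        return None

    def rank(expr):
        for position, prefix in enumerate(preferred_prefixes):
            if expr.startswith(prefix):
                return position
        return len(preferred_prefixes)

    # min() returns the first lowest-ranked item, so the result is deterministic.
    return min(candidates, key=rank)


# Illustrative input only; these strings stand in for whatever the variant record holds.
example_expressions = [
    "NC_000007.13:g.140453136A>T",
    "NM_004333.4:c.1799T>A",
    "ENST00000288602.6:c.1799T>A",
    "NP_004324.2:p.Val600Glu",
]
hgvs_c = pick_hgvs(example_expressions, "c")
hgvs_p = pick_hgvs(example_expressions, "p")
```

Whatever rule we settle on, keeping it in one small function like this would make it easy to swap the preference order later (for example, if we decide to key on build-specific transcript versions instead of accession prefixes).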