Ronak found that the hotspot_whitelist column does not get updated for the TERT promoter mutations that were added to the hotspots list. The string "Promoter_1295250" is not a valid HGVSp annotation, so it fails the regex r'^p\.\D+(\d+)' in the code.
We would like to change the HGVSp_Short for the two TERT mutations in the hotspots list from:
Promoter_1295250
Promoter_1295228
To:
p.0 rs1561215364
p.0 rs1242535815,CA557858711
p.0 is the annotation suggested by the HGVS for non-coding promoter mutations (https://www.hgvs.org/mutnomen/recs-prot.html - see "changes which affect the promoter of a gene").
In addition to changing this file, the actual fix for the bug would be to remove the aa_pos check from this section of tag_hotspots, and tag all hotspots from the file based on "chr", "position", "ref", and "alt", regardless of whether they have a valid HGVSp annotation.
Change:
    aa_pos = re.match(r'^p\.\D+(\d+)', row['HGVSp_Short'])
    if aa_pos:
        hotspot[key] = aa_pos.group(1)
To this (make the hotspot variable a set instead of a dictionary):
hotspot.add(tuple(key))
The first two lines prevent us from tagging hotspots that don't have a protein-coding annotation (because "p.0" still would not be a match), and they don't seem to serve any other purpose. But if there's some reason we need to validate the HGVSp with a regex please let me know before I make this change.
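For reference, here is a minimal, self-contained sketch of the proposed behavior. It is not the actual tag_hotspots code: the function names, the tab-separated layout, and the example rows (including the ref/alt alleles) are assumptions made for illustration, and only the column names and the regex come from this issue.

```python
import csv
import io
import re

# Regex quoted in this issue: it needs at least one non-digit between "p." and the position.
AA_POS = re.compile(r'^p\.\D+(\d+)')

# Illustrative rows only; they are not copied from the real hotspots file.
EXAMPLE_ROWS = [
    ["chr", "position", "ref", "alt", "HGVSp_Short"],
    ["7", "140453136", "A", "T", "p.V600E"],
    ["5", "1295250", "G", "A", "Promoter_1295250"],
    ["5", "1295228", "G", "A", "p.0"],
]
EXAMPLE_TSV = "\n".join("\t".join(row) for row in EXAMPLE_ROWS) + "\n"


def load_hotspots(handle):
    """Collect a (chr, position, ref, alt) key for every row, ignoring HGVSp_Short."""
    hotspot = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        key = [row["chr"], row["position"], row["ref"], row["alt"]]
        hotspot.add(tuple(key))  # no aa_pos check, so promoter rows are kept
    return hotspot


def is_hotspot(hotspot, chrom, position, ref, alt):
    """Return True if the variant matches a hotspot key exactly."""
    return (str(chrom), str(position), ref, alt) in hotspot


if __name__ == "__main__":
    # Neither the old nor the new TERT annotation passes the regex.
    for annotation in ("p.V600E", "Promoter_1295250", "p.0"):
        print(annotation, bool(AA_POS.match(annotation)))

    hotspot = load_hotspots(io.StringIO(EXAMPLE_TSV))
    print(is_hotspot(hotspot, "5", 1295250, "G", "A"))
```

With a set keyed on the four coordinate fields, membership no longer depends on the protein annotation, so the TERT rows are tagged whether HGVSp_Short reads "Promoter_1295250" or "p.0".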