sbenny1230 closed this issue 2 years ago
Issue #204 mentions the error is caused by splign alignment. This suggests the genebuild mapper I added probably isn't correct.
Hi @sbenny1230. Again, no worries, let me take a look. You are also on a learning curve with the software. This is all stuff that counts in your MSc. I want to encourage more open coding etc., so we will discuss how to write it up. Please see this issue https://github.com/openvar/variantValidator/issues/387 and make sure I get the correct branch; I'll see if I can get this debugged today and get you ready for the next tasks.
So I was looking at the error again and realised it's passing in tx_ac=NM_000088.4 when it should be tx_ac=ENST00000225964.10.
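A quick way to confirm this kind of swap is to compare the accession in the submitted variant with the tx_ac that reaches the mapper. This is only a minimal sketch: `reference_source` and `accession_was_swapped` are hypothetical helper names, not part of VariantValidator, and the prefix rules are a simplifying assumption.

```python
def reference_source(accession):
    """Classify a transcript accession by prefix (simplified assumption)."""
    if accession.startswith("ENST"):
        return "ENS"
    if accession.startswith(("NM_", "NR_")):
        return "RefSeq"
    return "unknown"


def accession_was_swapped(submitted_variant, tx_ac):
    """True if tx_ac differs from the accession in the submitted variant."""
    # The accession is everything before the first colon.
    submitted_ac = submitted_variant.split(":", 1)[0]
    return submitted_ac != tx_ac


submitted = "ENST00000225964.10:c.589-1GG>G"
print(reference_source("NM_000088.4"))
print(accession_was_swapped(submitted, "NM_000088.4"))
print(accession_was_swapped(submitted, "ENST00000225964.10"))
```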
There's a bit of code in vvMixinCore.py which automaps to an equivalent RefSeq transcript. I've commented this bit out (shown below) and I've got a different error now.
# ENST support needs to be re-evaluated, but is very low priority
# ENST not supported by ACMG and is under review by HGVS
if my_variant.refsource == 'ENS':
    trap_ens_in = str(my_variant.hgvs_formatted)
    sim_tx = self.hdp.get_similar_transcripts(my_variant.hgvs_formatted.ac)
    for line in sim_tx:
        if line[2] and line[3] and line[4] and line[5] and line[6]:
            my_variant.hgvs_formatted.ac = line[1]
            my_variant.set_quibble(str(my_variant.hgvs_formatted))
            formatted_variant = my_variant.quibble
            break
    if my_variant.refsource == 'ENS':
        error = 'Unable to map ' + my_variant.hgvs_formatted.ac + \
                ' to an equivalent RefSeq transcript'
        my_variant.warnings.append(error)
        logger.warning(error)
        continue
    else:
        my_variant.warnings.append(str(trap_ens_in) + ' automapped to equivalent RefSeq transcript '
                                   + my_variant.quibble)
        logger.info(str(trap_ens_in) + ' automapped to equivalent RefSeq '
                                       'transcript ' + my_variant.quibble)
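For anyone following along, the selection step in that block can be reproduced on its own. This is a minimal sketch, assuming each row is shaped the way the excerpt indexes it (query accession at index 0, candidate accession at index 1, five equivalence flags at indices 2 to 6). The function name and the example rows are hypothetical, not real transcript data.

```python
def pick_equivalent_transcript(rows):
    """Return the first candidate whose five equivalence flags are all truthy."""
    for row in rows:
        # Mirrors: line[2] and line[3] and line[4] and line[5] and line[6]
        if all(row[2:7]):
            return row[1]
    return None


# Hypothetical rows, made up for illustration only.
rows = [
    ("ENST_EXAMPLE.1", "NM_EXAMPLE_A.1", True, True, False, True, True),
    ("ENST_EXAMPLE.1", "NM_EXAMPLE_B.2", True, True, True, True, True),
]

print(pick_equivalent_transcript(rows))
```

With the block commented out this swap never happens, so the ENST accession is passed downstream unchanged.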
Error for the same variant as before, ENST00000225964.10:c.589-1GG>G:
{
    "flag": "warning",
    "metadata": {
        "variantvalidator_hgvs_version": "2.0.2.dev1+g6ecbf8e",
        "variantvalidator_version": "1.0.5.dev273+g7d58e7e.d20220617",
        "vvdb_version": "vvdb_2022_04",
        "vvseqrepo_db": "VV_SR_2022_02/master",
        "vvta_version": "vvta_2022_02"
    },
    "validation_warning_1": {
        "alt_genomic_loci": [],
        "annotations": {},
        "gene_ids": {},
        "gene_symbol": "",
        "genome_context_intronic_sequence": "",
        "hgvs_lrg_transcript_variant": "",
        "hgvs_lrg_variant": "",
        "hgvs_predicted_protein_consequence": {
            "lrg_slr": "",
            "lrg_tlr": "",
            "slr": "",
            "tlr": ""
        },
        "hgvs_refseqgene_variant": "",
        "hgvs_transcript_variant": "",
        "primary_assembly_loci": {},
        "reference_sequence_records": "",
        "refseqgene_context_intronic_sequence": "",
        "selected_assembly": "GRCh37",
        "submitted_variant": "ENST00000225964.10:c.589-1GG>G",
        "transcript_description": "",
        "validation_warnings": [
            "ENST00000225964.10:c.589-1GG>G automapped to ENST00000225964.10:c.589-1_589delGGinsG",
            "Removing redundant reference bases from variant description",
            "Required information for ENST00000225964.10 is missing from the Universal Transcript Archive",
            "Query gene2transcripts with search term ENST00000225964 for available transcripts"
        ],
        "variant_exonic_positions": null
    }
}
Nice work. I can look at this today.
Give me a few hours
Referencing here. I will clone https://github.com/openvar/variantValidator/tree/vv_ensembl_develop and branch from it.
OK, after commit https://github.com/openvar/variantValidator/commit/38a1932dad1c865ccc22c68afbe0b1a99780ff03 I'm seeing the following output:
{
    "flag": "warning",
    "metadata": {
        "variantvalidator_hgvs_version": "2.0.2.dev1+g6ecbf8e",
        "variantvalidator_version": "2.1.1.dev2+g294fd63",
        "vvdb_version": "vvdb_2022_04",
        "vvseqrepo_db": "VV_SR_2022_02/master",
        "vvta_version": "vvta_2022_02"
    },
    "validation_warning_1": {
        "alt_genomic_loci": [],
        "annotations": {},
        "gene_ids": {},
        "gene_symbol": "",
        "genome_context_intronic_sequence": "",
        "hgvs_lrg_transcript_variant": "",
        "hgvs_lrg_variant": "",
        "hgvs_predicted_protein_consequence": {
            "lrg_slr": "",
            "lrg_tlr": "",
            "slr": "",
            "tlr": ""
        },
        "hgvs_refseqgene_variant": "",
        "hgvs_transcript_variant": "",
        "primary_assembly_loci": {},
        "reference_sequence_records": "",
        "refseqgene_context_intronic_sequence": "",
        "selected_assembly": "GRCh37",
        "submitted_variant": "ENST00000225964.10:c.589-1GG>G",
        "transcript_description": "",
        "validation_warnings": [
            "ENST00000225964.10:c.589-1GG>G automapped to ENST00000225964.10:c.589-1_589delGGinsG",
            "Removing redundant reference bases from variant description",
            "Required information for ENST00000225964.10 is missing from the Universal Transcript Archive",
            "Query gene2transcripts with search term ENST00000225964 for available transcripts"
        ],
        "variant_exonic_positions": null
    }
}
Do you get this? I did notice that the Ensembl REST API may be down, so outputs might change!!!! SIGH!!!!
If you get this, this transcript seems to be missing from VVTA. I will check now
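To check quickly whether a run hits the same problem, the JSON can be scanned for that warning. This is a minimal sketch that works on the output structure pasted in this thread (trimmed here to the relevant keys). `transcripts_missing_from_uta` is a hypothetical helper name, not part of VariantValidator.

```python
import json

UTA_MISSING = "missing from the Universal Transcript Archive"


def transcripts_missing_from_uta(result):
    """List submitted variants whose warnings report missing transcript data."""
    missing = []
    for key, record in result.items():
        # "flag" and "metadata" are not per-variant records.
        if key in ("flag", "metadata") or not isinstance(record, dict):
            continue
        warnings = record.get("validation_warnings", [])
        if any(UTA_MISSING in warning for warning in warnings):
            missing.append(record.get("submitted_variant", key))
    return missing


# Trimmed copy of the output shown in this thread.
output = json.loads("""
{
    "flag": "warning",
    "metadata": {"vvta_version": "vvta_2022_02"},
    "validation_warning_1": {
        "submitted_variant": "ENST00000225964.10:c.589-1GG>G",
        "validation_warnings": [
            "Removing redundant reference bases from variant description",
            "Required information for ENST00000225964.10 is missing from the Universal Transcript Archive"
        ]
    }
}
""")

print(transcripts_missing_from_uta(output))
```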
Fixed on branch vv_ensembl_develop
Describe the bug
Error returned with TranscriptMapper when an Ensembl transcript is selected.
To Reproduce
Output returned