Closed eudesbarbosa closed 2 years ago
Genomic variant GRCh37:4:6349605:C:T. Transcript NM_020416.4 is missing in VarFish. Variant Validator returns nonsensical output.
We do two link-outs to Ensembl, one for the variant and one for the gene. As it was not specified which link-out was meant, I checked both:
Linkout for the variant
This links correctly to the GRCh37 genome.
Linkout for gene in fold-out
The Ensembl ID in the gene card wrongly links to GRCh38.
The link-out to the Ensembl gene page needs to be changed.
Affected Components
Affected Modules/Files
Required Architectural Changes
None.
Resolution Sketch
Change www.ensembl.org to grch37.ensembl.org.
@eudesbarbosa I fixed the broken link-out (at least to my understanding). I couldn't follow the issue with VariantValidator. This sounds like a separate issue. In that case, could you please open a new bug report for it?
@stolpeo, I scanned the merge request. Just to clarify: does the link used depend on the genome build info? I'd assume that we already have samples processed with GRCh38, and that could lead to problems for those.
Regarding the VariantValidator, I will write a different issue.
@eudesbarbosa Yes, the link is based on the genome build. Right now, the GRCh38 development branch is not yet merged into the main branch. I would prefer to have the link switch in the GRCh38 branch rather than integrating GRCh38-related changes before the actual feature is introduced.
Thank you for the clarification.
Regarding VariantValidator, I could not replicate the error. It seems the transcripts displayed in the query are the same as the ones displayed in the 'Transcript information' table. Also, the warning displayed can be ignored. According to their website: "RefSeqGene record not available: Only some genes have an NCBI RefSeqGene record. This warning simply indicates that no RefSeqGene record exists for the gene in which the variant that is being validated resides."
I changed my mind and included the switch anyway after reviewing @holtgrewe's MR #220 and not finding a change in the corresponding lines. So this commit should make the Ensembl link-out future-proof.
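For readers following along, the genome-build switch discussed above can be sketched roughly as below. This is a minimal illustration, not the code from the actual commit: the names `ENSEMBL_HOSTS` and `ensembl_gene_url` are hypothetical, and the URL path is taken from the example links in this issue (using `https` rather than `http`).

```python
# Hypothetical sketch; not VarFish's actual implementation.
# Maps a genome build to the Ensembl host that serves that assembly.
ENSEMBL_HOSTS = {
    "GRCh37": "grch37.ensembl.org",
    "GRCh38": "www.ensembl.org",
}


def ensembl_gene_url(ensembl_gene_id: str, genome_build: str = "GRCh37") -> str:
    """Build an Ensembl gene summary link for the given genome build."""
    try:
        host = ENSEMBL_HOSTS[genome_build]
    except KeyError:
        # Fail loudly instead of silently linking to the wrong assembly.
        raise ValueError(f"Unsupported genome build: {genome_build!r}") from None
    return f"https://{host}/Homo_sapiens/Gene/Summary?g={ensembl_gene_id}"


print(ensembl_gene_url("ENSG00000074211", "GRCh37"))
print(ensembl_gene_url("ENSG00000074211", "GRCh38"))
```

Keeping the mapping in one place means a later GRCh38 merge only needs to pass the case's genome build through, rather than touching every link-out.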
Issue
Internally, VarFish displays gene/transcript information associated with GRCh37, but the Ensembl link-out shows results for GRCh38. This mix leads to confusion.
Example
Gene PPP2R2C (chr4:6,349,605)
The link provided by VarFish refers to GRCh38:
http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000074211;r=4:6320578-6563600
...but the information displayed internally is for GRCh37:
http://grch37.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000074211;r=4:6322305-6565327
Expected behaviour
Consistent information displayed internally and queried from external sources, i.e., the same genome version.
Additional info
The issue also impacts VariantValidator:
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg19/chr4-6349605-C-T/all?content-type=application%2Fjson
https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/chr4-6347878-C-T/all?content-type=application%2Fjson
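The two VariantValidator URLs above differ in both the build name and the position, so the build and the coordinates have to be chosen together. A small sketch of how such a URL could be assembled is shown below. The helper name and the build mapping are assumptions for illustration; only the URL pattern itself is copied from the links in this issue.

```python
# Hypothetical sketch; the URL pattern is copied from the links in this issue.
VV_BASE = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"

# Assumed mapping from GRC build names to the names used in the URLs above.
VV_BUILD_NAMES = {"GRCh37": "hg19", "GRCh38": "hg38"}


def variantvalidator_url(genome_build: str, chrom: str, pos: int, ref: str, alt: str) -> str:
    """Build a VariantValidator REST URL; pos must be given for genome_build."""
    build = VV_BUILD_NAMES[genome_build]  # raises KeyError for unknown builds
    variant = f"chr{chrom}-{pos}-{ref}-{alt}"
    return f"{VV_BASE}/{build}/{variant}/all?content-type=application%2Fjson"


# The same variant, with coordinates given for each build as in this issue.
print(variantvalidator_url("GRCh37", "4", 6349605, "C", "T"))
print(variantvalidator_url("GRCh38", "4", 6347878, "C", "T"))
```

Note that the sketch does no coordinate conversion: the caller must supply the position that matches the chosen build.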