Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

ncRNA genes and the clinical filter #4609

Open ielvers opened 6 months ago

ielvers commented 6 months ago

Hi,

Is there a way, with the clinical filter, to include variants in RNA genes without also getting all non-coding variants in regular protein-coding genes (or intronic variants in protein-coding genes that are also ncRNA variants in another gene that is not in the database)?

Case 22378 has two variants in RNU7-1. The gene fits the phenotype really well. One was kept with the clinical filter (for being reported as pathogenic in ClinVar), but the other variant was not included. If neither had been included, we would maybe not have found them.

Thank you :)

dnil commented 6 months ago

Mm, this is certainly a question that has got a bit of attention lately with e.g. RNU4-2 (https://www.medrxiv.org/content/10.1101/2024.04.07.24305438v1). I've discussed it with pipeline devs, but was primarily concerned with (at some point) having them score high enough to be loaded at all, then filtering for them separately, or scoring them high enough to end up much higher than uninteresting ncRNA variants. From what I've seen so far, rarity and evolutionary conservation do tend to bump these kinds of ncRNA variants up a decent bit in score already. I take it that was what you saw as well? I have toyed with adding ncRNA_exonic and non_coding_transcript_exon_variant to the Clinical filter. It does, however, add a few variants to the typical large panel (say 20%), and analyst time is a bit constrained right now.

Very nice that your variant was in ClinVar, and that the other was loaded. My question back to you might be whether you would catch it even if it was shown on the filtered list, but annotated only as non_coding_transcript_exon_variant? I know I regularly skimmed over those if there was no specific question about RNA genes, or say an extended family available to narrow the variant list.

While waiting for a better scoring algorithm, we could perhaps introduce a panel level property for ncRNA genes, and if such a gene is included on the panel its variants are always loaded and shown?
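A rough sketch of what such a panel-level property could mean for filtering, purely as an illustration. The `PanelGene.is_ncrna` flag, the `Variant` fields, and `show_variant` are hypothetical names for this example and do not correspond to Scout's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PanelGene:
    symbol: str
    is_ncrna: bool = False  # hypothetical panel-level flag for ncRNA genes


@dataclass
class Variant:
    gene_symbol: str
    consequence: str
    passes_clinical_filter: bool = False  # result of the regular clinical filter


def show_variant(variant: Variant, panel: Dict[str, PanelGene]) -> bool:
    """Show a variant if it passes the regular filter, or if it lies in a
    panel gene that has been flagged as an ncRNA gene."""
    if variant.passes_clinical_filter:
        return True
    gene = panel.get(variant.gene_symbol)
    return gene is not None and gene.is_ncrna
```

With a panel where only RNU7-1 carries the flag, a non_coding_transcript_exon_variant in RNU7-1 would be shown, while the same consequence in an unflagged protein-coding gene would still depend on the regular filter.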

It would also be cool to make a separate ncRNA variantS view, but perhaps we would require some more annotation and predictors tailored to such variants to warrant that. At least it would be very clear that there is one more thing to look at with a different set of binoculars.

northwestwitch commented 6 months ago

Could it be that the other variant is not returned by the clinical filter because it has no_assertion_criteria_provided as Revstat?

Looks like our clinical filter returns only variants with a Revstat in this list: mult|multiple_submitters|single|single_submitter|exp|reviewed_by_expert_panel|guideline|practice_guideline
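A minimal sketch of how such a review-status check could behave, assuming the pipe-separated list above is applied as a regular-expression alternation against the ClinVar review status string. The names `TRUSTED_REVSTAT` and `has_trusted_revstat` are hypothetical; the actual query in Scout may be built differently.

```python
import re

# Alternation copied from the list above; the variable name is hypothetical.
TRUSTED_REVSTAT = re.compile(
    r"mult|multiple_submitters|single|single_submitter|exp|"
    r"reviewed_by_expert_panel|guideline|practice_guideline"
)


def has_trusted_revstat(revstat: str) -> bool:
    """Return True if the review status contains any of the trusted terms."""
    return TRUSTED_REVSTAT.search(revstat) is not None
```

Under this assumption, a variant whose review status is no_assertion_criteria_provided matches none of the terms and would be dropped, which would be consistent with the behaviour described for this case.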

dnil commented 6 months ago

Hi @northwestwitch! Interesting, we might want to revisit that as well in another issue. At the time, those were typically rather noisy. But it is not the question here: not all exonic but intrinsically non-coding variants of interest in ncRNA will be in ClinVar. The clinical filter does currently filter out clearly pathogenic variants that have non_coding_transcript_exon_variant as their highest functional annotation.

northwestwitch commented 6 months ago

I am a bit confused, sorry, this variant is returned by the filter: https://scout.scilifelab.se/cust003/22378/4222ff0803c24c7cafb4eaa2d3476c4b

This other variant is not: https://scout.scilifelab.se/cust003/22378/74eb0bd2c558ea1f0bbace8fec2afacb

They look the same in terms of consequences? 🤔

dnil commented 6 months ago

Hihi, I didn't look at the case, only read the description. But, we can still both be right! 😉 Two different questions here! 1) this case 2) the general case of causative ncRNA variants

For this particular case, it would, quite as you say, be enough to relax the criteria for ClinVar clinical significance. It's unfortunate that there are no more observations in ClinVar for it, with criteria - it seems to derive from a single (high quality) publication. I tentatively agree we should just do this. It would add a little bit of noise to the clinical filter, and we do already have quite a few ClinVar hits on a given case, but not as much as opening for general ncRNA. And ppl are already used to checking through e.g. variants with conflicting reports of pathogenicity.

Not all causative ncRNA variants will be in ClinVar, so the question of scoring and filtering ncRNA generally still remains open, even if we will probably for quite some time have to require them to have some kind of functional evidence to report them as causative.

ielvers commented 6 months ago

Hi! In general, if the filtered list contained a lot more ncRNA variants, we would mostly ignore them, so that would defeat the purpose. If the filtered list only (mostly) had ncRNA variants from RNA genes in our panels, we would absolutely look at them. A "panel level property for ncRNA genes, and if such a gene is included on the panel its variants are always loaded and shown" would be awesome.

I agree with @northwestwitch, I was also surprised that the other variant was filtered out despite being reported as pathogenic!

northwestwitch commented 6 months ago

We could relax the trusted revstat level used for filtering clinical variants - perhaps just accepting all variants reported as pathogenic or likely pathogenic?
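As an illustration only, the relaxed rule could look something like the sketch below: any pathogenic or likely pathogenic assertion is accepted, and review status is ignored entirely. `PATHOGENIC_TERMS` and `passes_relaxed_clinvar` are hypothetical names, and the normalization of significance strings is an assumption about how the terms are spelled.

```python
from typing import Iterable

# Hypothetical set of accepted clinical significance terms.
PATHOGENIC_TERMS = {"pathogenic", "likely_pathogenic"}


def passes_relaxed_clinvar(clinsig_terms: Iterable[str]) -> bool:
    """Accept a variant with at least one (likely) pathogenic assertion,
    regardless of its review status."""
    normalized = set()
    for term in clinsig_terms:
        # Combined terms such as "Pathogenic/Likely_pathogenic" are split.
        for part in term.split("/"):
            normalized.add(part.strip().lower().replace(" ", "_"))
    return not PATHOGENIC_TERMS.isdisjoint(normalized)
```

Terms such as conflicting interpretations would not match here, so they would still need whatever handling the filter gives them today.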

One thing I would do is report both these variants to ClinVar, so that revstat becomes "multiple submitters" or something searchable by us.

ielvers commented 6 months ago

I will absolutely report these variants to ClinVar, but my concern is all other variants in RNA genes that may not be in ClinVar at all :)

dnil commented 6 months ago

Yes, got you there as well. We'll discuss a bit more again, but we should both have some solution to enable showing more ncRNA hits in general - for genes where this is relevant - and try a bit harder to distinguish good ClinVar info from noise.