Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

dismiss variant text mismatch in the report #1838

Closed 4WGH closed 4 years ago

4WGH commented 4 years ago

I've noticed this text in the report: [image]

It appears because "unstudied variant type" was chosen to dismiss the variant.

The two texts are quite different. Why? In particular, what is in the report is not necessarily the reason a variant is unstudied.

I find it quite dangerous that the text we choose at the variant level does not match what is written in the report.

northwestwitch commented 4 years ago

Hi @4WGH, what you see in the report is only the extended explanation of what you have selected in the dismiss option on the variant page.

We reasoned that the extended explanation would be too long to include in the variant dropdown. On the other hand, it's more informative to have an exhaustive description of the dismissal reason in the general report, especially if you have to present the report to people who don't use scout. But I can assure you that the two descriptions (short and long) are just 2 ways of describing the same thing. This is the list of dismiss options, for reference: https://github.com/Clinical-Genomics/scout/blob/80c127e27887539156c83c84ac640b0c67a50070/scout/constants/variant_tags.py#L139
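The short/long pairing described above can be pictured with a minimal sketch. The dictionary name, key and helper functions below are hypothetical and chosen only for illustration; the actual definitions are in the linked `variant_tags.py` and may be structured differently. The label and description strings are the ones quoted in this thread.

```python
# Hypothetical sketch: names and structure are illustrative assumptions,
# not scout's actual code. Only the two text strings come from this thread.
DISMISS_OPTIONS_SKETCH = {
    "unstudied_variation_type": {
        # Short text shown in the dismiss dropdown on the variant page.
        "label": "Unstudied variation type",
        # Extended text shown in the general report.
        "description": (
            "In a gene where mainly other types of variation "
            "(e.g. repeat expansion) are established as pathologic."
        ),
    },
}


def dropdown_text(option_key: str) -> str:
    """Return the short label for the variant page dropdown."""
    return DISMISS_OPTIONS_SKETCH[option_key]["label"]


def report_text(option_key: str) -> str:
    """Return the extended description for the report."""
    return DISMISS_OPTIONS_SKETCH[option_key]["description"]
```

Under this assumption both texts hang off the same option key, so they cannot drift apart per variant; only their wording can differ.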

northwestwitch commented 4 years ago

Of course the descriptions could be improved. So feel free to suggest edits!

dnil commented 4 years ago

Precisely, on both the history and malleability! This particular option has been around since 2018 (#953?), when someone coined it, but it is quite possible to suggest clarifications. Or do you perhaps have a need for an additional option?

4WGH commented 4 years ago

We use the one in the example for deep intronic variants, so the text in the report does not match.

dnil commented 4 years ago

I typically use “splicing unaffected” plus possibly “no protein function” and/or “not in disease transcript” for that. Still, if intronic variation has not been shown to be an important variation type for this gene and disorder, I would say your call is also good, and the extended description is quite to the point?

4WGH commented 4 years ago

For me, "splicing unaffected" applies only if proven with mRNA studies.

4WGH commented 4 years ago

It remains that "unstudied variation type" and the text in the report are not so obviously connected.

dnil commented 4 years ago

Ok. Could you please clarify your reasoning and/or provide an alternative so that it is a bit easier to discuss their relative merits? I'm sure you have a point somewhere, otherwise you would not be writing, but it just doesn't come across, friend!

To me "Unstudied variation type" - "In a gene where mainly other types of variation (e.g. repeat expansion) are established as pathologic." is perfectly clear. It would be most applied for missense in repeat expansion genes, missense in LoF genes or conversely LoF in genes with GOF missense. I don't think you are out of order using it for intronic variants if there is no evidence yet of regulatory and/or splice variation causing disease.

If there is nothing else, I'll close this for now: it's a fun bar/coffee conversation topic for another time. But do return to it if you have suggestions for changes and/or additions - they are quite welcome!

4WGH commented 4 years ago

SUGGESTION: "deeply intronic variant" as a dismissing reason

dnil commented 4 years ago

😄 Cheers!

The problem with that is that there are perfectly causative deep intronic variants described for some genes. Of course, we would not dismiss them then, but it also doesn't quite capture the reason why the variant is dismissed. It only describes what can automatically be determined from the coordinates (which e.g. no RefSeq transcript could also be accused of). I would first expand it to its longer description: "deeply intronic variant" - "Deeply intronic variants have not been established as causative for this gene.". Notice however that this is a special case of the above.

ielvers commented 4 years ago

I run into this too and I agree with Michela that either "Unstudied variation type" should have a much more general description, or we need another reason for dismissing variants. I use "Unstudied variation type" for variants that fall outside of our criteria for basic analyses, and the description doesn't match that.

ielvers commented 4 years ago

For example this variant: https://scout.scilifelab.se/cust003/20009/b9e0015568f181815882add0a606d2bc It is in the disease transcript, but not exonic in that transcript. So I want to dismiss it. I could run Alamut and hypothesize about effects on transcription, but that is not within our baseline analysis. It is a type of variation that we don't study, but I can't say that it's mostly other types of variation that are established as pathogenic for this gene.

dnil commented 4 years ago

Thanks for weighing in @ielvers! Could you specify a bit more in detail what property of the variant you wish to highlight as the reason for dismissing it? Is it also deeply intronic variants, or something else entirely? Try to be as concrete as possible with examples.

Is the issue you have perhaps the single example with repeat expansions and missense? We could remove that part, though I think it kind of makes it easier to follow.

It is very easy to introduce an option - though hell to remove or change one, since they will exist on old cases. If the list becomes too long it will become difficult to use what is intended as a quick feature. You can always type free text in the comment if things are a bit more complicated.

I suggest we run any by the Squirrel meeting next time I'm there and if we can agree there, let's add away, ok?

dnil commented 4 years ago

> For example this variant: https://scout.scilifelab.se/cust003/20009/b9e0015568f181815882add0a606d2bc It is in the disease transcript, but not exonic in that transcript. So I want to dismiss it. I could run Alamut and hypothesize about effects on transcription, but that is not within our baseline analysis. It is a type of variation that we don't study, but I can't say that it's mostly other types of variation that are established as pathogenic for this gene.

Right, that's what the "not in RefSeq transcript" option was intended for. This variant is only on the Ensembl gene model, not the RefSeq one. It's possible that this will give a protein effect in some tissue somewhere, but clinically we just don't know.

To be precise, this situation is different from a "true" deep intronic variant, since Ensembl has some level of evidence that this is protein coding, likely in the form of cDNA from some tissue, but not from multiple tissues, or just from some individuals/conditions.

ielvers commented 4 years ago

Bad example then :) I'll try and find a better one. I only use "not in RefSeq transcript" when there is no annotation at all in the RefSeq transcript, but I might have to change that policy.

4WGH commented 4 years ago

c.*346T>A

This type of variant can be dismissed with "unstudied variation type".

I think "unstudied variation type" can be used for several types of variants that do not fit the other dismissal reasons,
where "In a gene where mainly other types of variation (e.g. repeat expansion) are established as pathologic" is one of several possibilities :)

The misunderstanding, I think, lies in the fact that the interpretation of what "unstudied variation type" means differs between users.

I would still like to be able to choose "unstudied variation type" :) and maybe we could use it in the report too?

dnil commented 4 years ago

> c.*346T>A
>
> This type of variant can be dismissed with "unstudied variation type".

Yes, it can certainly be! For some genes, this will be a 3'UTR variant. These can be known or potentially causative, or not yet studied/described for that gene. For some genes it would be a "downstream" or "intergenic" variant. The same can certainly apply there, though the number of genes with such causative variants is then considerably smaller. This still fits the current definition.

> I think "unstudied variation type" can be used for several types of variants that do not fit the other dismissal reasons, where "In a gene where mainly other types of variation (e.g. repeat expansion) are established as pathologic" is one of several possibilities :)
>
> The misunderstanding, I think, lies in the fact that the interpretation of what "unstudied variation type" means differs between users.

Good explanation - I'm with you so far! 😅 Could you then just provide an example that does not fit the current definition, and we can extend it?

Perhaps this is not about the actual definition, just a worry that a situation may arise? Would it help if the label (e.g. "unstudied variation type") or a direct rendering of it is also shown on the report? I quite agree with @northwestwitch that we should not show the label without the description, since the short label is not so precise, but I suppose we could e.g. include the label text in the description.
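One way to picture the last suggestion, including the label text in the description shown on the report, is the small sketch below. The function name and the "label: description" layout are assumptions made for illustration, not something scout is known to implement.

```python
def report_entry(label: str, description: str) -> str:
    """Combine the short label and the extended description in one report line.

    Hypothetical helper: the "label: description" layout is an assumption.
    """
    return f"{label}: {description}"


entry = report_entry(
    "Unstudied variation type",
    "In a gene where mainly other types of variation "
    "(e.g. repeat expansion) are established as pathologic.",
)
print(entry)
```

With this layout, a reader of the report sees the same words the analyst picked on the variant page, followed by the longer explanation.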

4WGH commented 4 years ago

c.-387C>T, one more example

upstream this time :)

Trying to collect all possible examples :)

Then we can see how to proceed: either change the explanation in the report to include all possibilities, or make new dismiss reasons :)
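For collecting such examples, it may help to sort them by where they sit relative to the coding sequence. In HGVS coding notation, `c.*N` counts nucleotides 3' of the stop codon, `c.-N` counts nucleotides 5' of the start codon, and a `+`/`-` offset after a position marks an intronic location. The sketch below is a rough, hypothetical helper (not part of scout) that reads only the first position of a `c.` description. It cannot tell a UTR variant from an upstream/downstream one, since that depends on the transcript.

```python
import re

# Matches the first position of an HGVS "c." description, e.g. "c.*346",
# "c.-387" or "c.123+45". Deliberately simplified; not a full HGVS parser.
_HGVS_C_POSITION = re.compile(
    r"^c\.(?P<prefix>[*-]?)(?P<base>\d+)(?P<offset>[+-]\d+)?"
)


def coding_region_side(hgvs_c: str) -> str:
    """Roughly classify where a c. position lies relative to the coding sequence."""
    match = _HGVS_C_POSITION.match(hgvs_c)
    if match is None:
        raise ValueError(f"Not a recognised c. description: {hgvs_c!r}")
    if match["offset"]:
        # A +/- offset after the position denotes an intronic location.
        return "intronic"
    if match["prefix"] == "*":
        return "3' of the stop codon (3'UTR or downstream, depending on the transcript)"
    if match["prefix"] == "-":
        return "5' of the start codon (5'UTR or upstream, depending on the transcript)"
    return "coding sequence"


for example in ("c.*346T>A", "c.-387C>T"):
    print(example, "->", coding_region_side(example))
```

Whether any of these categories deserves its own dismiss reason is still the open question in this thread; the helper only groups the examples.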