sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

Handling general mutations #37

Closed MrsLaviniaG closed 6 years ago

MrsLaviniaG commented 6 years ago

Dear Sigve

I wondered how PCGR deals with the more generic entries in CIViC. For example, https://civicdb.org/events/genes/1/summary/variants/512/summary#variant does not have a specific chromosomal entry. Is it simply applied to any variant in a VCF that is annotated to ALK?

Thank you for all the effort that you are putting into this awesome tool.

sigven commented 6 years ago

Dear MrsLavinia:-)

Thanks for your question, which is very relevant. This is something most users probably are not aware of, but I have made an attempt to document how this is done through the PCGR documentation website:

http://pcgr.readthedocs.io/en/latest/annotation_resources.html#notes-on-variant-annotation-datasets

Basically, PCGR does not report clinical evidence items that relate a "gene mutation" to prognosis/diagnosis/drug sensitivity. In that sense, PCGR is conservative, only listing variants that have been reported at the codon, amino acid, or exact variant level. I have enabled flexibility in the code to also include biomarkers that have been mapped (e.g. by CIViC) at the gene level, but this is something I believe will add a lot of noise. E.g. if your query tumor carries 100 ALK mutations, then all of them will in principle match the biomarker(s) for "ALK mutation". And there is a good chance that most of them are benign, not acting in any way as an "ALK mutation" biomarker. See my point? If these general biomarkers are going to be supported, I believe one has to incorporate other mechanisms to highlight/prioritise the query variants that are most likely acting as an "ALK mutation" biomarker.
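
To make the trade-off concrete, here is a minimal Python sketch of matching query variants against biomarkers at different resolutions. It is not PCGR's actual code: the `Biomarker` and `QueryVariant` classes, the `resolution` labels, the `include_gene_level` flag, and the example variants are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Biomarker:
    """Hypothetical biomarker record; field names are assumptions."""
    gene: str
    resolution: str  # "exact", "codon", or "gene"
    codon: Optional[int] = None
    protein_change: Optional[str] = None


@dataclass(frozen=True)
class QueryVariant:
    """Hypothetical annotated variant from a query VCF."""
    gene: str
    codon: int
    protein_change: str


def matches(variant: QueryVariant, biomarker: Biomarker,
            include_gene_level: bool = False) -> bool:
    """Return True if the variant matches the biomarker at its resolution."""
    if variant.gene != biomarker.gene:
        return False
    if biomarker.resolution == "exact":
        return variant.protein_change == biomarker.protein_change
    if biomarker.resolution == "codon":
        return variant.codon == biomarker.codon
    if biomarker.resolution == "gene":
        # Gene-level biomarkers ("ALK mutation") match any variant in the
        # gene, so they are only considered when explicitly requested.
        return include_gene_level
    return False


def matching_variants(variants: List[QueryVariant],
                      biomarkers: List[Biomarker],
                      include_gene_level: bool = False) -> List[QueryVariant]:
    """Keep the variants that match at least one biomarker."""
    return [
        v for v in variants
        if any(matches(v, b, include_gene_level) for b in biomarkers)
    ]


# Illustrative data only, not taken from CIViC.
biomarkers = [
    Biomarker("ALK", "exact", protein_change="p.F1174L"),
    Biomarker("ALK", "codon", codon=1174),
    Biomarker("ALK", "gene"),
]
variants = [
    QueryVariant("ALK", 1174, "p.F1174L"),
    QueryVariant("ALK", 100, "p.A100T"),
]

conservative = matching_variants(variants, biomarkers)
permissive = matching_variants(variants, biomarkers, include_gene_level=True)
```

With the flag left off, only the variant at codon 1174 is reported. With gene-level matching switched on, every ALK variant in the query matches, which is the noise problem described above.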

I have also tried to outline how the variants are classified into tiers here: http://pcgr.readthedocs.io/en/latest/tier_systems.html

Looking at this now, I noted that I have not properly documented how PCGR considers the level of an evidence item as strong or weak. Essentially, PCGR adopts the evidence levels assigned by CIViC, and classifies A/B as strong, and C/D/E as weak. This is an attempt to adhere somewhat to the ACMG recommendations.
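
The strong/weak split described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment; the function name `evidence_strength` and the error handling are assumptions, not PCGR's implementation.

```python
# CIViC evidence levels grouped as described above (A/B strong, C/D/E weak).
STRONG_LEVELS = {"A", "B"}
WEAK_LEVELS = {"C", "D", "E"}


def evidence_strength(civic_level: str) -> str:
    """Map a CIViC evidence level to 'strong' or 'weak' (hypothetical helper)."""
    level = civic_level.strip().upper()
    if level in STRONG_LEVELS:
        return "strong"
    if level in WEAK_LEVELS:
        return "weak"
    # Unknown levels are rejected rather than silently classified.
    raise ValueError(f"Unrecognised CIViC evidence level: {civic_level!r}")
```

For example, `evidence_strength("B")` yields `"strong"` and `evidence_strength("d")` yields `"weak"`.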

I'd be happy to have your comments.

regards, Sigve

MrsLaviniaG commented 6 years ago

Dear Sigve,

Thanks so much for the detailed reply, it is greatly appreciated. I am currently looking at specific annotations and if I discover anything useful, I will post it here. As a very minor point I noticed that the links to the TSGene 2.0 database are incorrect (e.g. on this page https://pcgr.readthedocs.io/en/latest/output.html?highlight=tsgene), they should point to https://bioinfo.uth.edu/TSGene/. Thanks again for all of your hard work on this very valuable piece of software.

with regards,

Lavinia.

sigven commented 6 years ago

Dear Lavinia, Thanks for notifying me! At one point I thought I had fixed this, but clearly not in all places :-)

regards, Sigve