sunsong95 closed 3 years ago
Hi @sunsong95,
Very good observation! The mapping_rank variable is created during consolidation of the CIViC data. It is essentially a numeric indicator of how precisely a given biomarker is reported in the literature (and accordingly by CIViC). I believe this is frequently overlooked, but it is an important aspect of variant interpretation, and of how a given variant in a given tumor can be mapped to biomarkers. The levels are as follows: a mapping_rank of 1 means that the biomarker (variant) was mapped exactly to the genome (with ref and alt alleles, an example being BRAF V600E), a mapping_rank of 2 means that the biomarker was mapped to a codon (e.g. BRAF V600), a mapping_rank of 3 is for the exon (e.g. EGFR exon 19), 4 is at the gene level (mutations), and 5 is also at the gene level (non-mutations, i.e. expression biomarkers etc.).
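To make the ranking concrete, here is a minimal Python sketch of the five levels and of how one might keep only the most precise matches for a variant. This is an illustration, not PCGR's actual code: the names MappingRank and most_precise, and the "biomarker" field in the example records, are hypothetical. Only the mapping_rank values 1 to 5 and their meanings come from the description above.

```python
from enum import IntEnum


class MappingRank(IntEnum):
    """Precision levels as described above (lower value = more precise)."""
    EXACT = 1          # exact genomic variant with ref/alt alleles, e.g. BRAF V600E
    CODON = 2          # codon level, e.g. BRAF V600
    EXON = 3           # exon level, e.g. EGFR exon 19
    GENE_MUTATION = 4  # gene level, mutations
    GENE_OTHER = 5     # gene level, non-mutations (expression biomarkers etc.)


def most_precise(records):
    """Return the records that share the lowest (most precise) mapping_rank.

    `records` is a list of dicts with a 'mapping_rank' key; values may be
    strings, as they would be when read from a TSV file.
    """
    if not records:
        return []
    best = min(int(r["mapping_rank"]) for r in records)
    return [r for r in records if int(r["mapping_rank"]) == best]


# Hypothetical biomarker hits for a tumor carrying BRAF V600E.
hits = [
    {"biomarker": "BRAF mutation", "mapping_rank": "4"},
    {"biomarker": "BRAF V600", "mapping_rank": "2"},
    {"biomarker": "BRAF V600E", "mapping_rank": "1"},
]

top = most_precise(hits)
print([r["biomarker"] for r in top], MappingRank(int(top[0]["mapping_rank"])).name)
```

Whether one would actually discard the less precise hits or merely rank them lower is a design choice; the sketch only shows that the numeric value gives a natural ordering.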
Yes, the alteration_type is modified internally in PCGR, mainly for convenience and slight simplification, but the CIViC data is still what is used to set this variable.
Hopefully this clarifies things somewhat.
kind regards, Sigve
Hi @sigven
Thank you for your reply.
I would also like to know whether I can update the annotation resource files (e.g. CIViC or cancerhotspots) myself, so that the clinical annotation notes are always up-to-date. If I only update the annotation file in the corresponding folder (e.g. ~/grch37/civic/), will I still get a correct result/report?
For example, I noticed that the CIViC data package (https://civicdb.org/releases) is updated almost every month. It would be great if I could update the resource pack whenever a new release comes out.
Thanks!
Hi @sunsong95,
I am afraid that will not work just yet, although I certainly realize that it would be the optimal situation for ensuring that all databases are up-to-date. PCGR relies on a fairly large number of resources, and I have established multiple update scripts for these, but they are not yet streamlined so that users can run updates whenever they like. I hope to support such a strategy in the future; meanwhile, I will try to update the bundle more frequently than I have done recently. Sorry for this slight inconvenience, but it requires a fair amount of work to ensure that it works without errors for the users.
kind regards, Sigve
Thanks for your research!
I'm very interested in how PCGR integrates the scattered annotation resources into the final results. I observed that the files in the civic directory of the grch37 data bundle are created from the files at https://civicdb.org/releases, but I don't quite understand how the column named "mapping_rank" in civic.biomarkers.tsv is derived. What does it mean?
In addition, is the column named "alterationtype" modified based on the CIViC source information? For example, I noticed that "missense variant,transcript_ Fusion" was changed to "TRANSLOCATION_FUSION_MUT".
Thanks!