cBioPortal / cbioportal

cBioPortal for Cancer Genomics
https://cbioportal.org
GNU Affero General Public License v3.0

inconsistent annotation of mutations #2819

Closed tmazor closed 7 years ago

tmazor commented 7 years ago

The same mutation in the same study is inconsistently annotated. The last entry in the screenshot below has the same mutation as the samples above it, but has a value of "NA" rather than "Medium" in the Mutation Assessor column. The two samples above that don't have OncoKB logos, even though all the other samples with that mutation do.

http://www.cbioportal.org/beta/index.do?session_id=5970e576498e5df2e292cb11&show_samples=false& image

onursumer commented 7 years ago

ma_last_entry_genomic_location

Actually, they are not the same mutation: the genomic location and the alleles are different for the last entry. That's why we don't have a corresponding Mutation Assessor value for this mutation.

tmazor commented 7 years ago

Interesting. So, it's a dinucleotide change that results in the same amino acid change. I presume that when the mutations table pulls from mutation assessor it's done at the nucleotide level and that's why the result is NA?
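
To illustrate how two different nucleotide-level changes can collapse to the same protein-level change, here is a minimal sketch using only the alanine (GCN) and valine (GTN) rows of the standard genetic code. The codons below are hypothetical and chosen purely for illustration; the thread does not state the actual EGFR codons involved, and `protein_change` and `changed_bases` are made-up helper names.

```python
# Slice of the standard genetic code: alanine is GCN, valine is GTN.
CODON_TABLE = {
    "GCA": "A", "GCC": "A", "GCG": "A", "GCT": "A",
    "GTA": "V", "GTC": "V", "GTG": "V", "GTT": "V",
}


def protein_change(ref_codon: str, alt_codon: str, position: int) -> str:
    """Return an amino acid change such as 'A289V' for a codon substitution."""
    return f"{CODON_TABLE[ref_codon]}{position}{CODON_TABLE[alt_codon]}"


def changed_bases(ref_codon: str, alt_codon: str) -> int:
    """Count how many of the three codon positions differ."""
    return sum(r != a for r, a in zip(ref_codon, alt_codon))


# Hypothetical codons (assumption): not taken from the actual EGFR sequence.
REF = "GCC"
SINGLE = "GTC"  # one base differs from REF
DOUBLE = "GTT"  # two bases differ from REF

for alt in (SINGLE, DOUBLE):
    print(alt, changed_bases(REF, alt), protein_change(REF, alt, 289))
```

Both substitutions give the same amino acid change, but a lookup keyed on genomic position and alleles would treat them as unrelated entries.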

On the mutation assessor website, it's possible to run a query based on the amino acid change -- I can input "EGFR_HUMAN A289V" and I get a prediction. No matter what the underlying genetic change is, the functional impact of the amino acid change should be the same. Can the code be updated to query the amino acid change if the nucleotide level change is NA?
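
A rough sketch of the fallback being proposed, assuming two hypothetical in-memory lookup tables (one keyed by the nucleotide-level change, one keyed by the amino acid change). The table names, key layouts, genomic coordinates, and the stored "Medium" value for the protein-level entry are all placeholders for illustration; this is not the portal's actual code or data.

```python
from typing import Dict, Optional, Tuple

GenomicKey = Tuple[str, int, int, str, str]  # (chromosome, start, end, ref allele, alt allele)
ProteinKey = Tuple[str, str]                 # (protein name, amino acid change)

# Hypothetical lookup tables; coordinates and scores are made up.
BY_GENOMIC_CHANGE: Dict[GenomicKey, str] = {
    ("7", 1000, 1000, "C", "T"): "Medium",
}
BY_PROTEIN_CHANGE: Dict[ProteinKey, str] = {
    ("EGFR_HUMAN", "A289V"): "Medium",
}


def functional_impact(genomic_key: GenomicKey, protein_key: ProteinKey) -> str:
    """Look up by nucleotide change first, then fall back to the amino acid change."""
    impact: Optional[str] = BY_GENOMIC_CHANGE.get(genomic_key)
    if impact is None:
        impact = BY_PROTEIN_CHANGE.get(protein_key)
    return impact if impact is not None else "NA"


# A dinucleotide change missing from the nucleotide-level table still resolves
# through the protein-level table.
print(functional_impact(("7", 1000, 1001, "CC", "TT"), ("EGFR_HUMAN", "A289V")))
```

Trying the nucleotide-level key first keeps existing results unchanged and only uses the amino acid change where the current lookup would return "NA".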

onursumer commented 7 years ago

@tmazor That sounds like a good idea! And it is certainly possible to improve the code. But, currently we are not actually querying the mutation assessor web service directly. Instead, we are using a pre-cached version of the mutation assessor data which is built by the nucleotide level query.

We have plans to update the mutation assessor column to use the latest web service, but that is not scheduled for the next release.

tmazor commented 7 years ago

Ok, sounds like this Mutation Assessor issue is a bigger change and should maybe wait until the larger update to using the web service.

Any idea what's happening with those samples that don't have the OncoKB logo @onursumer ?

jjgao commented 7 years ago

@tmazor @onursumer Agreed on not worrying about Mutation Assessor for now. We have another project to replace it with the MA web API.

It's more important to find out the reason why OncoKB is missing for some of the rows.

tmazor commented 7 years ago

@onursumer @jjgao If I click on the sample ID for the two cases without OncoKB annotation, there is clearly something wrong with the patient/sample data:

http://www.cbioportal.org/beta/case.do#/patient?sampleId=TCGA-81-5911-01&studyId=gbm_tcga_pub2013 image

http://www.cbioportal.org/beta/case.do#/patient?sampleId=TCGA-06-0142-01&studyId=gbm_tcga_pub2013 image

Interestingly, the EGFR mutation does show up with an OncoKB logo on the patient page. But there is no patient level info and there isn't even a sample name listed.

inodb commented 7 years ago

@tmazor Yeah, this is definitely an error. Looks like there is a problem on the patient view when there is no clinical data for a sample. I'm guessing OncoKB on the mutation mapper has the same problem. This is partially a data issue, because I assume all TCGA studies should have clinical data. But we should def also handle the case when clinical data is not available.
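
One way to handle absent clinical data can be sketched as below. This is only an illustration of the defensive pattern, not the portal's implementation; the function name `sample_summary`, the returned field names, and the use of a `CANCER_TYPE` attribute key are assumptions.

```python
from typing import Dict, Mapping, Optional, Union


def sample_summary(
    sample_id: str, clinical: Optional[Mapping[str, str]]
) -> Dict[str, Union[str, bool]]:
    """Build display fields for a sample, tolerating absent clinical data."""
    clinical = clinical or {}  # treat None and an empty mapping the same way
    return {
        # The sample ID does not depend on clinical data, so it can always be shown.
        "sample_id": sample_id,
        # Fall back to a placeholder instead of failing when the attribute is missing.
        "cancer_type": clinical.get("CANCER_TYPE", "Unknown"),
        "has_clinical_data": bool(clinical),
    }


print(sample_summary("TCGA-81-5911-01", None))
```

With a default like this, the view can still render the sample name and mutation rows, and flag the missing clinical data explicitly.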

inodb commented 7 years ago

I made an issue specifically for missing clinical data on the patient view: https://github.com/cBioPortal/cbioportal/issues/2831. This is in the current deployment, so it should be a hotfix.

tmazor commented 7 years ago

I guess this has turned out to be several different issues. I just made another issue for the missing data since I found another TCGA study where none of the samples have a cancer type.

inodb commented 7 years ago

I made another issue for the OncoKB annotation not showing up when the clinical data is missing for a sample: https://github.com/cBioPortal/cbioportal/issues/2863. We'll have to discuss what the desired behavior is there. Closing this issue for now, since we split this one up into multiple smaller ones.