reimandlab / ActiveDriverDB

ActiveDriverDB
GNU Lesser General Public License v2.1

incorrect cancer types in API #91

Closed reimand0 closed 7 years ago

reimand0 commented 7 years ago

It seems that the tissue types returned in the JSON are incorrect. Marta analysed thousands of breast cancer mutations and they were mapped to various other cancer types (ovarian, brain, lung). I tried a few examples and arrived at the same conclusion.

https://rl-db.oicr.on.ca/chromosome/mutation/chr10/116247760/T/C

This mutation is from the TCGA MAF file of breast adenocarcinoma.

ABLIM1 0 - 37 10 116247760 116247760 + Missense_Mutation SNP T T C TCGA-A1-A0SB-01A-11D-A142-09 TCGA-A1-A0SB-10B-01D-A142-09

The barcode has A1 in the 2nd position (the tissue source site code), meaning it is a breast cancer sample according to the table from TCGA: https://wiki.nci.nih.gov/pages/viewpage.action?pageId=29557833
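To make the barcode check above concrete, here is a minimal Python sketch. The names `parse_barcode`, `cancer_type` and `TSS_TO_CANCER` are hypothetical and not part of ActiveDriverDB, and the lookup table holds only the single A1 entry discussed here; the full table would have to come from the TCGA page linked above.

```python
from typing import NamedTuple, Optional

# Illustrative, deliberately partial lookup (assumption: only the A1 entry
# from this thread is included; the real table lives on the TCGA wiki).
TSS_TO_CANCER = {"A1": "Breast invasive carcinoma"}


class Barcode(NamedTuple):
    project: str      # always "TCGA"
    tss: str          # tissue source site code, e.g. "A1"
    participant: str  # e.g. "A0SB"
    sample_type: str  # two-digit code, e.g. "01" (tumour) or "10" (normal)


def parse_barcode(barcode: str) -> Barcode:
    """Split a TCGA barcode into its leading fields."""
    parts = barcode.split("-")
    if len(parts) < 4 or parts[0] != "TCGA":
        raise ValueError(f"not a TCGA sample barcode: {barcode!r}")
    # The fourth field is sample type (2 digits) followed by a vial letter.
    return Barcode(parts[0], parts[1], parts[2], parts[3][:2])


def cancer_type(barcode: str) -> Optional[str]:
    """Look up the cancer type from the TSS code; None if the code is unknown."""
    return TSS_TO_CANCER.get(parse_barcode(barcode).tss)


tumour = parse_barcode("TCGA-A1-A0SB-01A-11D-A142-09")
print(tumour.tss, tumour.sample_type, cancer_type("TCGA-A1-A0SB-01A-11D-A142-09"))
# prints: A1 01 Breast invasive carcinoma
```

Both barcodes in the MAF row above share the A1 site code, so the tumour and matched normal samples resolve to the same cancer type.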

Does this affect only JSON or other parts of the database?

krassowski commented 7 years ago

That would be a very serious bug. OK, so we have this row in our TCGA_muts_annotated.txt file:

10 116247760 116247760 T C exonic ABLIM1 . nonsynonymous SNV ABLIM1:NM_006720:exon3:c.A50G:p.H17R,ABLIM1:NM_001003407:exon8:c.A818G:p.H273R,ABLIM1:NM_001003408:exon8:c.A818G:p.H273R,ABLIM1:NM_002313:exon8:c.A998G:p.H333R comments: Ovarian serous cystadenocarcinoma;TCGA-A1-A0SB-01A-11D-A142-09;ABLIM1

If I understand correctly, I wrongly assumed that the annotation comment "Ovarian serous cystadenocarcinoma;TCGA-A1-A0SB-01A-11D-A142-09;ABLIM1" means the mutation belongs to the cancer type "Ovarian serous cystadenocarcinoma", but I should instead derive the cancer type from the code in the sample ID, yes?

I'm terribly sorry for that, but I think it was quite tempting to assume the name from the comment field indicates cancer type - life would be so easy ;)
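A cross-check along these lines could flag such rows at import time. This is only a sketch under assumptions: `check_comment` and `tss_to_cancer` are hypothetical names, not ActiveDriverDB code, the comment field is assumed to always have the three `;`-separated parts shown above, and the lookup table holds just the A1 entry from this thread.

```python
from typing import NamedTuple, Optional


class CommentCheck(NamedTuple):
    annotated: str            # cancer name written in the comment field
    expected: Optional[str]   # cancer type implied by the barcode, if known
    consistent: bool          # False only when both are known and differ


def check_comment(comment: str, tss_to_cancer: dict) -> CommentCheck:
    """Compare the cancer name in a comment with the one implied by the barcode."""
    # Assumed layout: "<cancer name>;<TCGA barcode>;<gene symbol>"
    annotated, barcode, _gene = comment.split(";")
    tss = barcode.split("-")[1]  # second barcode field: tissue source site
    expected = tss_to_cancer.get(tss)
    return CommentCheck(annotated, expected, expected is None or expected == annotated)


# Partial table for illustration; only A1 is taken from this thread.
tss_to_cancer = {"A1": "Breast invasive carcinoma"}

result = check_comment(
    "Ovarian serous cystadenocarcinoma;TCGA-A1-A0SB-01A-11D-A142-09;ABLIM1",
    tss_to_cancer,
)
if not result.consistent:
    print(f"mismatch: comment says {result.annotated!r}, barcode implies {result.expected!r}")
```

Rows whose site code is missing from the table are treated as consistent here, so an incomplete table does not produce false alarms; a stricter import could reject them instead.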

reimand0 commented 7 years ago

Sorry, that was my mistake. I have uploaded a new TCGA_muts_annotated.txt.gz file to Dropbox.

krassowski commented 7 years ago

I have reimported the TCGA mutation data, and it seems to be working now.