reimandlab / ActiveDriverDB

ActiveDriverDB
GNU Lesser General Public License v2.1

DNA2protein mismatches #85

Closed reimand0 closed 7 years ago

reimand0 commented 7 years ago

I tried to map mutations to PTMs here: https://rl-db.oicr.on.ca/search/mutations

The following query links to BCL2L11, but it should give TP53 instead:

17 7572934 G A

krassowski commented 7 years ago

I confirm the problem. I checked a few things and everything in the code looks right; I think this is not a code issue but human error, and my fault. The MySQL database was rebuilt recently but the mappings database (bsddb) was not; the latter uses identifiers from the former, and those have changed. I will rebuild the mappings now; if I remember correctly, it takes about 2 days. I apologize for that. I will mention this issue in the import procedures to prevent it from happening again.

To support the claim above, here is the line corresponding to the given genomic mutation:

annot_121.txt.gz:chr17    7572934   g   a   TP53:NM_001126115:exon7:c.C779T:p.S260L,TP53:NM_001276761:exon11:c.C1058T:p.S353L,TP53:NM_001126112:exon11:c.C1175T:p.S392L,TP53:NM_001276760:exon11:c.C1058T:p.S353L,TP53:NM_001276697:exon7:c.C698T:p.S233L,TP53:NM_000546:exon11:c.C1175T:p.S392L,TP53:NM_001126118:exon10:c.C1058T:p.S353L,

It has S→L protein mutations, the same as the results for "chr17 7572934 G A".
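For readers unfamiliar with the annotation format, here is a minimal Python sketch that splits the line quoted above into its gene, transcript and protein-change parts. The function name `parse_annotation_line` and the returned dictionary keys are illustrative assumptions, not part of ActiveDriverDB; the sketch also assumes whitespace-separated columns and a grep-style `filename:` prefix, as in the quoted line.

```python
import re

# The annotation line quoted in this comment (entries joined for readability).
LINE = (
    "annot_121.txt.gz:chr17    7572934   g   a   "
    "TP53:NM_001126115:exon7:c.C779T:p.S260L,"
    "TP53:NM_001276761:exon11:c.C1058T:p.S353L,"
    "TP53:NM_001126112:exon11:c.C1175T:p.S392L,"
    "TP53:NM_001276760:exon11:c.C1058T:p.S353L,"
    "TP53:NM_001276697:exon7:c.C698T:p.S233L,"
    "TP53:NM_000546:exon11:c.C1175T:p.S392L,"
    "TP53:NM_001126118:exon10:c.C1058T:p.S353L,"
)

# Matches simple substitutions such as "p.S392L".
PROTEIN_CHANGE = re.compile(r"p\.([A-Z*])(\d+)([A-Z*])")


def parse_annotation_line(line):
    """Hypothetical helper: split one annotation line into its parts."""
    chrom, pos, ref, alt, annotations = line.split()[:5]
    # Drop the "filename:" prefix that grep adds, if present.
    chrom = chrom.rsplit(":", 1)[-1]
    records = []
    for entry in annotations.split(","):
        if not entry:  # the line ends with a trailing comma
            continue
        gene, refseq, exon, cdna, protein = entry.split(":")
        match = PROTEIN_CHANGE.fullmatch(protein)
        if match is None:
            raise ValueError(f"Unexpected protein change: {protein!r}")
        records.append({
            "gene": gene,
            "refseq": refseq,
            "exon": exon,
            "cdna": cdna,
            "aa_ref": match.group(1),
            "aa_pos": int(match.group(2)),
            "aa_alt": match.group(3),
        })
    return chrom, int(pos), ref.upper(), alt.upper(), records


chrom, pos, ref, alt, records = parse_annotation_line(LINE)
print(sorted({r["gene"] for r in records}))
print(sorted({(r["aa_ref"], r["aa_alt"]) for r in records}))
```

Every entry on the line names TP53 and an S→L substitution, which is the point being made here: the source annotation is correct, so the wrong gene must come from the lookup step.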

I am not ruling out bugs in the code, but the problem described above seems to be the cause.

PS. The import is ongoing and shows 11 hours & 44 minutes to go, maybe it won't take 2 days this time.
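One way to catch this kind of mismatch early is to store a fingerprint of the relational database's identifiers when the mappings are built, and compare it before serving queries. The following is only a sketch of that idea in Python; the function names and the example identifiers are hypothetical and do not reflect how ActiveDriverDB actually stores its data.

```python
import hashlib


def fingerprint(id_pairs):
    """Hash (name, identifier) pairs, independent of their order."""
    digest = hashlib.sha256()
    for name, identifier in sorted(id_pairs):
        digest.update(f"{name}\t{identifier}\n".encode())
    return digest.hexdigest()


def mappings_are_current(stored_fingerprint, current_id_pairs):
    """True if the mappings were built against the current identifiers."""
    return stored_fingerprint == fingerprint(current_id_pairs)


# Hypothetical identifiers before and after a rebuild of the main database:
# the same genes, but the numeric identifiers were reassigned.
before = [("TP53", 101), ("BCL2L11", 102)]
after = [("TP53", 102), ("BCL2L11", 101)]

stamp = fingerprint(before)  # would be saved alongside the mappings
print(mappings_are_current(stamp, before))  # True
print(mappings_are_current(stamp, after))   # False
```

With such a check, a rebuild of the main database without a rebuild of the mappings would fail loudly instead of silently returning the wrong gene.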

reimand0 commented 7 years ago

sounds good, thanks! always good to check things like that - perhaps we should set up a suite of tests.

krassowski commented 7 years ago

Just to give an update: after more than 42 hours of import we have 370/402 (92%) files imported and the system estimates that it will take 5 hours and 33 minutes more.

krassowski commented 7 years ago

After the reimport it works as expected (tested with the query mentioned above, "chr17 7572934 G A"). As we discussed, I will write some automated tests to avoid a similar situation in the future.
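A regression test for this case could look roughly like the sketch below. It is not the test that was eventually written for the project: `genes_for_mutation` and `fake_lookup` are hypothetical names, and the in-memory table is only a stand-in for the real mappings query so that the example runs on its own.

```python
import unittest


def genes_for_mutation(lookup, chrom, pos, ref, alt):
    """Return the set of gene names that `lookup` reports for a mutation.

    `lookup` is a hypothetical callable standing in for the real query
    against the mappings database.
    """
    return {hit["gene"] for hit in lookup(chrom, pos, ref, alt)}


def fake_lookup(chrom, pos, ref, alt):
    """In-memory stand-in for the mappings database (assumption)."""
    table = {
        ("chr17", 7572934, "G", "A"): [
            {"gene": "TP53", "refseq": "NM_000546", "change": "S392L"},
            {"gene": "TP53", "refseq": "NM_001126115", "change": "S260L"},
        ],
    }
    return table.get((chrom, pos, ref, alt), [])


class MappingRegressionTest(unittest.TestCase):
    def test_chr17_7572934_maps_to_tp53(self):
        genes = genes_for_mutation(fake_lookup, "chr17", 7572934, "G", "A")
        self.assertEqual(genes, {"TP53"})

    def test_unknown_mutation_maps_to_nothing(self):
        genes = genes_for_mutation(fake_lookup, "chr1", 1, "A", "T")
        self.assertEqual(genes, set())
```

Saved as a module, this can be run with `python -m unittest`. In a real suite, `fake_lookup` would be replaced by the actual query function, so the test fails whenever the mappings and the main database drift apart again.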

Just for future reference: the whole reimport took 48:20:01, that is, two days, 20 minutes and one second.

krassowski commented 7 years ago

The problem has been fixed and tests have been written.