reimandlab / ActiveDriverDB

GNU Lesser General Public License v2.1

undefined chr mutation bug #131

Closed reimand0 closed 6 years ago

reimand0 commented 6 years ago

Searching for "chr17 7673776 C T" leads to:

https://activedriverdb.org/protein/show/undefined

reimand0 commented 6 years ago

I am trying to create an example of this chr>protein mapping and started from TP53 R282W. According to ClinVar, it should be "chr17 7577094 C T": https://www.ncbi.nlm.nih.gov/clinvar/variation/12364/

We need it for two things: first, as an example on the front page, and second, for the manuscript.

krassowski commented 6 years ago

This is a strand issue. This query works: chr17 7577094 G A
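To illustrate the strand issue, here is a minimal sketch of how a single-nucleotide query maps to the opposite strand. The `complement_query` helper is hypothetical and is not taken from the ActiveDriverDB code base.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_query(chrom, pos, ref, alt):
    """Return the same single-nucleotide query expressed on the opposite strand.

    Hypothetical helper: the position is unchanged for a single nucleotide,
    only the reference and alternative bases are complemented.
    """
    return chrom, pos, COMPLEMENT[ref.upper()], COMPLEMENT[alt.upper()]


print(complement_query("chr17", 7577094, "C", "T"))
# ('chr17', 7577094, 'G', 'A')
```

The failing query "C T" and the working query "G A" are complements of each other at the same position.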

krassowski commented 6 years ago

The corresponding rows in the annotations file are:

chr17     7577094       g       a       TP53:NM_001126115:exon4:c.C448T:p.R150W,TP53:NM_001276761:exon8:c.C727T:p.R243W,TP53:NM_001126113:exon8:c.C844T:p.R282W,TP53:NM_001276695:exon8:c.C727T:p.R243W,TP53:NM_001126112:exon8:c.C844T:p.R282W,TP53:NM_001276760:exon8:c.C727T:p.R243W,TP53:NM_001126116:exon4:c.C448T:p.R150W,TP53:NM_001276696:exon8:c.C727T:p.R243W,TP53:NM_001276698:exon4:c.C367T:p.R123W,TP53:NM_001126114:exon8:c.C844T:p.R282W,TP53:NM_001276697:exon4:c.C367T:p.R123W,TP53:NM_001276699:exon4:c.C367T:p.R123W,TP53:NM_000546:exon8:c.C844T:p.R282W,TP53:NM_001126118:exon7:c.C727T:p.R243W,TP53:NM_001126117:exon4:c.C448T:p.R150W,
chr17     7577094       g       a       TP53:NM_001126115:exon4:c.C448T:p.R150W,TP53:NM_001276761:exon8:c.C727T:p.R243W,TP53:NM_001126113:exon8:c.C844T:p.R282W,TP53:NM_001276695:exon8:c.C727T:p.R243W,TP53:NM_001126112:exon8:c.C844T:p.R282W,TP53:NM_001276760:exon8:c.C727T:p.R243W,TP53:NM_001126116:exon4:c.C448T:p.R150W,TP53:NM_001276696:exon8:c.C727T:p.R243W,TP53:NM_001276698:exon4:c.C367T:p.R123W,TP53:NM_001126114:exon8:c.C844T:p.R282W,TP53:NM_001276697:exon4:c.C367T:p.R123W,TP53:NM_001276699:exon4:c.C367T:p.R123W,TP53:NM_000546:exon8:c.C844T:p.R282W,TP53:NM_001126118:exon7:c.C727T:p.R243W,TP53:NM_001126117:exon4:c.C448T:p.R150W,
chr17     7577094       g       a       TP53:NM_001276697:exon4:c.C367T:p.R123W,TP53:NM_001276698:exon4:c.C367T:p.R123W,TP53:NM_000546:exon8:c.C844T:p.R282W,TP53:NM_001126118:exon7:c.C727T:p.R243W,TP53:NM_001126116:exon4:c.C448T:p.R150W,TP53:NM_001126115:exon4:c.C448T:p.R150W,TP53:NM_001126114:exon8:c.C844T:p.R282W,TP53:NM_001126117:exon4:c.C448T:p.R150W,TP53:NM_001276695:exon8:c.C727T:p.R243W,TP53:NM_001276696:exon8:c.C727T:p.R243W,TP53:NM_001276699:exon4:c.C367T:p.R123W,TP53:NM_001276761:exon8:c.C727T:p.R243W,TP53:NM_001126113:exon8:c.C844T:p.R282W,TP53:NM_001276760:exon8:c.C727T:p.R243W,TP53:NM_001126112:exon8:c.C844T:p.R282W,
chr17     7577094       g       a       TP53:NM_001276761:exon8:c.C727T:p.R243W,TP53:NM_000546:exon8:c.C844T:p.R282W,TP53:NM_001126112:exon8:c.C844T:p.R282W,TP53:NM_001276695:exon8:c.C727T:p.R243W,TP53:NM_001276697:exon4:c.C367T:p.R123W,TP53:NM_001276696:exon8:c.C727T:p.R243W,TP53:NM_001126116:exon4:c.C448T:p.R150W,TP53:NM_001126113:exon8:c.C844T:p.R282W,TP53:NM_001126117:exon4:c.C448T:p.R150W,TP53:NM_001276760:exon8:c.C727T:p.R243W,TP53:NM_001126115:exon4:c.C448T:p.R150W,TP53:NM_001126118:exon7:c.C727T:p.R243W,TP53:NM_001276698:exon4:c.C367T:p.R123W,TP53:NM_001126114:exon8:c.C844T:p.R282W,TP53:NM_001276699:exon4:c.C367T:p.R123W,

Unfortunately, there is no strand information here. We have strand data for genes, so I can re-import the mappings using complementary nucleotides for the -1 strand. This might be easier than recreating the annotations. Should I proceed?

Edit: my bad, of course there is strand information in the cDNA records.
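A rough sketch of how the strand could be read off the cDNA records in such a row: if the cDNA change (for example `c.C844T`) matches the genomic nucleotides, the transcript is on the +1 strand; if it matches their complement, it is on the -1 strand. The column layout and the `infer_strand` function are assumptions made for illustration, not the actual import code.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Matches the cDNA change in entries such as
# "TP53:NM_000546:exon8:c.C844T:p.R282W".
CDNA_CHANGE = re.compile(r"c\.([ACGT])(\d+)([ACGT])")


def infer_strand(row):
    """Infer the strand (+1 or -1) from one annotation row.

    Assumed columns (whitespace-separated): chromosome, position,
    genomic ref, genomic alt, comma-separated transcript annotations.
    Returns None when the strand cannot be determined.
    """
    chrom, pos, ref, alt, annotations = row.split(None, 4)
    ref, alt = ref.upper(), alt.upper()
    match = CDNA_CHANGE.search(annotations)
    if match is None:
        return None
    cdna_ref, _, cdna_alt = match.groups()
    if (cdna_ref, cdna_alt) == (ref, alt):
        return 1
    if (cdna_ref, cdna_alt) == (COMPLEMENT.get(ref), COMPLEMENT.get(alt)):
        return -1
    return None


row = "chr17     7577094       g       a       TP53:NM_000546:exon8:c.C844T:p.R282W,"
print(infer_strand(row))  # -1
```

Only the first cDNA record in the row is inspected here; a real import would presumably check every transcript.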

krassowski commented 6 years ago

I started a re-import of the mappings using nucleotides from cDNA. It will take about 16 hours to complete.

reimand0 commented 6 years ago

One option is to just offer 'did you mean XX' if the provided query is clearly on the opposite strand. What do you think?

Jüri

On Aug 15, 2017, at 12:06, krassowski notifications@github.com wrote:

I started re-import of mappings using nucleotides from cDNA. It will take about 16 hours to complete.


krassowski commented 6 years ago

Yes, I thought about this too.

krassowski commented 6 years ago

Import completed. The query "chr17 7577094 C T" works now.

krassowski commented 6 years ago

Now both "chr17 7577094 C T" and "chr17 7577094 G A" work, with the latter showing that the result came from "complement of chr17 7577094 G A".

Edit: should I add a small "GRCh37" hint near the search box?
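The fallback behaviour described above could be sketched roughly as follows. The `search` function and the dict-based `mappings` store are hypothetical stand-ins for illustration; the real lookup lives in the database layer and may work differently.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def search(mappings, chrom, pos, ref, alt):
    """Look up a genomic query; if absent, retry with complemented nucleotides.

    `mappings` is a hypothetical dict keyed by (chrom, pos, ref, alt).
    Returns (hits, note), where note is None for a direct hit and
    describes the original query when the complement was used.
    """
    key = (chrom, pos, ref.upper(), alt.upper())
    if key in mappings:
        return mappings[key], None
    complemented = (chrom, pos, COMPLEMENT.get(key[2]), COMPLEMENT.get(key[3]))
    if complemented in mappings:
        note = "complement of {} {} {} {}".format(*key)
        return mappings[complemented], note
    return [], None


# After the re-import, mappings are stored with the cDNA nucleotides.
mappings = {("chr17", 7577094, "C", "T"): ["TP53 R282W"]}

print(search(mappings, "chr17", 7577094, "C", "T"))
print(search(mappings, "chr17", 7577094, "G", "A"))
```

Returning the note alongside the hits lets the results page tell the user which strand the match came from, in the spirit of the 'did you mean' suggestion.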

reimand0 commented 6 years ago

Thanks! Works for me as well. Two small requests remain:

  1. When no mutations are present, we receive a 404 error on this page - https://activedriverdb.org/protein/show/undefined
  2. We should add a chromosome-level example onto the main page under the search bar

On Wed, Aug 16, 2017 at 9:27 AM krassowski notifications@github.com wrote:

Now both "chr17 7577094 C T" and "chr17 7577094 G A" works with the latter showing that the result came from "complement of chr17 7577094 G A"


krassowski commented 6 years ago

  1. Fixed with 63125f68b84b04cbed576777be202c7de807c256.
  2. Done.

I expanded the placeholder text in the front page search bar to indicate that we accept genomic mutations in hg19/GRCh37 coordinates. I also replaced commas with "or" to visually separate the examples, and added a "disease mutation in gene" example to indicate that the input accepts more than genes and mutations. I do not plan to add more examples to the placeholder, as it is already quite long.

reimand0 commented 6 years ago

Looks like the text sometimes does not fit the bar:

[screenshot, 2017-08-21 at 9:41:06 pm]

We can compact it by just saying (hg19).