reimandlab / ActiveDriverDB

ActiveDriverDB
GNU Lesser General Public License v2.1

protein sequence view #98

Closed reimand0 closed 6 years ago

reimand0 commented 7 years ago

Some minor additions to make this better:

  1. Add a description of the gene/protein after its name or RefSeq ID. For TP53, this would be "tumor protein p53": https://www.ncbi.nlm.nih.gov/gene/7157

  2. Move "mutations visualisation" box above "Protein summary" box.

  3. The red bar "38.58% of sequence is predicted to be disordered" is nice but takes up a lot of space. It can be shortened.

reimand0 commented 7 years ago

Besides (1), does your input database include a short text description of the protein (perhaps 2-5 sentences, maybe a bit longer)?

krassowski commented 7 years ago
  1. Currently we have no such description imported into the database. I can add both short and long descriptions. Do you have a particular database or file in mind that provides descriptions? I think we can use ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz if we only need the short description, and the gbff files from ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/RefSeqGene/ if we also need the summary text.
  2. Done
  3. I removed spacing below the bar. Is that all right now?
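
For the short descriptions, a minimal sketch of reading `gene_info.gz` might look like the following. It assumes the tab-separated layout NCBI uses for this file (a header line starting with `#`, with `tax_id`, `Symbol` and `description` among the columns, and `-` for missing values). The function names and the abbreviated sample rows are hypothetical and are not taken from the ActiveDriverDB code.

```python
import csv
import gzip
import io

HUMAN_TAX_ID = "9606"


def load_short_descriptions(handle, tax_id=HUMAN_TAX_ID):
    """Map gene symbol -> short description for a single taxon.

    `handle` is any text-mode file object in gene_info format.
    """
    # The first line is the header; it is prefixed with "#".
    header = handle.readline().lstrip("#").rstrip("\n").split("\t")
    reader = csv.DictReader(
        handle, fieldnames=header, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    descriptions = {}
    for row in reader:
        if row["tax_id"] != tax_id:
            continue
        # "-" is assumed to be the placeholder for a missing value.
        if row["description"] and row["description"] != "-":
            descriptions[row["Symbol"]] = row["description"]
    return descriptions


def load_from_gzip(path):
    """Convenience wrapper for a downloaded gene_info.gz file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return load_short_descriptions(handle)


# Abbreviated, partly made-up sample (the real file has more columns).
SAMPLE = (
    "#tax_id\tGeneID\tSymbol\tdescription\n"
    "9606\t7157\tTP53\ttumor protein p53\n"
    "10090\t1\tFakeGene\thypothetical mouse gene\n"
    "9606\t2\tFAKE1\t-\n"
)
sample_descriptions = load_short_descriptions(io.StringIO(SAMPLE))
```

Keying by symbol is only one option; keying by `GeneID` would avoid symbol collisions if the database already stores Entrez IDs.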
reimand0 commented 6 years ago

Thanks Michal. Both long and short descriptions would be useful for those who browse and end up at an interesting gene.

This page may be useful: https://www.biostars.org/p/2144/. Also, something can be retrieved from Ensembl GRCh37 and its BioMart (http://grch37.ensembl.org/biomart/martview/); however, the fewer gene IDs we need to convert, the better. Conversions almost always cause losses or multiple matches.
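
To see how much a given conversion would cost, one could audit the mapping table before importing anything. This is only a sketch: `audit_id_mapping` and the identifiers in the example are hypothetical, and the mapping pairs would come from whatever conversion source is chosen.

```python
from collections import defaultdict


def audit_id_mapping(source_ids, mapping_pairs):
    """Return (lost, ambiguous) for a set of source IDs.

    lost: source IDs with no target at all.
    ambiguous: source IDs that map to more than one target.
    """
    targets = defaultdict(set)
    for source, target in mapping_pairs:
        targets[source].add(target)

    unique_sources = sorted(set(source_ids))
    lost = [s for s in unique_sources if s not in targets]
    ambiguous = {
        s: sorted(targets[s])
        for s in unique_sources
        if len(targets.get(s, ())) > 1
    }
    return lost, ambiguous


# Made-up identifiers, for illustration only.
lost, ambiguous = audit_id_mapping(
    ["ID_A", "ID_B", "ID_C"],
    [("ID_A", "X1"), ("ID_B", "X2"), ("ID_B", "X3")],
)
```

Here `ID_C` would be reported as lost and `ID_B` as ambiguous, which are the two failure modes mentioned above.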


krassowski commented 6 years ago

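Done. We have:

  • full gene names imported for 18 957 genes (95% of 19 786 total in ADDB).
  • full protein names for 39 087 proteins out of 39 159 (99.8%)
  • protein summaries for 27 409 proteins (70%)

For 72 proteins there are no full protein names; for 35 of those, full gene names exist and are displayed instead. The following 35 genes have proteins for which we cannot display any full name: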

LOC650293
ABHD18
DLGAP2
CT45A4
LOC286238
PRAMEF16
CLLU1
KRT86
MFSD2B
C1orf204
USP36
POU6F1
GDNF-AS1
TEX15
C15orf54
SH2D6
LOC643802
WI2-2373I1.2
FBF1
PRR21
MIR205HG
FAM157A
C10orf12
DSCR4
WDR49
LOC643355
RUSC1-AS1
LOC653602
ZNF812
FAM157B
ZNF654
PRSS46
SNX25
ATP6V1E2
LINC00649
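
The display rule described above (full protein name first, full gene name as a fallback, nothing if both are missing) can be sketched as follows. The function `display_name` is a hypothetical illustration, not the code used in ActiveDriverDB.

```python
from typing import Optional


def display_name(
    protein_full_name: Optional[str], gene_full_name: Optional[str]
) -> Optional[str]:
    """Pick the name to show: protein name, else gene name, else None.

    Empty strings are treated the same as missing values.
    """
    return protein_full_name or gene_full_name or None


# The example name comes from the TP53 case mentioned earlier in the thread.
shown = display_name(None, "tumor protein p53")
```

Genes such as those listed above would yield `None` here, so the view needs a way to render a protein without any full name.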
reimand0 commented 6 years ago

Thanks! Partial overlap is somewhat expected.

On Tue, Jul 11, 2017 at 8:59 AM krassowski notifications@github.com wrote:

Done. We have:

  • full gene names imported for 18 957 genes (95% of 19 786 total in ADDB).
  • full protein names for 39 087 proteins out of 39 159 (99.8%)
  • protein summaries for 27 409 proteins (70%)
