pombase / website

PomBase website v2
MIT License

Display protein structures and change top page matter #98

Closed ValWood closed 1 year ago

ValWood commented 7 years ago

http://www.rcsb.org/pdb/software/wsreport.do

You can search and then create a custom report: https://www.rcsb.org/docs/programmatic-access/web-services-overview, e.g. https://www.rcsb.org/3d-view/jsmol/2GFU/1

Antonialock commented 6 years ago

UniProt displays structures, e.g. https://www.uniprot.org/uniprot/Q6ZW76

Antonialock commented 6 years ago

btw I don't find http://biojs.io/d/bio-pv useful. I would want to be able to zoom in to residue level so that I can see molecular detail; the linked ribbon is just a pretty picture.

ValWood commented 6 years ago

I was thinking something like this.

[Image: mockup browsers on gene page]

There would be very minimal functionality (if any) on the gene page... but if you click on the icon (or some obvious action icon) you would go to a full launch page where you could do everything you wanted to do.

The genome browser is easy: you would just go to the browser.

Antonialock commented 6 years ago

I like it!


ValWood commented 6 years ago

We could begin with Genome browser and Structure viewer, then add pathways later. It could also be a 4-way with the protein viewer eventually.... but then we might need to jiggle with the basic information....

kimrutherford commented 6 years ago

See also: #988

kimrutherford commented 6 years ago

UniProt use LiteMol: https://webchemdev.ncbr.muni.cz/LiteMol/

I've had a look and I think we could implement it for PomBase. We'd need to add the PDB entry IDs to Chado. I'm sure we can bulk download them somehow using the UniProt IDs.

This page might be useful: https://www.ebi.ac.uk/pdbe/pdb-component-library/doc.html

Example page: https://www.uniprot.org/uniprot/Q6ZW76

ValWood commented 6 years ago

> I'm sure we can bulk download them somehow using the UniProt IDs.

Yes, that should be possible. Some will (probably) have more than one structure, and often the different "chains" are broken up into separate entries. Some might only have the structure as part of a larger complex. We will probably only want to select one entry, so we may need some heuristic to select the best one to display once we know what is available.
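A rough sketch of the bookkeeping this implies (one protein, possibly several PDB entries): the Python below groups PDB IDs by UniProt accession. The input format (tab-separated, accession first, then a semicolon-separated list of PDB IDs) and the function name `parse_pdb_mapping` are assumptions for illustration only; they are not taken from UniProt or Chado.

```python
def parse_pdb_mapping(tsv_text):
    """Return {uniprot_accession: [pdb_id, ...]} from tab-separated text.

    Assumed (hypothetical) input: one line per protein, with the accession
    in column 1 and a semicolon-separated list of PDB IDs in column 2.
    Proteins without any PDB entry are kept with an empty list.
    """
    mapping = {}
    for line in tsv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        accession = fields[0]
        pdb_field = fields[1] if len(fields) > 1 else ""
        pdb_ids = mapping.setdefault(accession, [])
        for pdb_id in pdb_field.split(";"):
            pdb_id = pdb_id.strip().upper()
            # ignore empty pieces (trailing ";") and duplicates
            if pdb_id and pdb_id not in pdb_ids:
                pdb_ids.append(pdb_id)
    return mapping
```

Any accession that ends up with more than one ID is a case where we would have to choose which entry to display.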

kimrutherford commented 6 years ago

> so we may need some heuristic to select the best one

That sounds tricky. :-)

On the UniProt pages they have a selector on the right that lets the user select which structure to show. Should we do that too?

[Image: uniprot-structure-1]

kimrutherford commented 6 years ago

Well, it would be tricky to do space-wise: if we plan to have the structure viewer in the top panel, the selector would take up quite a bit of space.

It might work if we have tabs like this: https://github.com/pombase/website/issues/988#issuecomment-430496265

Antonialock commented 6 years ago

that looks amazing!

ValWood commented 6 years ago

Yes I really like it too. I thought I commented before...

ValWood commented 5 years ago

I'm putting this as "next grant" because it's totally non-urgent. However, it would be really nice, so if you fancy a diversion from Canto at any point....

ValWood commented 4 years ago

> so we may need some heuristic to select the best one

> That sounds tricky. :-)

There are probably only two things we are interested in: coverage and resolution. Perhaps we can just select the structure with the best resolution that covers all "chains". It's not a huge deal if we do not get the best one every single time.
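One possible reading of that heuristic, as a minimal Python sketch: among the structures that cover (nearly) the whole protein, take the one with the best (lowest) resolution; if none covers enough, fall back to the one with the most coverage. The `Structure` fields, the `pick_best_structure` name and the 0.9 coverage threshold are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Structure:
    pdb_id: str
    resolution: Optional[float]  # in angstroms, lower is better; None if not reported
    covered_residues: int        # number of residues of the protein in the structure


def pick_best_structure(structures: List[Structure], protein_length: int,
                        min_coverage: float = 0.9) -> Optional[Structure]:
    """Pick one structure to display; protein_length is assumed to be > 0."""
    if not structures:
        return None

    def coverage(s: Structure) -> float:
        return min(s.covered_residues / protein_length, 1.0)

    def resolution(s: Structure) -> float:
        # structures without a reported resolution sort last
        return s.resolution if s.resolution is not None else float("inf")

    well_covered = [s for s in structures if coverage(s) >= min_coverage]
    if well_covered:
        return min(well_covered, key=resolution)
    # nothing covers enough of the protein: fall back to the best coverage
    return max(structures, key=coverage)
```

This will not pick the ideal entry every time, which matches the "not a huge deal" caveat above.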

I'm not sure that we have so many structures. I found browsing the pdb site that there is a species filter, but it has a cut-off at ~500 structures. Fission yeast did not make it into that list.

ValWood commented 3 years ago

A rough (high-level) list of tasks:

  1. ~Download all pombe PDB structures~ We will just show AlphaFold by default.
  2. Download all pombe AlphaFold structures: https://alphafold.ebi.ac.uk/download
  3. Decide between JSMol and LiteMol (these are probably even versions of the same thing?)
  4. ~Decide how to display. Probably best to include all EXP structures, so maybe just default to the highest resolution if there is more than one, and have tabs for the others, including AlphaFold.~

kimrutherford commented 3 years ago

> Download all pombe pdb structures
> Download all pombe Alpha Fold structures

I'm hoping we don't need to download the structures once we have the PDB IDs to associate with the pombe genes. The LiteMol docs imply that it can use these APIs to get the structures on demand from PDB:

ValWood commented 3 years ago

Even better!

Antonialock commented 3 years ago

Do you know (or could you ask) which tool UniProt use for displaying Alpha Fold and other pdb structures?

Aurelien says: It's using Mol* https://molstar.org/ embedded within our own Nightingale components https://ebi-webcomponents.github.io/nightingale/#/structure

Antonialock commented 3 years ago

Also, this might be useful: UniProt curate active sites, modified amino acids and metal binding sites, and propagate them between orthologs. I don't think that is integrated.

ValWood commented 3 years ago

Thanks! We have been discussing the UniProt active site curation recently. InterMine will pull these into PombeMine, so they could become available....

ValWood commented 3 years ago

If you do need structures, I asked RCSB last night if you could get them by species. Their response:

You can browse by species at https://www.rcsb.org/search/browse/taxonomy After selecting the blue link in the branch of the tree that you are interested in, you can select the "Download Files" link at the top right of the results page. If there are many entries you will probably want to use the batch script at https://www.rcsb.org/docs/programmatic-access/batch-downloads-with-shell-script

ValWood commented 2 years ago

There must be a web service for embedding AlphaFold structures?

if so we could add these, and then implement your snazzy https://github.com/pombase/website/issues/988

ValWood commented 2 years ago

Maybe you can't do this directly. This paper mentions a number of the available viewers. I don't know if this is useful.

I like the viewer that AlphaFold pages use, but I can't tell what it is.

kimrutherford commented 1 year ago

> I like the viewer that AlphaFold pages use, but I can't tell what it is.

I had a look. It's Mol*: https://molstar.org/

ValWood commented 1 year ago

I like that one. What are the options? Presumably this is a simpler task now that all we plan to do is display the AlphaFold structure and link to PDB in the main gene page structure viewer. Later, we can use the same or a different viewer linked to the protein structure viewer as necessary...

kimrutherford commented 1 year ago

> I had a look. It's Mol*: https://molstar.org/

I've re-read the thread and seen that Antonia had already said that. Sorry Antonia!

> I like that one. what are the options?

What sort of options do you mean?

kimrutherford commented 1 year ago

Here's an example of a minimal structure viewer using PDB's wrapper for Molstar: https://plnkr.co/edit/v0guJYMkyAIb43GT?preview

kimrutherford commented 1 year ago

An example with an AlphaFold structure: https://plnkr.co/edit/Z2BGbGMwr9qa1nQu?preview

ValWood commented 1 year ago

Sorry, I meant: what are the choices?

  1. Molstar
  2. UniProt use LiteMol: https://webchemdev.ncbr.muni.cz/LiteMol/ (presumably this is Molstar)

It sounds like Molstar is the one to go for...

> An example with the AlphaFold structure: https://plnkr.co/edit/Z2BGbGMwr9qa1nQu?preview

Looks good. Exciting!

ValWood commented 1 year ago

Thinking (way) ahead: we have an open item to create complex pages. People are doing nice complex structure predictions using AlphaFold, so we could represent these on the 'complex' pages. This would be super useful (obviously you also get complexes for structures, but it's very spotty)...

ValWood commented 1 year ago

@manulera in case you are not following..

kimrutherford commented 1 year ago

> presumably this is Molstar

Their website says:


> Mol* development was jointly initiated by PDBe and RCSB PDB to combine and build on the strengths of LiteMol (developed by PDBe) and NGL (developed by RCSB PDB) viewers.


I found a few other protein structure viewers:

But I think it makes sense to use the same viewer as AlphaFold and PDB.

ValWood commented 1 year ago

Yes no point in looking at any others, everyone is familiar with this one and it will always be well supported.

manulera commented 1 year ago

Yes! That one looks very nice.

kimrutherford commented 1 year ago

Prototype! https://desktop.kmr.nz/gene/SPAPB1A10.15

I've removed JBrowse temporarily in this version while I work on the structure view.

The "View full Q9HDX5 Alphafold page" text is just a placeholder.


ValWood commented 1 year ago

This is going to be so awesome....... you will be the most popular person at the meeting in Japan ;)

ValWood commented 1 year ago

We will need a way to deal with entries with no AlphaFold structure. There will be entries with no prediction, but also entries with no UniProt accession number, e.g. https://desktop.kmr.nz/gene/SPAC110.06 (dubious proteins are excluded from UniProt).

kimrutherford commented 1 year ago

> There will be entries with no prediction

I wonder if we can download a list of IDs with predictions from AlphaFold? I'll check that soon.

> but also entries with no uniprot accession number

I'll fix that by not trying to show the structure if there's no UniProt ID for the gene.
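Roughly, the check could look like the Python sketch below: no UniProt ID means no structure view, and if we do manage to get a list of IDs with predictions, an ID missing from that list is skipped too. The function name `alphafold_page_url` and the `ids_with_predictions` parameter are hypothetical; the URL follows the pattern of the public AlphaFold entry pages.

```python
from typing import Optional, Set

# Pattern of the public AlphaFold DB entry pages
ALPHAFOLD_ENTRY_URL = "https://alphafold.ebi.ac.uk/entry/{accession}"


def alphafold_page_url(uniprot_id: Optional[str],
                       ids_with_predictions: Optional[Set[str]] = None) -> Optional[str]:
    """Return the AlphaFold page URL for a gene product, or None if there is nothing to show."""
    if not uniprot_id or not uniprot_id.strip():
        return None  # e.g. dubious proteins that have no UniProt accession
    accession = uniprot_id.strip()
    if ids_with_predictions is not None and accession not in ids_with_predictions:
        return None  # we know which IDs have predictions and this is not one of them
    return ALPHAFOLD_ENTRY_URL.format(accession=accession)
```

When the function returns `None`, the gene page would simply not try to show the structure.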

kimrutherford commented 1 year ago

On Monday I'll put a prototype in dev.pombase.kmr.nz.

I've done that: http://dev.pombase.kmr.nz/gene/SPCC18B5.03

The main site is unchanged for now.

On my desktop I'm working on adding buttons. We should chat on Zoom about how that should work.

https://desktop.kmr.nz/gene/SPCC18B5.03


kimrutherford commented 1 year ago

> On my desktop I'm working on adding buttons.

Here's my current work in progress: https://desktop.kmr.nz/gene/SPCC18B5.03 (That link will work until noon or so)


ValWood commented 1 year ago

Looks fab.

I think we should aim to make the windows a similar size.

Some thoughts: we should reduce the width and increase the depth of the genome browser window so that it has the same aspect ratio as the structure. It is set at the current width to cope with dynein at this resolution.

Increasing the depth will improve the scroll bar bug, which means that forward strand features are often hidden by default.

I suggest also

Make the window width 14k, which would show all of dynein but only a tiny proportion of the flanking region (at the same resolution), then zoom the view resolution until it fits the width. Some views will be more crowded, but this is really only for orientation, and it will encourage people to go to the browser (the only other option is to change the scale in a gene-dependent way, which would be confusing). If it looks very squashed we can make AlphaFold a bit wider.

As you suggested, move the 'flanking genes' to only display on the browser view (I like the AlphaFold external links). With the above changes, there will be fewer flanking genes.
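To make the fixed-width idea concrete, here is a small Python sketch that computes the region to show for a gene. It assumes "14k" means 14,000 bp, uses 1-based inclusive coordinates, and assumes the gene fits on the chromosome; the function name `browser_window` is made up for illustration and is not a JBrowse call.

```python
def browser_window(gene_start, gene_end, chromosome_length, width=14_000):
    """Return (start, end) of a fixed-width view centred on the gene.

    Coordinates are 1-based and inclusive. The view is widened if the gene
    is longer than `width`, and shifted if it would run off either end of
    the chromosome.
    """
    gene_length = gene_end - gene_start + 1
    width = max(width, gene_length)        # never crop the gene itself
    width = min(width, chromosome_length)  # cannot show more than the chromosome
    padding = width - gene_length          # total flanking sequence to show
    start = gene_start - padding // 2
    # keep the whole window inside the chromosome
    start = max(1, min(start, chromosome_length - width + 1))
    end = start + width - 1
    return start, end
```

With a fixed width like this, short genes get plenty of flanking sequence and long genes get very little, which is the trade-off described above.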

kimrutherford commented 1 year ago

> As you suggested move the 'flanking genes' to only display on the browser view

I meant that we could have it at the top in all cases. I think it's more useful when JBrowse isn't visible.

manulera commented 1 year ago

> Increasing the depth will improve the scroll bar bug which means that often forward strand features are hidden by default.

I agree with this. Even though I know about it, it is sometimes confusing that I don't immediately see the gene from the gene page in JBrowse.

What do you mean by hiding the flanking genes?

kimrutherford commented 1 year ago

> We should reduce the width, and increase the depth of the genome browser window so that it is the same aspect as the structure.

Does it need to be the same width? We can only see one widget at a time.

ValWood commented 1 year ago

> I meant that we could have it at the top in all cases. I think it's more useful when JBrowse isn't visible.

Good point.

Maybe we could just have a link to the 5' flanking protein and the 3' flanking protein on either side (it could even be in the summary...?)

ValWood commented 1 year ago

> Does it need to be the same width? We can only see one widget at a time.

They don't absolutely need to be the same width, but it would look slicker.

Also I think it could be a slight improvement for the genome browser view to see more depth and fewer flanking genes.

ValWood commented 1 year ago

"Hiding flanking genes" refers to this list:

[Image: Screenshot 2023-01-23 at 10 23 25]

kimrutherford commented 1 year ago

I've made JBrowse the same height as the structure view on my desktop: https://desktop.kmr.nz/gene/SPAC1093.06c

> They don't absolutely need to be the same width, but it would look slicker.

In that case could we make the structure viewer as wide as JBrowse?

> Also I think it could be a slight improvement for the genome browser view to see more depth and fewer flanking genes.

We can do that without making it narrower though.

ValWood commented 1 year ago

Let's chat about the aspect ratios tomorrow. It seems there are quite a few variables that could be adjusted.

Interestingly AlphaFold doesn't have a prediction for dynein heavy chain - I wonder if it struggles with 4000 AA proteins???

kimrutherford commented 1 year ago

> Interestingly AlphaFold doesn't have a prediction for dynein heavy chain - I wonder if it struggles with 4000 AA proteins???

Yep: https://alphafold.ebi.ac.uk/faq#faq-19

"The minimum length is 16 amino acids, while the maximum is 2,700 for proteomes / Swiss-Prot and 1,280 for the rest of UniProt. For the human proteome only, our download includes longer proteins segmented into fragments."

So unfortunately no prediction for this one either: https://www.pombase.org/gene/SPBPJ4664.02
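If we want to guess up front which genes will have no prediction, the limits quoted from the FAQ can be turned into a simple check. This is only a sketch: the constant and function names are made up, and the numbers are just the ones quoted above, which AlphaFold may change.

```python
# Length limits as quoted from the AlphaFold FAQ above (may change over time)
ALPHAFOLD_MIN_LENGTH = 16
ALPHAFOLD_MAX_LENGTH_PROTEOME_OR_SWISSPROT = 2700
ALPHAFOLD_MAX_LENGTH_OTHER = 1280


def within_alphafold_length_limits(protein_length, in_proteome_or_swissprot=True):
    """True if a protein of this length falls inside the quoted AlphaFold limits."""
    if in_proteome_or_swissprot:
        max_length = ALPHAFOLD_MAX_LENGTH_PROTEOME_OR_SWISSPROT
    else:
        max_length = ALPHAFOLD_MAX_LENGTH_OTHER
    return ALPHAFOLD_MIN_LENGTH <= protein_length <= max_length
```

A protein of around 4000 AA fails this check, which fits the missing dynein heavy chain prediction. Passing the check does not guarantee that a prediction exists.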