Closed ValWood closed 1 year ago
UniProt displays structures, e.g. https://www.uniprot.org/uniprot/Q6ZW76
btw I don't find http://biojs.io/d/bio-pv useful. I would want to be able to zoom in to residue level so that I can see molecular detail; the linked ribbon is just a pretty picture.
I was thinking something like this.
There would be very minimal functionality (if any) on the gene page... but if you click on the icon (or some obvious action icon) you would go to a full launch page where you could do everything you wanted to do.
The genome browser is easy, you would just go to the browser.
I like it!
-- Antonia Lock, PhD PomBase Biocurator, http://www.pombase.org Department of Genetics, Evolution and Environment, The Darwin Building, University College London London WC1E 6BT, UK
We could begin with Genome browser and Structure viewer, then add pathways later. It could also be a 4-way with the protein viewer eventually.... but then we might need to jiggle with the basic information....
See also: #988
UniProt use LiteMol: https://webchemdev.ncbr.muni.cz/LiteMol/
I've had a look and I think we could implement it for PomBase. We'd need to add the PDB entry IDs to Chado. I'm sure we can bulk download them somehow using the UniProt IDs.
This page might be useful: https://www.ebi.ac.uk/pdbe/pdb-component-library/doc.html
I'm sure we can bulk download them somehow using the UniProt IDs.
yes that should be possible. Some will have more than one structure (probably), and often have different "chains" broken up as separate entries. Some might only have the structure as part of a larger complex. We will probably only want to select one entry, so we may need some heuristic to select the best one to display once we know what is available.
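A minimal sketch of how the PDB IDs could be pulled out of UniProt entries before loading them into Chado. It assumes the UniProt REST API serves an entry as JSON at the URL below and lists cross-references under `uniProtKBCrossReferences`; both should be checked against the current UniProt documentation. The function names and the sample entry are made up for illustration.

```python
import json
import urllib.request

# Assumed endpoint for a single UniProt entry in JSON form.
UNIPROT_ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def pdb_ids_from_entry(entry):
    """Return the PDB IDs cross-referenced in one UniProt entry (a dict)."""
    return [
        xref["id"]
        for xref in entry.get("uniProtKBCrossReferences", [])
        if xref.get("database") == "PDB"
    ]


def fetch_entry(accession):
    """Download one UniProt entry as a dict (needs network access)."""
    url = UNIPROT_ENTRY_URL.format(accession=accession)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Hand-written sample in the assumed shape, not real UniProt output.
sample_entry = {
    "primaryAccession": "P00000",
    "uniProtKBCrossReferences": [
        {"database": "PDB", "id": "1ABC"},
        {"database": "Pfam", "id": "PF00000"},
        {"database": "PDB", "id": "2XYZ"},
    ],
}
print(pdb_ids_from_entry(sample_entry))  # ['1ABC', '2XYZ']
```

Keeping the parsing separate from the download means the same function would work on a bulk export, if one turns out to be available.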
so we may need some heuristic to select the best one
That sounds tricky. :-)
On the UniProt pages they have a selector on the right that lets the user select which structure to show. Should we do that too?
well it would be tricky to do space-wise if we plan to have the structure viewer in the top panel, it would take up quite a bit of space.
It might work if we have tabs like this: https://github.com/pombase/website/issues/988#issuecomment-430496265
that looks amazing!
Yes I really like it too. I thought I commented before...
I'm putting this as "next grant" because it's totally non-urgent. However it would be really nice, so if you fancy a diversion from Canto at any point....
so we may need some heuristic to select the best one
That sounds tricky. :-)
There are probably only 2 things we are interested in: coverage and resolution. Perhaps we can just select the structure with the best resolution that covers all "chains". It's not a huge deal if we do not get the best one every single time.
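The coverage-then-resolution idea could be sketched roughly like this. The `Structure` fields, the `pick_structure` name and the fallback rule are assumptions for illustration, not anything PomBase or PDB defines, and the sketch assumes every candidate has a numeric resolution (which will not hold for NMR entries).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Structure:
    pdb_id: str        # hypothetical IDs are used in the example below
    coverage: float    # fraction of the protein covered, 0.0 to 1.0
    resolution: float  # in angstroms; lower is better


def pick_structure(candidates: List[Structure],
                   min_coverage: float = 1.0) -> Optional[Structure]:
    """Prefer the best-resolution structure among those with full coverage.

    If nothing reaches min_coverage, fall back to the widest coverage,
    breaking ties on resolution. Returns None for an empty list.
    """
    full = [s for s in candidates if s.coverage >= min_coverage]
    if full:
        return min(full, key=lambda s: s.resolution)
    return max(candidates, key=lambda s: (s.coverage, -s.resolution),
               default=None)


candidates = [
    Structure("1AAA", coverage=1.0, resolution=2.5),
    Structure("2BBB", coverage=1.0, resolution=1.8),
    Structure("3CCC", coverage=0.4, resolution=1.2),
]
print(pick_structure(candidates).pdb_id)  # 2BBB
```

The fragment with the sharpest resolution (3CCC) loses here because it covers less than half the protein, which matches the "not a huge deal if we miss the best one" spirit: the rule is simple and predictable rather than optimal.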
I'm not sure that we have so many structures. I found browsing the pdb site that there is a species filter, but it has a cut-off at ~500 structures. Fission yeast did not make it into that list.
A rough (high level) list of tasks:
- ~Download all pombe pdb structures~ we will just show alpha fold by default
- Download all pombe Alpha Fold structures https://alphafold.ebi.ac.uk/download
- Decide between JSMol and LiteMol (these are probably even versions of the same thing?)
- ~Decide how to display. Probably best to include all EXP structures so maybe just default the highest resolution if there are more than one, and have tabs for others including Alpha fold~
Download all pombe pdb structures Download all pombe Alpha Fold structures
I'm hoping we don't need to download the structures once we have the PDB IDs to associate with the pombe genes. The LiteMol docs imply that it can use these APIs to get the structures on demand from PDB:
Even better!
Do you know (or could you ask) which tool UniProt use for displaying Alpha Fold and other pdb structures?
Aurelien says: It's using Mol* https://molstar.org/ embedded within our own Nightingale components https://ebi-webcomponents.github.io/nightingale/#/structure
also might be useful. UniProt curate active sites, modified amino acids, metal binding sites + propagate between orthologs. I don't think that is integrated
Thanks! We have been discussing the UniProt active site curation recently. InterMine will pull these into the pombemine so they could become available....
If you do need structures, I asked RCSB last night if you could get them by species. Their response:
You can browse by species at https://www.rcsb.org/search/browse/taxonomy After selecting the blue link in the branch of the tree that you are interested in, you can select the "Download Files" link at the top right of the results page. If there are many entries you will probably want to use the batch script at https://www.rcsb.org/docs/programmatic-access/batch-downloads-with-shell-script
There must be a web service for embedding Alpha fold structures?
if so we could add these, and then implement your snazzy https://github.com/pombase/website/issues/988
Maybe you can't do this directly. This paper mentions a number of the available viewers. I don't know if this is useful.
I like the viewer that AlphaFold pages use, but I can't tell what it is.
I like the viewer that AlphaFold pages use, but I can't tell what it is.
I had a look. It's Mol*: https://molstar.org/
I like that one. what are the options? Presumably this is a simpler task now that all we plan to do is display the AlphaFold structure and link to PDB in the main gene page structure viewer. We can later use the same, or a different, viewer linked to the protein structure viewer as necessary...
I had a look. It's Mol*: https://molstar.org/
I've re-read the thread and seen that Antonia had already said that. Sorry Antonia!
I like that one. what are the options?
What sort of options do you mean?
Here's an example of a minimal structure viewer using PDB's wrapper for Molstar: https://plnkr.co/edit/v0guJYMkyAIb43GT?preview
An example with an AlphaFold structure: https://plnkr.co/edit/Z2BGbGMwr9qa1nQu?preview
Sorry I meant what are the choices
An example with the AlphaFold structure: https://plnkr.co/edit/Z2BGbGMwr9qa1nQu?preview
Looks good. Exciting!
Thinking (way) ahead. We have an open item to create complex pages. People are doing nice complex structure predictions using AlphaFold, so we could represent these on the 'complex' pages. This would be super useful (obviously you also get complexes for structures, but it's very spotty)...
@manulera in case you are not following..
presumably this is Molstar
Their website says:
Mol* development was jointly initiated by PDBe and RCSB PDB to combine and build on the strengths of LiteMol (developed by PDBe) and NGL (developed by RCSB PDB) viewers.
I found a few other protein structure viewers:
But I think it makes sense to use the same viewer as AlphaFold and PDB.
Yes no point in looking at any others, everyone is familiar with this one and it will always be well supported.
Yes! That one looks very nice.
Prototype! https://desktop.kmr.nz/gene/SPAPB1A10.15
I've removed JBrowse temporarily in this version while I work on the structure view.
The "View full Q9HDX5 Alphafold page" text is just a placeholder.
This is going to be so awesome....... you will be the most popular person at the meeting in Japan ;)
Will need a way to deal with entries with no alpha fold. There will be entries with no prediction, but also entries with no uniprot accession number https://desktop.kmr.nz/gene/SPAC110.06 (dubious proteins are excluded from uniprot)
There will be entries with no prediction
I wonder if we can download a list of IDs with predictions from AlphaFold? I'll check that soon.
but also entries with no uniprot accession number
I'll fix that by not trying to show the structure if there's no UniProt ID for the gene.
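A rough sketch of that guard, covering both cases raised above (no UniProt ID, and a UniProt ID with no prediction). The entry URL pattern and the idea of a pre-loaded set of accessions with predictions are assumptions; the function name is made up.

```python
from typing import Optional, Set

# Assumed URL pattern for an AlphaFold DB entry page.
ALPHAFOLD_ENTRY_URL = "https://alphafold.ebi.ac.uk/entry/{accession}"


def alphafold_page(uniprot_id: Optional[str],
                   ids_with_predictions: Set[str]) -> Optional[str]:
    """Return the AlphaFold page for a gene, or None if there is nothing to show.

    None is returned when the gene has no UniProt ID (e.g. dubious
    proteins) or when the ID is not in the set of known predictions.
    """
    if not uniprot_id or uniprot_id not in ids_with_predictions:
        return None
    return ALPHAFOLD_ENTRY_URL.format(accession=uniprot_id)


# Hypothetical set of accessions known to have a prediction.
known = {"Q9HDX5"}
print(alphafold_page("Q9HDX5", known))
print(alphafold_page(None, known))  # None
```

The gene page would then only render the structure tab when the function returns a URL.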
On Monday I'll put a prototype in dev.pombase.kmr.nz.
I've done that: http://dev.pombase.kmr.nz/gene/SPCC18B5.03
The main site is unchanged for now.
On my desktop I'm working on adding buttons. We should chat on Zoom about how that should work.
On my desktop I'm working on adding buttons.
Here's my current work in progress: https://desktop.kmr.nz/gene/SPCC18B5.03 (That link will work until noon or so)
Looks fab.
I think we should aim to make the windows a similar size.
some thoughts: We should reduce the width, and increase the depth of the genome browser window so that it is the same aspect as the structure. It is set at the current width to cope with dynein at this resolution.
Increasing the depth will improve the scroll bar bug which means that often forward strand features are hidden by default.
I suggest also
Make the window width 14k, which would show all of dynein but only a tiny proportion of the flanking region (at the same resolution), then zoom the view resolution until it fits the width. Some views will be more crowded, but this is really only for orientation, and will encourage people to go to the browser (the only other option is to change the scale in a gene-dependent way, which would be confusing). If it looks very squashed we can make AlphaFold a bit wider.
As you suggested move the 'flanking genes' to only display on the browser view (I like the AlphaFold external links). With the above changes, there will be fewer flanking genes.
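For illustration, a fixed-width window could be computed along these lines. It reads "14k" as 14,000 bases, centres the window on the gene and clamps it to the chromosome; the function name and the coordinates in the example are invented, and the result is written in the `chr:start..end` form that JBrowse uses for locations.

```python
def browser_region(chromosome, gene_start, gene_end,
                   chromosome_length, width=14_000):
    """Return a fixed-width location string centred on the gene.

    The window is shifted, not shrunk, when it would run off either
    end of the chromosome. A gene longer than `width` is not fully shown.
    """
    centre = (gene_start + gene_end) // 2
    start = max(1, centre - width // 2)
    end = min(chromosome_length, start + width - 1)
    # Re-anchor the start if the end was clamped at the chromosome edge.
    start = max(1, end - width + 1)
    return f"{chromosome}:{start}..{end}"


# Made-up coordinates on a hypothetical 5 Mb chromosome "I".
print(browser_region("I", 100_000, 102_000, 5_000_000))  # I:94000..107999
```

Because the width is the same for every gene, the scale stays constant across gene pages, which is the point made above about avoiding a gene-dependent scale.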
As you suggested move the 'flanking genes' to only display on the browser view
I meant that we could have it at the top in all cases. I think it's more useful when JBrowse isn't visible.
Increasing the depth will improve the scroll bar bug which means that often forward strand features are hidden by default.
I agree with this. Even if I know it, sometimes it is confusing that I don't immediately see the gene from the gene page in JBrowse.
What do you mean by hiding the flanking genes?
We should reduce the width, and increase the depth of the genome browser window so that it is the same aspect as the structure.
Does it need to be the same width? We can only see one widget at a time.
I meant that we could have it at the top in all cases. I think it's more useful when JBrowse isn't visible.
good point.
Maybe we could just have a link to the 5'flanking protein and the 3'flanking protein either side (it could even be in the summary...?)
Does it need to be the same width? We can only see one widget at a time.
They don't absolutely need to be the same width, but it would look slicker.
Also I think it could be a slight improvement for the genome browser view to see more depth and fewer flanking genes.
Re: "hiding flanking genes" refers to this list:
I've made JBrowse the same height as the structure view on my desktop: https://desktop.kmr.nz/gene/SPAC1093.06c
They don't absolutely need to be the same width, but it would look slicker.
In that case could we make the structure viewer as wide as JBrowse?
Also I think it could be a slight improvement for the genome browser view to see more depth and fewer flanking genes.
We can do that without making it narrower though.
Let's chat about the aspect ratios tomorrow. It seems there are quite a few variables that could be adjusted.
Interestingly AlphaFold doesn't have a prediction for dynein heavy chain - I wonder if it struggles with 4000 AA proteins???
Interestingly AlphaFold doesn't have a prediction for dynein heavy chain - I wonder if it struggles with 4000 AA proteins???
Yep: https://alphafold.ebi.ac.uk/faq#faq-19
"The minimum length is 16 amino acids, while the maximum is 2,700 for proteomes / Swiss-Prot and 1,280 for the rest of UniProt. For the human proteome only, our download includes longer proteins segmented into fragments."
So unfortunately no prediction for this one either: https://www.pombase.org/gene/SPBPJ4664.02
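The limits quoted from the FAQ can be turned into a quick check, sketched below. The numbers come straight from the quote; the function name is made up, and the human-only fragment exception is deliberately ignored since it does not apply to pombe.

```python
# Length limits as quoted from the AlphaFold FAQ above.
MIN_LENGTH = 16
MAX_LENGTH_PROTEOME = 2700  # proteomes / Swiss-Prot
MAX_LENGTH_OTHER = 1280     # the rest of UniProt


def within_alphafold_limits(length, in_proteome_or_swissprot=True):
    """Return True if a protein of this length could have a prediction."""
    max_length = (MAX_LENGTH_PROTEOME if in_proteome_or_swissprot
                  else MAX_LENGTH_OTHER)
    return MIN_LENGTH <= length <= max_length


print(within_alphafold_limits(4000))  # False, roughly the dynein heavy chain case
print(within_alphafold_limits(350))   # True
```

Passing the length check does not guarantee that a prediction exists; it only rules out proteins that are certainly too short or too long.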
http://www.rcsb.org/pdb/software/wsreport.do You can search and then create a custom report: https://www.rcsb.org/docs/programmatic-access/web-services-overview e.g https://www.rcsb.org/3d-view/jsmol/2GFU/1