intermine / pombemine

Errors on gene summary pages (visualizations AND templates) #38

Closed. ValWood closed this issue 2 years ago.

ValWood commented 2 years ago

On the 'gene summary' pages, a number of the visualisations are not configured: MSA viewer, Protein visualizer, JBrowse.

All of the templates are giving "invalid query".

@kimrutherford

(We may not display everything, but I will wait for these to be fixed before I attempt to configure the page)

ValWood commented 2 years ago

Many templates still look like this:

[Screenshot 2022-05-18 at 09 36 43]

@manulera @kimrutherford

ValWood commented 2 years ago

Hi @heralden, should this issue be moved onto the BlueGenes tracker?

heralden commented 2 years ago

~The templates issue is fixed in version 1.4.2. I will ask Daniela about upgrading it.~ DONE

The MSA viewer and JBrowse each require a backend service, and they are probably trying to use HumanMine's and FlyMine's services, which don't have data for S. pombe. The protein visualizer probably has a similar problem. I recommend you disable these visualizations for now, unless you want to spend time making them work (although there's some remaining work we want to do to make it easier to use a custom backend service for these).

ValWood commented 2 years ago

We don't need a JBrowse instance (we already have one). I don't know if that can be linked or used instead? I don't think multiple sequence alignments are a priority.

I can't find an example of "protein visualizer" so I am not sure what it is.

What about the query templates? Is a fix in progress for this? Or do I need to do something to the configuration?

ValWood commented 2 years ago

Re:

> What about the query templates? Is a fix in progress for this? Or do I need to do something to the configuration?

Ignore this. I see your comment.

heralden commented 2 years ago

You could in theory set the jbrowse in Bluegenes to use your jbrowse service instead (https://github.com/intermine/bluegenes-jbrowse-tool/blob/e911fb3a3c6e107252d2c746c6f662822cb3ae7e/config.json#L21) but this is a bit tricky to do right now. You have to edit the config.json in the tools folder.

The protein visualizer gives you a 3D model of the protein using RCSB PDB. This requires UniProt accessions being present in pombemine's database and the PDB having data for these.

heralden commented 2 years ago

@ValWood I have updated pombemine bluegenes to 1.4.2 now, so the templates on the report page should work.

ValWood commented 2 years ago

> The protein visualizer gives you a 3D model of the protein using RCSB PDB. This requires UniProt accessions being present in pombemine's database and the PDB having data for these.

We have the UniProt accessions, so this should already be possible. If this is configuration only, @kimrutherford could you take a look?

@heralden Do you plan to also use/make the AlphaFold data accessible? The quality is phenomenal and for known structures the PDB species coverage will be very low, while AlphaFold covers most proteins in a species.

ValWood commented 2 years ago

> You could in theory set the jbrowse in Bluegenes to use your jbrowse service instead (https://github.com/intermine/bluegenes-jbrowse-tool/blob/e911fb3a3c6e107252d2c746c6f662822cb3ae7e/config.json#L21) but this is a bit tricky to do right now. You have to edit the config.json in the tools folder.

@kimrutherford is this something you could do?

ValWood commented 2 years ago

Re:

> @ValWood I have updated pombemine bluegenes to 1.4.2 now, so the templates on the report page should work.

None of the reports is now giving an error.

One thing seems odd. Most reports are focused on the gene of interest, in this case cdc2: http://pombemine.rahtiapp.fi/pombemine/report/Gene/1072605

But the ortholog and disease sections report all genes, they are not constrained on cdc2. For example:

[Screenshot 2022-05-26 at 22 04 51]

kimrutherford commented 2 years ago

> But the ortholog and disease sections report all genes, they are not constrained on cdc2.

Maybe that's because the constraint is off by default? Those two templates report all genes by default.
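To illustrate what "off by default" would mean here, a toy sketch (not InterMine code; the `Constraint` class, `run_template` function and sample rows are all hypothetical and only model the behaviour described above): when the constraint is switched off, the template returns every gene, and only when it is switched on does it narrow to the gene on the report page.

```python
from dataclasses import dataclass


@dataclass
class Constraint:
    """Hypothetical stand-in for a template constraint that can be switched on or off."""
    field: str
    value: str
    switched_on: bool = False  # assumption: models a constraint that is off by default


def run_template(rows, constraint):
    """Return the rows matching the constraint, or every row when it is switched off."""
    if not constraint.switched_on:
        return list(rows)
    return [row for row in rows if row.get(constraint.field) == constraint.value]


# Illustrative sample rows only, not data taken from pombemine.
GENES = [
    {"symbol": "cdc2", "ortholog": "CDK1"},
    {"symbol": "cdc25", "ortholog": "CDC25A"},
    {"symbol": "wee1", "ortholog": "WEE1"},
]

if __name__ == "__main__":
    print(len(run_template(GENES, Constraint("symbol", "cdc2"))))
    print(run_template(GENES, Constraint("symbol", "cdc2", switched_on=True)))
```

If that is what is happening, switching the constraint on in the template definition (rather than suppressing the template) would be another option.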

kimrutherford commented 2 years ago

> We have the UniProt accessions, so this should already be possible. If this is configuration only, @kimrutherford could you take a look?

The UniProt IDs are already loaded.

Sorry, I don't know how to configure it. I'm happy to have a go if someone could let me know where the documentation is.

> is this something you could do?

Sorry I don't know how the JBrowse configuration works.

ValWood commented 2 years ago

> Maybe that's because the constraint is off by default? Those two templates report all genes by default.

OK, I will probably suppress those templates (if I can...)

heralden commented 2 years ago

> @heralden Do you plan to also use/make the AlphaFold data accessible? The quality is phenomenal and for known structures the PDB species coverage will be very low, while AlphaFold covers most proteins in a species.

I haven't looked into their public API but it seems NGL supports it (https://github.com/nglviewer/ngl/blob/b46c97355e2727d4f162fa27d13f803b05888785/examples/scripts/test/alphafold.js#L4) so I'll add a note to look into it next time we update the protein visualizer.
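As a rough sketch of what the lookup might involve: since the UniProt accessions are already loaded, a model file URL could probably be derived from the accession alone. The function name below is hypothetical, and the file naming pattern and default version number are assumptions based on how AlphaFold DB has published its model files; both would need checking against the current AlphaFold DB before use.

```python
def alphafold_model_url(uniprot_accession: str, version: int = 4) -> str:
    """Build a candidate AlphaFold DB model URL for a UniProt accession.

    Assumption: files follow the AF-<accession>-F1-model_v<version>.pdb pattern.
    The version number changes between AlphaFold DB releases.
    """
    accession = uniprot_accession.strip().upper()
    if not accession.isalnum():
        raise ValueError(f"not a plausible UniProt accession: {uniprot_accession!r}")
    return f"https://alphafold.ebi.ac.uk/files/AF-{accession}-F1-model_v{version}.pdb"


if __name__ == "__main__":
    # P04551 is used here only as an example accession.
    print(alphafold_model_url("P04551"))
```

This only builds the URL; it does not check that a model exists for the accession.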

> Sorry, I don't know how to configure it. I'm happy to have a go if someone could let me know where the documentation is.

@kimrutherford There is the tool documentation: http://intermine.org/im-docs/docs/webapp/tool-api/overview

In short, you have to edit the config.json file inside the tool directory under the BlueGenes (BG) tool path. Unfortunately this isn't trivial when BG is running on the cloud. Do you have access to the CSC cloud? Since the BG container is minimal and doesn't have a shell, we'll need to create a new pod that mounts the same PVC and edit the file from there. I'm not sure whether the current storage class supports being mounted from several pods.
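Once the file is reachable, the edit itself is small. A minimal sketch, assuming the tool's config.json is plain JSON; the helper name `set_config_value`, the example path and the key names are hypothetical, and the real key to change is whatever appears in the linked config.json:

```python
import json
from pathlib import Path


def set_config_value(config_path, key_path, value):
    """Set a (possibly nested) key in a JSON config file and write the file back."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    node = config
    # Walk down to the parent of the final key, creating objects as needed.
    for key in key_path[:-1]:
        node = node.setdefault(key, {})
    node[key_path[-1]] = value
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Hypothetical path and key names, for illustration only.
    demo = Path("demo-config.json")
    demo.write_text('{"toolName": "example"}', encoding="utf-8")
    print(set_config_value(demo, ["service", "url"], "https://example.org/jbrowse"))
```

It is worth keeping a copy of the original file before overwriting it, since this rewrites the whole document with new indentation.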

ValWood commented 2 years ago

For us the visualization is not so urgent; I'm most concerned that the data is modelled correctly, so we can close this for now.