geneontology / geneontology.github.io

Repository for storing GO documentation, directly available through the general GO site
http://geneontology.org
MIT License

SPARQL endpoint page http://sparql.geneontology.org #267

Closed: lpalbou closed this issue 3 years ago

lpalbou commented 3 years ago

The documentation page http://geneontology.org/docs/sparql/ has been completed but is not yet referenced (https://github.com/geneontology/geneontology.github.io/issues/269).

The SPARQL entry point described in the GO article is http://sparql.geneontology.org. Currently, this redirects to a generic Blazegraph workbench page that makes no mention of GO and offers no example queries.

This ticket is to deploy to production the new UI page for the GO SPARQL endpoint:

[Screenshot: new UI page for the GO SPARQL endpoint]

Steps:


The following is just for reference, as @cmungall would prefer a proxy to a GH page. If it were to be done with CloudFront (CF)/S3/Route53, it could be done this way:

[Screenshot]

Result on my test architecture (http://sparql.geneontology.xyz/): [Screenshot]

It's important to use the correct bucket name (for S3 website hosting, the bucket must be named exactly after the subdomain), or you won't see that option for the routing. If you run into further difficulties, you could instead create a CF distribution for that bucket and route the subdomain to that CF distribution.
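For reference, the bucket-name constraint above can be sketched in Python. This is only an illustration, not the configuration actually used: the function name `s3_alias_change_batch`, the example endpoint name, and the placeholder zone ID are all assumptions. The returned dict follows the `ChangeBatch` shape that Route53's `ChangeResourceRecordSets` call expects; nothing here contacts AWS.

```python
def s3_alias_change_batch(hostname, bucket_name, s3_website_dns, s3_zone_id):
    """Build a Route53 ChangeBatch that aliases `hostname` to an S3 website endpoint.

    `s3_website_dns` and `s3_zone_id` are region-specific values that must be
    looked up in the AWS documentation; they are plain parameters here.
    """
    # S3 website hosting only works behind a Route53 alias when the bucket
    # is named exactly after the host it serves.
    if bucket_name != hostname.rstrip("."):
        raise ValueError(
            f"bucket {bucket_name!r} must be named after host {hostname!r}"
        )
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": hostname,
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": s3_zone_id,
                        "DNSName": s3_website_dns,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


# Hypothetical values: the zone ID is a placeholder, not a real identifier.
batch = s3_alias_change_batch(
    hostname="sparql.geneontology.xyz",
    bucket_name="sparql.geneontology.xyz",
    s3_website_dns="s3-website-us-east-1.amazonaws.com",
    s3_zone_id="ZONE_ID_PLACEHOLDER",
)
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

A mismatched bucket name raises `ValueError` before any record is built, which mirrors the "you won't see that option" symptom in the console.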

pgaudet commented 3 years ago

Comment 1: I think the examples are wrong here ?

Right @vanaukenk @thomaspd ? Thanks, Pascale


Comment 2:

Are we set on using Hsap and Cele etc to describe species ? Again this is a non-standard, non-intuitive representation.


Comment 3:

In the Table describing the relations, you could simplify by removing the 'Description' link and making the relations themselves clickable ?


Comment 4:

'part_of' is a BFO term but is also present in RO - maybe 'occurs_in' can also be added to RO, and in this case we may be able to claim we are using a single ontology ?

(that's it for now)

Thanks, Pascale

pgaudet commented 3 years ago

@vanaukenk @ukemi

lpalbou commented 3 years ago

Thanks for the feedback.

sequence specific DNA binding should be part_of DNA binding transcription factor activity, shouldn't it ?

I don't think that's right. First, I am not sure we allow an activity (sequence specific DNA binding) to be part of another activity (DNA binding transcription factor activity); see go-cam-shapes. The usual pattern is rather an activity that is part_of a biological process (BP). Second, the specific binding to DNA is what triggers the transcription factor activity; without it, there is no transcription. So I see this as a causal relationship, and part_of is not a causal relation. As a note, my thesis with Dino was on nuclear receptors such as RXR, RAR, VDR, GCR & co.

Are we set on using Hsap and Cele etc to describe species ? Again this is a non-standard, non-intuitive representation.

I would prefer the standard UniProt convention too; I think we discussed it once. But it's a larger issue, independent of this page, as this comes from the Noctua graph and our triplestore (and probably affects quite a lot of other resources). Maybe create a separate project, or at least a ticket on the Minerva repo?

In the Table describing the relations, you could simplify by removing the 'Description' link and making the relations themselves clickable ?

I don't have a strong opinion on this; if you think that's more readable, I can change it. My intent was to make the description explicitly visible and accessible. Just note that not all IRIs resolve to a web page (these ones do); they are just identifiers.

'part_of' is a BFO term but is also present in RO - maybe 'occurs_in' can also be added to RO, and in this case we may be able to claim we are using a single ontology ?

Currently, every part_of in GO-CAMs refers to the BFO relation, not an RO one, so this documentation has to reflect that for users to write valid queries. In the ontology world, I don't know whether it is better to state that we are using a single ontology. part_of should probably never be in two ontologies in the first place, unless they mean something different.
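To make the point about valid queries concrete, here is a rough Python sketch that builds a SPARQL query using the BFO IRI for part_of (BFO:0000050). The helper name `relation_query` and the `GRAPH ?model` layout are assumptions for illustration; the actual graph structure of the triplestore may differ, and nothing here sends a request to the endpoint.

```python
OBO = "http://purl.obolibrary.org/obo/"

# Relation IRIs live in BFO, so the query has to use the BFO identifiers.
PART_OF = OBO + "BFO_0000050"    # part_of
OCCURS_IN = OBO + "BFO_0000066"  # occurs_in


def relation_query(relation_iri, limit=10):
    """Return a SPARQL query listing subject/object pairs linked by one relation.

    Assumes (hypothetically) that each model is stored in its own named graph.
    """
    return (
        "SELECT ?model ?subject ?object WHERE {\n"
        "  GRAPH ?model {\n"
        f"    ?subject <{relation_iri}> ?object .\n"
        "  }\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(relation_query(PART_OF))
```

Swapping `PART_OF` for an RO identifier would still produce a syntactically valid query, but given the above it would not match the part_of edges in GO-CAMs.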

lpalbou commented 3 years ago

@kltm I pushed the sparql.html standalone page to this repo: https://github.com/geneontology/geneontology.github.io/commit/66d66751ac9c79f8c79f2705bb374414d91dfa71

If you have difficulties deploying it on sparql.geneontology.org, please refer to the S3/Route53 steps I took, described here: https://github.com/geneontology/geneontology.github.io/issues/267#issue-778497918. They give us this page and route: http://sparql.geneontology.xyz

Thanks

cmungall commented 3 years ago

Regarding the specific content to go on the static documentation page, let's set up another ticket for that page. As @lpalbou says:

This ticket is to deploy in production the new UI page for the GO SPARQL endpoint:

lpalbou commented 3 years ago

There is a little mix indeed, so let's be clear, there are two pages:

cmungall commented 3 years ago

I suggest a modification to the 2nd item in the checklist

As a general principle, I vastly prefer minimizing the number of moving pieces. I would rather not add an extra bucket just for deploying a static html file. While it is trivial, it adds to the overall complexity of the system's configuration, and introduces additional things that need to be synchronized.

My preference would be that all static html served to the public is done from a single source, which is this repo. We would simply check the html into this repo, and it would be deployed automatically. There is no need to set up an additional job to sync content from github to a new bucket. We would still need to have http://sparql.geneontology.org point at this page (3rd item in checklist).

lpalbou commented 3 years ago

Right. As it is, the page is already on GH and also accessible through http://geneontology.org/sparql. So if you prefer to avoid the S3 bucket, I will replace step 2 with @kltm creating a proxy from sparql.geneontology.org to http://geneontology.org/sparql.

kltm commented 3 years ago

Okay, playing around a little with this, it seems like I have found settings for both forwarding and full proxying. Given what I think the uses are right now, I'm a little more comfortable with having the forward in place (less mechanism, less likely to get weird later). The forward is currently applied. @cmungall Do you have an opinion which way to go here?
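For anyone wanting to check which of the two is in place, a small Python sketch follows. It is only an illustration: `classify` and `probe` are hypothetical helper names, and the heuristic is deliberately simple (a 3xx response with a `Location` header is read as a forward; a 2xx response under the original host is read as content being served there, which is what a full proxy would look like from outside).

```python
import http.client


def classify(status, location):
    """Label a response as a forward, as served in place, or as something else."""
    if 300 <= status < 400 and location:
        return "forward"   # client is sent to another URL
    if 200 <= status < 300:
        return "served"    # content delivered under the requested host
    return "other"


def probe(host, path="/"):
    """Send one HEAD request to `host` without following redirects."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return classify(resp.status, resp.getheader("Location"))
    finally:
        conn.close()


# Example call (needs network access), left commented out on purpose:
# print(probe("sparql.geneontology.org"))
```

`http.client` is used rather than a higher-level client because it does not follow redirects on its own, so the first response is the one being inspected.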

lpalbou commented 3 years ago

Proxy seems to be working.

Other issues / recommendations can be added as separate tickets, deployment done.