Open dustine32 opened 2 years ago
Ah yes, here's the other one. Let's talk when you're back on making sure we're on the same page with https://github.com/geneontology/minerva/issues/448 .
Where is the GO-CAM browser site?
GO-CAM browser site [edited to correct perma-url -kltm]
@dustine32 has a possibly easy upfront fix with geneontology/pipeline#302. Testing now.
@pgaudet We may wrap this before even starting up. With the fix that @dustine32 has, http://geneontology.org/go-cam is now displaying 217 models and is quite snappy. Is this a sufficient fix for now?
Cool, I had totally forgotten about this. Note that the link to the models from the "Browse models" tab is broken.
http://noctua.geneontology.org/editor/graph/gomodel:5745387b00001516
Can you explain what this is doing?
217 models seem a bit modest to me - is that all the connected models we have?
I would expect this one to be in: gomodel:6246724f00001921. It does show in Alliance: https://www.alliancegenome.org/gene/HGNC:20749 but when I search for the gene name (ZDHHC20) in the GO-CAM browser, I get 0 results.
Also, I would expect gomodel:60ad85f700000058 NOT to be in; there are some connected activities, but these are binding, and more importantly, there are many unconnected activities. Interestingly, this one is NOT in the mouse Alliance Atf2 page: https://www.alliancegenome.org/gene/MGI:109349
Are we using different rules than Alliance?
Thanks, Pascale
@pmasson55 estimates that about 100 Swiss-Prot models might be missing
I thought it was low. I tried to browse by species. "elegans" gave 2 models.
@pgaudet In general, it's querying all models containing a chain of at least two consecutive causal relations ("causally upstream of or within" RO:0002418 or descendants) connecting MF nodes. The actual query is here.
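To make the criterion above concrete, here is a minimal Python sketch of one possible reading of it: a model qualifies if it contains a directed path of at least two causal edges whose nodes are all MF nodes. This is not the actual query; the edge-triple representation, the function name `has_causal_mf_chain`, and the toy node names are assumptions for illustration, and the caller is expected to supply the full set of RO:0002418 descendants.

```python
from collections import defaultdict


def has_causal_mf_chain(edges, mf_nodes, causal_relations, min_length=2):
    """Return True if some directed path of >= min_length causal edges links MF nodes.

    edges: iterable of (source, relation, target) triples (hypothetical format).
    mf_nodes: set of node ids that are molecular function activities.
    causal_relations: set of relation ids treated as causal
        (RO:0002418 plus whichever descendants the caller supplies).
    """
    # Keep only causal edges whose two endpoints are both MF nodes.
    graph = defaultdict(set)
    for source, relation, target in edges:
        if relation in causal_relations and source in mf_nodes and target in mf_nodes:
            graph[source].add(target)

    def longest_from(node, visited):
        # Depth-first search for the longest simple path starting at node,
        # stopping early once min_length is reached.
        best = 0
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                best = max(best, 1 + longest_from(nxt, visited | {nxt}))
                if best >= min_length:
                    return best
        return best

    return any(longest_from(node, {node}) >= min_length for node in list(graph))


CAUSAL = {"RO:0002418"}  # descendants would be added here
MFS = {"mf1", "mf2", "mf3", "mf4"}

# Toy model A: mf1 -> mf2 -> mf3, two consecutive causal edges.
model_a = [("mf1", "RO:0002418", "mf2"), ("mf2", "RO:0002418", "mf3")]
# Toy model B: one causal edge plus an unrelated part_of edge.
model_b = [("mf1", "RO:0002418", "mf2"), ("mf3", "BFO:0000050", "mf4")]

print(has_causal_mf_chain(model_a, MFS, CAUSAL))
print(has_causal_mf_chain(model_b, MFS, CAUSAL))
```

Under this reading, model A would be listed and model B would not. Whether particular activities such as binding should count toward the chain is a separate rule that this sketch does not attempt to encode.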
The query to retrieve models "by gene" for the Alliance site should be very similar to the "all causal models" query used for the GO-CAM browser site, but your examples here show that there is some difference that needs to be fixed. I'll make a ticket in the api-gorest-2021 repo to debug this. @pgaudet Thanks for the examples!
Thanks for the detailed reply @dustine32 !
@dustine32 I wanted to follow up on this a little and make sure this is current. Looking at our current script, we have:
wget http://localhost:8888/models?causalmf=2 -O gocam-models.json
that is generating the json that gets put into the release. This then gets propagated out (automatically) to the API now defined by https://github.com/geneontology/api-gorest-2023, correct? While we have a more expansive version of this with tagging models in metadata, this first pass is done, correct?
@kltm Oh right, I believe this ticket is done.
And actually, this GO-CAM site pulls these JSONs directly from S3 rather than using a GO-CAM API passthrough: https://github.com/geneontology/web-gocam/blob/cf224799ac5a644d104faa427f3695d4e8f6148c/src/app/core/gorest.service.ts#L37
@dustine32 yes, right you are; pushed on this bonus release pipeline: https://github.com/geneontology/pipeline/blob/f068cce4a3c1b84869e2f4cf501ac5b6c57e2e52/Jenkinsfile#L509
As suggested by @thomaspd, we should only show the most relevant (i.e., causal) GO-CAM models on the GO-CAM browser site.
I believe a fast way (perhaps we could call it a hack) to implement this is to just add the ?causalmf=2 parameter to the API call used to generate the cached gocam-models.json file. The other three files shouldn't require any change as they're just lookup files for the main gocam-models.json file.
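As a rough illustration of the proposed change, the Python sketch below builds the models URL with and without the filter, using the parameter spelling `causalmf` and the local endpoint from the pipeline's wget command in this thread. The helper names `build_models_url` and `download_models` are hypothetical, and the download step assumes a service is actually listening at that address.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve


def build_models_url(base_url, causalmf=None):
    """Append the causalmf filter as a query parameter when one is given."""
    if causalmf is None:
        return base_url
    return f"{base_url}?{urlencode({'causalmf': causalmf})}"


def download_models(url, out_path="gocam-models.json"):
    """Save the response to out_path (requires the service to be running)."""
    urlretrieve(url, out_path)
    return out_path


# Endpoint taken from the wget command in this thread.
BASE = "http://localhost:8888/models"

print(build_models_url(BASE))              # unfiltered: all models
print(build_models_url(BASE, causalmf=2))  # filtered: causal models only
```

Only the URL changes; the output file name stays the same, which is why the lookup files that point into gocam-models.json would not need to change.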