geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License

New endpoint connection between Capella and Noctua #147

Closed kltm closed 8 years ago

kltm commented 9 years ago

This is essentially a continuation of #51.

From the call today, Hans-Michael (replace with github ID) will provide an example payload, as used for kicking to protein2go. Using this, noctua will:

Future iterations could also involve the user being able to log in ahead of time (capella keeps the token) and/or doing the seeding remotely from capella with the API.

It is expected to go fairly easily as #51 established most of the code and paths.

goldturtle commented 9 years ago

Hans-Michael's github id is goldturtle :-)

kltm commented 9 years ago

Cheers! I've added you to the geneontology/noctua-community group. We'll see how this workflow goes and maybe change it later.

kltm commented 8 years ago

@vanaukenk will provide some feedback here as well. It would also be useful to have a description (lotsa detail) for the protein2go protocol/interaction. We could use that as a starting point as well.

goldturtle commented 8 years ago

Sample posts, in URI format and as a JSON blob, are as follows. This is only a model; the parameters will change.

Using URI:

http://www.ebi.ac.uk/internal-tools/protein2go/InsertAnnotation?userid=test:vanauken@caltech.edu&source=BHFL&qualifier=NOT&db=UniProtKB&db_object_id=A0B0005&go_id=GO:0005040&evidence=IC&reference=PMID:10488343&with_from=&annotation_extension=&interacting_taxon_id=&annotation_date=2015-11-13 14:25:02&comments=
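For readers who want to see which parameters that sample URI carries, here is a minimal Python sketch that splits it into name/value pairs using only the standard library. The URL is copied from the sample above; the variable names are illustrative and not part of protein2go or Noctua.

```python
from urllib.parse import urlsplit, parse_qs

# Sample URI from the comment above, split across lines for readability.
url = (
    "http://www.ebi.ac.uk/internal-tools/protein2go/InsertAnnotation"
    "?userid=test:vanauken@caltech.edu&source=BHFL&qualifier=NOT&db=UniProtKB"
    "&db_object_id=A0B0005&go_id=GO:0005040&evidence=IC&reference=PMID:10488343"
    "&with_from=&annotation_extension=&interacting_taxon_id="
    "&annotation_date=2015-11-13 14:25:02&comments="
)

# keep_blank_values=True keeps parameters that are present but empty.
params = parse_qs(urlsplit(url).query, keep_blank_values=True)

# Each parameter occurs once here, so take the first value of each list.
flat = {name: values[0] for name, values in params.items()}
empty = sorted(name for name, value in flat.items() if value == "")

for name, value in flat.items():
    print(name, "=", repr(value))
print("empty parameters:", empty)
```

Note that the sample URI contains an unescaped space in annotation_date; a stricter client would percent-encode it before sending.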

Using JSON, blob is like this:

{ "additional annotations" : "[7957|7971|nervous system|nervous system (WBbt:0005735)] [7965|7971|system|mf interaction assay (tpmfia:0000001)]", "annotator" : "goldturtle", "creation time" : "2015-11-13 14:40:10", "db" : "UniProtKB", "db_object_id" : "A0A005", "evidence" : "ECO:0000314", "filename" : "/PMCOA Biology/PLoS_Biol_2011_Aug_9_9(8)_e1001121/pbio.1001121.tpcas.gz", "go_id" : "GO:0050689", "paper id" : "PMID 21857800", "positions" : "(7953,7996)", "reference" : "PMID:666333", "terms" : "The nervous system of most animals consists ", "version" : "1.0" }

kltm commented 8 years ago

Terrific--we'll use this as a jumping-off point and refresh the API with an eye towards it.

To clarify, is the URL you have there the GET (parameters in the URL) version of the POST (parameters in the body, and possibly in the URL as well) that you'd do? I just want to make sure I know a large enough set of the parameters involved.

goldturtle commented 8 years ago

Yes, I POST this, but all of the parameters are in the URL.

kltm commented 8 years ago

Understood--thanks!

kltm commented 8 years ago

In discussion with @goldturtle and @vanaukenk; a new proposed format:

{
"model_id": "gomodel:01234567",  // optional
"barista_token": "sdlkjslkjd",
"db" : "UniProtKB",
"db_object_id" : "A0A005",
"evidence" : "ECO:0000314",
"class_id" : "GO:0050689",
"reference" : "PMID:666333",
"textspresso_id" : "XXX:YYYYYYY",
"comments" : ["foo", "bar"]          // optional, default empty
}

"comments" is slightly structured text, useful for curators.

kltm commented 8 years ago

POST URL (JSON return) for experimentation is at:

http://noctua.berkeleybop.org/tractorbeam

Arguments are TBD, but will roughly match the JSON blob above; I will add details as they are implemented.

kltm commented 8 years ago

The JSON blob as described in https://github.com/geneontology/noctua/issues/147#issuecomment-178803868 has been modified with two additional arguments: "model_id" and "barista_token". These are necessary as part of the round trip and would have been given as part of the opening action of #283.

cmungall commented 8 years ago

On 2 Feb 2016, at 12:26, kltm wrote:

In discussion with @goldturtle and @vanaukenk; a new proposed format:

{
"db" : "UniProtKB",
"db_object_id" : "A0A005",
"evidence" : "ECO:0000314",
"go_id" : "GO:0050689",

Should be term_id or class_id

We may want to send cell types IDs, etc...

These will end up 'on the floor' with the curator connecting together

"reference" : "PMID:666333", "textspresso_id" : "XXX:YYYYYYY", "comments" : ["foo", "bar"] }

"comments" is slightly structured text, useful for curators.

kltm commented 8 years ago

Updated to "class_id".

kltm commented 8 years ago

Will need to discuss with @cmungall and @hdietze about modelling the "textspresso_id" before finalizing.

kltm commented 8 years ago

Discussed with @cmungall and @hdietze : we'll model these as the evidence individuals, but with a different URI. This will be added to the backend by @hdietze . In the meantime, I'll just add it in as a comment.

kltm commented 8 years ago

To make this more abstract for any incoming datasource to make use of, I've simplified the JSON a bit:

{
 "model_id": "gomodel:01234567", // optional, create new model if not extant
 "barista_token": "sdlkjslkjd", // required
 "database_id" : "WB:WBGene00005794", // required
 "evidence_id" : "ECO:0000314", // required
 "class_id" : "GO:0050689", // required
 "reference_id" : "PMID:666333", // required
 "external_id" : "XX:YYYYYYY", // optional
 "comments" : ["foo", "bar"]          // optional, default empty
}

If okayed by @goldturtle , this will be the provisional format accepted by /tractorbeam.
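As a rough illustration of the required/optional split in the provisional format above, a client could check its fields before sending. This is only a sketch: the function name and the error strings are hypothetical, and the real /tractorbeam endpoint may validate differently.

```python
# Field names taken from the provisional format above.
REQUIRED_FIELDS = ("barista_token", "database_id", "evidence_id",
                   "class_id", "reference_id")
OPTIONAL_FIELDS = ("model_id", "external_id", "comments")


def check_tractorbeam_fields(fields):
    """Return a list of problems; an empty list means the fields look usable.

    Hypothetical client-side helper, not part of Noctua itself.
    """
    problems = []
    for name in REQUIRED_FIELDS:
        if not fields.get(name):
            problems.append("missing required field: " + name)
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    for name in fields:
        if name not in known:
            problems.append("unknown field: " + name)
    return problems


example = {
    "barista_token": "sdlkjslkjd",
    "database_id": "WB:WBGene00005794",
    "evidence_id": "ECO:0000314",
    "class_id": "GO:0050689",
    "reference_id": "PMID:666333",
    "external_id": "XX:YYYYYYY",
    "comments": ["foo", "bar"],
}
print(check_tractorbeam_fields(example))
```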

kltm commented 8 years ago

Using httpie, access commands could look like:

New model:

http --form localhost:8910/tractorbeam barista_token=123 database_id=WB:WBGene00005794 evidence_id=ECO:0000314 class_id=GO:0050689 reference_id=PMID:666333 external_id=XX:YYYYYYY

Add to model:

http --form localhost:8910/tractorbeam barista_token=123 database_id=WB:WBGene00005794 evidence_id=ECO:0000314 class_id=GO:0050689 reference_id=PMID:666333 external_id=XX:YYYYYYY model_id=gomodel:56b285cf00000012
kltm commented 8 years ago

Now in production.

kltm commented 8 years ago

To clarify https://github.com/geneontology/noctua/issues/147#issuecomment-179556417 : while I'm using a JSON blob to define the variables, the actual arguments are not sent as a JSON blob. They are just application/x-www-form-urlencoded form fields, as demonstrated by the examples using httpie above.
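For clients that do not use httpie, the same form-encoded request can be sketched with the Python standard library. This assumes the local development address used in the httpie examples; the request is only constructed here, not sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Same fields as in the httpie "--form" examples.
fields = {
    "barista_token": "123",
    "database_id": "WB:WBGene00005794",
    "evidence_id": "ECO:0000314",
    "class_id": "GO:0050689",
    "reference_id": "PMID:666333",
    "external_id": "XX:YYYYYYY",
}

# urlencode percent-encodes reserved characters such as ":".
body = urlencode(fields).encode("ascii")

request = Request(
    "http://localhost:8910/tractorbeam",  # development address from the examples
    data=body,                            # the fields travel in the message body
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(body.decode("ascii"))
# To actually send it: urllib.request.urlopen(request)
```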

goldturtle commented 8 years ago

Hi there, I am trying to post to noctua, but to no avail. See server response below. I probably don't have the correct token; if that's the case then please send one via email.

Thanks, Michael.

Server: http://noctua.berkeleybop.org/tractorbeam?barista_token=123&database_id=WB:WBGene00005794&comments=meh&model_id=gomodel:01234567&evidence_id=ECO:0000314&class_id=GO:0050689&reference_id=PMID:666333&external_id=TPDBID:0000181

Response: 404 {"message-type":"error","message":"no POST data","commentary":"meh"}

kltm commented 8 years ago

This is currently correct: the endpoint only accepts POST right now (https://github.com/geneontology/noctua/issues/147#issuecomment-179556552). In the current framework, there is no general abstraction over GET and POST for this specific use case, so we started with POST.

kltm commented 8 years ago

From https://github.com/geneontology/noctua/issues/147#issuecomment-180095475 it's an application/x-www-form-urlencoded body.

kltm commented 8 years ago

This command still works for me:

http --form localhost:8910/tractorbeam barista_token=123 database_id=WB:WBGene00005794 evidence_id=ECO:0000314 class_id=GO:0050689 reference_id=PMID:666333 external_id=FOO:0123456
goldturtle commented 8 years ago

The localhost URI doesn't do me any good.

I am posting. If I cut and paste the URL provided above into a browser, which performs a GET, my mistake is correctly pointed out. Response: {"message-type":"error","message":"no GET endpoint","commentary":"try POST instead of GET"}

kltm commented 8 years ago

Corrected to point to the production server as an example (not recommended for typical development though):

http --form noctua.berkeleybop.org/tractorbeam barista_token=123 database_id=WB:WBGene00005794 evidence_id=ECO:0000314 class_id=GO:0050689 reference_id=PMID:666333 external_id=FOO:0123456

One cannot cut and paste into a regular browser a POST--most browsers will default to a GET.

goldturtle commented 8 years ago

One cannot cut and paste into a regular browser a POST--most browsers will default to a GET.

I was aware of it, I just posted the browser example to test that GET indeed doesn't work.

What puzzles me is that even though my POST is x-www-form-urlencoded, your application still expects some POSTDATA, as indicated by the original response:

{"message-type":"error","message":"no POST data","commentary":"meh"}

If I post x-www-form-urlencoded, I leave the message body empty, hence this response.

So it seems your app is analyzing and expecting POSTDATA when I send something x-www-form-urlencoded. httpie seems to repeat all parameters in the body, because if I add all parameters to the body by hand:

// message.addBodyText(msg); // msg="" if iparamformat == CurationFormsConfiguration::URI
message.addBodyText("barista_token=123&database_id=WB:WBGene00005794"
                    "&evidence_id=ECO:0000314&class_id=GO:0050689&reference_id=PMID:666333"
                    "&external_id=FOO:0123456");

I get the correct Response from server: {"packet-id":"459f939415aefe127","uid":"GOC:kltm","is-reasoned":false,"intention":"action","signal":"rebuild","message-type":"success","message":"success","data":{"modified-p":true,"id":"gomodel:56cbaef000000065","individuals":[{"id":"gomodel:56cbaef000000065/56cbaef000000067","type":[{"type":"class","id":"WB:WBGene00005794","label":"srw-47 Cele"}],"annotations":[{"key":"contributor","value":"GOC:kltm"},{"key":"date","value":"2016-02-23"}]},{"id":"gomodel:56cbaef000000065/56cbaef000000066","type":[{"type":"class","id":"GO:0050689","label":"negative regulation of defense response to virus by host"}],"annotations":[{"key":"date","value":"2016-02-23"},{"key":"contributor","value":"GOC:kltm"}]},{"id":"gomodel:56cbaef000000065/56cbaef000000068","type":[{"type":"class","id":"ECO:0000314","label":"direct assay evidence used in manual assertion"}],"annotations":[{"key":"source","value":"PMID:666333"},{"key":"date","value":"2016-02-23"},{"key":"contributor","value":"GOC:kltm"}]}],"facts":[{"subject":"gomodel:56cbaef000000065/56cbaef000000066","property":"RO:0002333","object":"gomodel:56cbaef000000065/56cbaef000000067","annotations":[{"key":"evidence","value":"gomodel:56cbaef000000065/56cbaef000000068","value-type":"IRI"},{"key":"contributor","value":"GOC:kltm"},{"key":"date","value":"2016-02-23"},{"key":"comment","value":"FOO:0123456"}]}],"properties":[{"type":"property","id":"RO:0002333","label":"enabled_by"}],"annotations":[{"key":"date","value":"2016-02-23"},{"key":"contributor","value":"GOC:kltm"},{"key":"state","value":"development"}]}}
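A client will usually want only a few pieces of such a response, for example whether it succeeded and which model was touched. The sketch below works on a trimmed excerpt of the response above (most keys omitted for brevity); the function name is hypothetical, and it assumes that error packets carry "message-type" and "message" as in the earlier "no POST data" reply.

```python
import json

# Trimmed excerpt of the success response shown above.
response_text = (
    '{"message-type":"success","message":"success",'
    '"data":{"id":"gomodel:56cbaef000000065","individuals":['
    '{"id":"gomodel:56cbaef000000065/56cbaef000000067",'
    '"type":[{"type":"class","id":"WB:WBGene00005794","label":"srw-47 Cele"}]}'
    ']}}'
)


def summarize(text):
    """Return (model id, class labels) or raise if the packet reports an error."""
    packet = json.loads(text)
    if packet.get("message-type") != "success":
        raise RuntimeError(packet.get("message", "unknown error"))
    data = packet["data"]
    labels = [
        entry["label"]
        for individual in data.get("individuals", [])
        for entry in individual.get("type", [])
        if "label" in entry
    ]
    return data["id"], labels


model_id, labels = summarize(response_text)
print(model_id, labels)
```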

I think it would be good if your application doesn't analyze the message body when posting x-www-form-urlencoded, or at least doesn't break.

goldturtle commented 8 years ago

I think it would be good if your application doesn't analyze the message body when posting x-www-form-urlencoded, or at least doesn't break.

Actually, I'd like to modify that; let's do whatever the common convention on this is. (One would need to look that up on the web). I know EBI is analyzing the URL only, but not the message body.

kltm commented 8 years ago

@goldturtle I'm not sure I follow you here, or the alternative being proposed. While one could extract URL parameters from both GET and POST requests, the size and nature of the requests in our case make them unsuitable for URL parameter (and thus GET) requests. When using POST, the use of the body to transfer data is how it is supposed to operate. And when encoding data in the body, one has to choose a particular mechanism, which in our case is the standard x-www-form-urlencoded. How to accomplish these things is going to be particular to the client that one uses. I'll give examples with other clients.
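To make the distinction concrete, here is a small Python sketch contrasting the two ways of attaching the same parameters to a POST: in the URL with an empty body (the shape of request that produced "no POST data"), and in the body (what the endpoint reads). The address is the development one from the httpie examples, and nothing is sent.

```python
from urllib.parse import urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://localhost:8910/tractorbeam"  # development address from the examples

fields = {
    "barista_token": "123",
    "database_id": "WB:WBGene00005794",
    "evidence_id": "ECO:0000314",
    "class_id": "GO:0050689",
    "reference_id": "PMID:666333",
    "external_id": "FOO:0123456",
}
encoded = urlencode(fields)

# Parameters in the URL, empty body: still a POST, but carries no POST data.
in_url = Request(ENDPOINT + "?" + encoded, data=b"", method="POST")

# Parameters in the body: the same encoding, moved out of the URL.
in_body = Request(ENDPOINT, data=encoded.encode("ascii"), method="POST")

for label, req in (("in URL", in_url), ("in body", in_body)):
    has_query = urlsplit(req.full_url).query != ""
    print(label, "| query string:", has_query, "| body bytes:", len(req.data))
```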

kltm commented 8 years ago

Whoops, sorry, crossed there. Shall I still give some examples with other client calls?

goldturtle commented 8 years ago

OK, there must be a convention for that, as I said. Let's follow that convention, because Textpresso will have other clients, as you do.

goldturtle commented 8 years ago

...and with clients I mean customers, not http client ;-)

kltm commented 8 years ago

Okay, let's see where we are and maybe future-proof this a bit. We currently have (ignoring the JSON formatting) the following arguments: https://github.com/geneontology/noctua/issues/147#issuecomment-179556417 The main thing that makes it difficult to encode as a URL is the "comments" list, which could be quite long. Do you see a case where a user would like to send multiple requests to the noctua endpoint in a single batch? The theoretical use case would be that somebody did a bunch of work in Textpresso and wants to export multiple annotations into the same model. Would this be a possible future scenario? If so, we might want to abstract the payload into an encoded JSON blob itself. Thinking about that, I can think of a few cases where batch might be a good idea. Any thoughts there?

goldturtle commented 8 years ago

For user scenarios, we should also consult with our model-curator, Kimberly. I would assume they do one annotation at a time, even to the same model.

On 02/23/2016 04:34 PM, kltm wrote:

Okay, let's see where we are and maybe future-proof this a bit. We currently have (ignoring the JSON formatting) the following arguments:

https://github.com/geneontology/noctua/issues/147#issuecomment-179556417 The main thing that makes it difficult to encode as a URL is the "comments" list, which could be quite long. Do you see a case where a user would like to send multiple requests to the noctua endpoint in a single batch? The theoretical use case would be that somebody did a bunch of work in Textpresso and wants to export multiple annotations into the same model. Would this be a possible future scenario? If so, we might want to abstract the payload into an encoded JSON blob itself. Thinking about that, I can think of a few cases where batch might be a good idea. Any thoughts there?

kltm commented 8 years ago

@vanaukenk The more I think about it, the more I'm tempted to formalize the spec that we're going to use with batching possible. Is this a possible use case? The implication would be that you could spend more time within Textpresso, collecting annotations, and then dump them into Noctua in a single go. The engineering implication would be that instead of having a POST with discrete arguments, we'd POST an encoded JSON array, with elements described as in https://github.com/geneontology/noctua/issues/147#issuecomment-179556417 .

vanaukenk commented 8 years ago

Hi @kltm and @goldturtle, I think the batch curation scenario is definitely possible and actually quite likely. In that scenario, a curator would read a paper, make all possible annotations, and then send them all to a given Noctua model. I just did that manually today for a new model I was creating. The alternative case of individual annotations is definitely still a viable scenario, though; it would really just depend on what workflow fits the particular curation status of the model. Let me know if more specific details would help.

kltm commented 8 years ago

@vanaukenk Thank you, I think that helps. To give us maximal flexibility moving forward, it's probably best to bake in this scenario from the beginning. I'll implement this in Noctua and ping back here (with examples) once this is live for testing.

kltm commented 8 years ago

Now looking at modelling like:

{
   "barista-token": "sdlkjslkjd",
   "model-id": "gomodel:01234567",
   "requests": [
      {
         "database-id" : "UniProtKB:A0A005",
         "evidence-id" : "ECO:0000314",
         "class-id" : "GO:0050689",
         "reference-id" : "PMID:666333",
         "textspresso-id" : "XXX:YYYYYYY",
         "comments" : ["foo", "bar"]
      }
   ]
}

The "model-id" is optional depending on whether you're generating a new model or editing a current one. I've also updated this so that the arguments sent are more like the results produced: the "_"s have been replaced by "-"s.
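One way a client might assemble such a batch from records that still use the older underscore names is sketched below. The helper names are hypothetical, and the second annotation is made up purely to show a batch of more than one request (its class ID is borrowed from the protein2go sample earlier in the thread).

```python
import json


def to_hyphenated(fields):
    """Rename keys such as "class_id" to "class-id", matching the new format."""
    return {name.replace("_", "-"): value for name, value in fields.items()}


def build_batch(barista_token, annotations, model_id=None):
    """Wrap one or more annotations in the batch envelope shown above."""
    payload = {
        "barista-token": barista_token,
        "requests": [to_hyphenated(a) for a in annotations],
    }
    if model_id is not None:  # omit "model-id" to ask for a new model
        payload["model-id"] = model_id
    return payload


annotations = [
    {"database_id": "UniProtKB:A0A005", "evidence_id": "ECO:0000314",
     "class_id": "GO:0050689", "reference_id": "PMID:666333",
     "comments": ["foo", "bar"]},
    # Illustrative second annotation, not from a real curation session.
    {"database_id": "UniProtKB:A0A005", "evidence_id": "ECO:0000314",
     "class_id": "GO:0005040", "reference_id": "PMID:666333"},
]

batch = build_batch("sdlkjslkjd", annotations, model_id="gomodel:01234567")
print(json.dumps(batch, indent=3))
```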

kltm commented 8 years ago

Example command, using Content-Type: application/json :

http --json localhost:8910/tractorbeam barista-token=123 requests:='[{"database-id":"WB:WBGene00005794", "evidence-id":"ECO:0000314", "class-id":"GO:0050689", "reference-id":"PMID:666333", "external-id":"FOO:0123456"}]'

So we would no longer be using application/x-www-form-urlencoded.
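For comparison with the httpie command, here is a rough standard-library Python equivalent. It assumes the same local development address and only constructs the request; with httpie's --json mode, barista-token=123 is sent as the string "123" and requests:= supplies raw JSON, which the dictionary below mirrors.

```python
import json
from urllib.request import Request

payload = {
    "barista-token": "123",
    "requests": [
        {
            "database-id": "WB:WBGene00005794",
            "evidence-id": "ECO:0000314",
            "class-id": "GO:0050689",
            "reference-id": "PMID:666333",
            "external-id": "FOO:0123456",
        }
    ],
}

request = Request(
    "http://localhost:8910/tractorbeam",  # development address from the example
    data=json.dumps(payload).encode("utf-8"),
    method="POST",
    headers={"Content-Type": "application/json"},
)

print(request.get_method(), request.get_header("Content-type"))
# To actually send it: urllib.request.urlopen(request)
```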

kltm commented 8 years ago

Pushed to production.

goldturtle commented 8 years ago

Cool.

I had coded for both options, but didn't post that fact on github. It's good to have both options available, in case either of us wants to communicate with other projects.

M.

On 02/26/2016 06:54 PM, kltm wrote:

Example command, using Content-Type: application/json :

http --json localhost:8910/tractorbeam barista-token=123 requests:='[{"database-id":"WB:WBGene00005794", "evidence-id":"ECO:0000314", "class-id":"GO:0050689", "reference-id":"PMID:666333", "external-id":"FOO:0123456"}]'

So we would no longer be using application/x-www-form-urlencoded.