OpenTreeOfLife / phylesystem-api

API access to Open Tree of Life treestore
BSD 2-Clause "Simplified" License
10 stars 5 forks source link

Run validation code on POSTed NexSON #23

Closed leto closed 10 years ago

leto commented 10 years ago

@mtholder has written code which will validate NexSON and produce errors (a JSON blob that maps to XML that violates the NeXML schema) and warnings (files that are not ideal with respect to OpenTree's needs).

More details on the validator are here: https://github.com/OpenTreeOfLife/api.opentreeoflife.org/blob/master/nexson-validator/README.md

If there is an error, the POST should be rejected and the errors should be returned as JSON, so users of the API can report what is wrong. If no errors are present, the POSTed data should be committed, as it is now.

Additionally, the warning annotations (if any exist) should be made as an additional commit on the given WIP branch for the given user and study.

leto commented 10 years ago

@jimallman do you have feedback about how the curation app will want to handle pulling the latest version of the NexSON? When the validation step above is implemented, the app will need to check for a (possibly) updated version of the NexSON in the WIP branch which contains warning annotations.

mtholder commented 10 years ago

So, I'm a bit murky on the git details. My impression is that each file blob has a SHA which only changes when the file is touched. right? If that is right, it seems like we might want the api layer to embed the file version's SHA in the response every time we respond with a NexSON response. That would make it feasible for the client side to frequently ask "give me a the latest version of the file, if it has changed since this SHA..." Most of the time this would be a very fast response with an empty object, but when the validator completes the client will be able to tell that there is a new version of the file. Or should we be relying on commit SHAs rather than file blobs?

leto commented 10 years ago

Blob SHA1's are not as useful external tools as commit SHA1's and I think returning the latest commit SHA1 of the WIP branch is a good idea. That will allow tools to ask for the most up-to-date file contents with a single Github API call

leto commented 10 years ago

We now return the SHA of the latest commit in our response to POSTs :

https://github.com/OpenTreeOfLife/api.opentreeoflife.org/commit/7a3aad53015230416968de4809d54ab48c69140c

mtholder commented 10 years ago

OK, so there is now a branch that has some, but not all of the logic. See https://github.com/OpenTreeOfLife/api.opentreeoflife.org/tree/validate-on-write I have not tested it, and would appreciate some pointers about how I can get an auth_token to test from a command-line client. Or perhaps I should just wait until we move to local git?

Currently it is sketched out to

  1. run the validation.
  2. commit the unaltered nexson content
  3. commit the annotated version.
  4. return the sha of the latter commit. this differs from idea sketched out above because we don't return after step 2.

I wanted to test for basic NexSON validity before step 2. So that means that the most expensive step (the validation) will happen before the "raw" commit. If we agree that this is a reasonable approach, then it probably makes sense to block until step 3 finishes (because step 3 is fast after the validation is done).

jimallman commented 10 years ago

I have not tested it, and would appreciate some pointers about how I can get an auth_token to test from a command-line client.

You can either use the Github APIs to authenticate (with your own GitHub profile email, of course):

$ curl -u "jim@example.com" https://api.github.com/authorizations

...or just load the curation editor (currently requires login via OAuth) and grab the token from the page using Firebug:

console.log(authToken)
leto commented 10 years ago

There are instructions about getting an auth_token in the README.md here: https://github.com/OpenTreeOfLife/api.opentreeoflife.org/blob/master/README.md as well

mtholder commented 10 years ago

Cool. thanks, Jim and Duke. Sorry for not looking around more.
So, that validate-on-write branch is now passing the tests in ws-tests/run_tests.sh I'm haven't issued a pull request, because I assume that is may be easier to wait until the local-git stuff is working... but feel free to merge it if you like, Duke.

leto commented 10 years ago

This looks great! I will merge your branch first, since it will most likely not generate conflicts (or have very few), where as the "local" branch I am working will inevitably have conflicts. This should make it easier for you to continue working on master if you want to make more changes to nexson validation.

leto commented 10 years ago

The validate-on-write branch has been merged into master.