OpenTreeOfLife / germinator

miscellaneous scripts and data for concerns that span more than one of the Open Tree code repositories: integration tests, system statistics, etc.
BSD 2-Clause "Simplified" License

Need a way to validate OTT ids and node ids #80

Open jar398 opened 8 years ago

jar398 commented 8 years ago

In the v2 API, if a method found an unrecognized id, the id was put in a special list, which was returned as a special result. In the v3 API, you get a 400 error if any id is invalid. We should probably have a way to recover the v2 behavior, or something equivalent to it, either as an option on all methods that take ids as arguments or as a separate validation method. The latter is cleaner and more orthogonal, but would be a pain for users and for performance.

I think there is a separate issue somewhere about having propinquity give explanations of why a taxonomy node failed to make it into the synthetic tree; that should be coordinated with this more general question, which has to do with OTT ids that were deprecated long ago, or mistyped.

This will come to a head when we discontinue the v2 API, because this is one thing you can do with the v2 API that you can't easily do with the v3 API. (In v3, to find out what the unrecognized ids were, you have to parse the error message, because neo4j doesn't let us return JSON error responses.)

jar398 commented 8 years ago

Options for doing this:

  1. Revert to something similar to v2 behavior, adding extra fields to result
  2. Same as 1, but only when requested via a flag parameter
  3. Define a new method specifically for validating ids. Return value would be list(s) of invalid ids. Client would call that first, then use it to filter the id list for a subsequent mrca or induced subtree call
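To make the trade-off between these options concrete, here is a minimal client-side sketch in Python. Everything in it is hypothetical: `KNOWN_OTT_IDS`, `validate_ids`, `strict_mrca`, `lenient_mrca`, the `report_invalid` flag, and the `invalid_ott_ids` field are illustrative names, not part of the v2 or v3 API, and the MRCA result is a placeholder rather than a real computation. It only shows how option 3 (validate, then filter) could be composed into something that behaves like options 1 and 2.

```python
# Hypothetical stand-in for the set of OTT ids the server recognizes.
KNOWN_OTT_IDS = {770315, 417950, 158484}


def validate_ids(ott_ids):
    """Option 3 (hypothetical method): split ids into (valid, invalid) lists."""
    valid = [i for i in ott_ids if i in KNOWN_OTT_IDS]
    invalid = [i for i in ott_ids if i not in KNOWN_OTT_IDS]
    return valid, invalid


def strict_mrca(ott_ids):
    """Stand-in for v3-style behavior: any unrecognized id fails the whole call."""
    bad = [i for i in ott_ids if i not in KNOWN_OTT_IDS]
    if bad:
        raise ValueError(f"unrecognized ids: {bad}")
    # Placeholder result; a real service would compute the MRCA here.
    return {"ott_ids": list(ott_ids)}


def lenient_mrca(ott_ids, report_invalid=True):
    """Options 1 and 2: filter out bad ids, optionally reporting them in the result."""
    valid, invalid = validate_ids(ott_ids)
    result = strict_mrca(valid)
    if report_invalid:  # option 2: extra field only when the flag is set
        result["invalid_ott_ids"] = invalid
    return result


if __name__ == "__main__":
    print(lenient_mrca([770315, 1, 417950]))
```

With option 3 alone, every client has to write the equivalent of `lenient_mrca` itself and pay for two round trips, which is the usability and performance cost mentioned earlier in the thread.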

josephwb commented 8 years ago

FWIW I like 1 myself for routine use (e.g. getting mrcas in curator to check ingroups against synthetic tree).

Having 3 around seems generally useful.