Ship fixed versions of queries with each nidm-api release

chrisgorgo commented 8 years ago

At the moment, each time someone uses nidm-api they need to download the queries. Those will change, and a given nidm-api version might behave differently depending on which version of the queries was used. This will make debugging applications that use nidm-api difficult (think of a scenario where someone uses continuous integration and, despite fixing the versions of all dependencies, something breaks from one commit to the next due to a query update). To avoid such a situation, I would suggest that each release of nidm-api use a fixed set of queries tied to specific versions. When someone updates the nidm-api package through pypi they will get new and/or updated queries.

This is related to #23.

vsoch commented 8 years ago

This seems like a good idea - the software (nidm-api) can serve queries that are specified for its version, which is a field in the query json (https://github.com/incf-nidash/nidm-query/blob/master/sparql/b73b423e-0660-42a1-a7d3-48e300f44872.json#L30). I think we have a few options. We could, for each version, test the queries and update the version field when it's been shown to work for the current version, and this would also be stating that the query is backward compatible to all older versions. We could also be very stringent and have the version field include a list of specific versions, but that could get long and hairy after some time, and would require updating each query object with a new nidm-api results, which is not ideal. Another option is to produce equivalent query json files for each version, and then have the API filter down to those for its version. That also seems to introduce a lot of redundancy.
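To make the "filter down to those for its version" option concrete, here is a minimal sketch. It is not code from nidm-api: the function name is hypothetical, and it assumes each query json has been loaded into a dict whose "version" key holds either one version string or a list of them.

```python
def queries_for_version(queries, api_version):
    """Keep only the query dicts that declare support for api_version.

    Assumption: the "version" field is a single string or a list of strings.
    Queries without a "version" field are dropped.
    """
    selected = []
    for query in queries:
        versions = query.get("version", [])
        if isinstance(versions, str):
            versions = [versions]  # normalise a single version to a list
        if api_version in versions:
            selected.append(query)
    return selected
```

With the list form, every new nidm-api version means touching every query file, which is the maintenance cost described above.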

I think we might just give the user information about the versions of queries. By default the most recent version will always be delivered (and will be guaranteed to function based on tests with nidm-api), but we should encourage users to specify the version of the query when writing applications.

Are you suggesting that installation of the nidm-api be linked with a static set of queries, so they aren't updating automatically whenever a user runs the API? How can we link the nidm-api package with specific queries without introducing redundancy or making it annoying to update all query objects?

nicholsn commented 8 years ago

Would it make sense to have test cases for each query, or is that overboard? For example, using CI with a canonical NIDM file for a given query (or queries) and a specified output, so we know when specific queries break during a version change.
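A rough sketch of what such a check could look like, under stated assumptions: `run_query` stands in for whatever function actually executes a query (it is passed in, not a real nidm-api call), each query returns a list of rows, and the `cases` structure is made up for illustration.

```python
def check_query(run_query, ttl_file, query_id, expected_rows):
    """Return True if the query output matches expected_rows, ignoring row order."""
    actual = run_query(ttl_file=ttl_file, query_id=query_id)
    return sorted(map(tuple, actual)) == sorted(map(tuple, expected_rows))


def failing_queries(run_query, cases):
    """Return the ids of queries whose output no longer matches the stored result."""
    broken = []
    for case in cases:
        try:
            ok = check_query(run_query, case["ttl_file"],
                             case["query_id"], case["expected"])
        except Exception:
            ok = False  # a query that raises counts as broken
        if not ok:
            broken.append(case["query_id"])
    return broken
```

A CI job could fail the build whenever `failing_queries` returns a non-empty list.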

vsoch commented 8 years ago

Testing is a given; it seems like the minimal requirement for any software that is taken seriously. It would be pretty easy to determine if a query works (i.e., doesn't return an error and returns some data structure as expected), but it's going to be harder to define specific output for each query. We also have the added confound of different versions of the nidm-results, etc., and to me that seems like the biggest place to introduce error as the standards change. I think what is needed is a tightly controlled release structure / workflow for all of these things, with tests to support it.

chrisgorgo commented 8 years ago

Yes - CI would be very good to have, but that's a different issue.

For the versioning I would envisage something like this: a user installs a particular version of nidm-api that ships with a set of queries. They can immediately do:

from nidm.query import do_query

result = do_query(ttl_file="some_nidm.ttl",
                  query_id="7950f524-90e8-4d54-ad6d-7b22af2e895d")

This operation would not require any downloads from external servers since the query will be part of the package. The version of the requested query would be the one shipped with the particular version of nidm-api installed by the user. However, the user can also specify a particular version:

result = do_query(ttl_file="some_nidm.ttl",
                  query_id="7950f524-90e8-4d54-ad6d-7b22af2e895d",
                  version="1.2")

or

result = do_query(ttl_file="some_nidm.ttl",
                  query_id="7950f524-90e8-4d54-ad6d-7b22af2e895d",
                  version="latest")

This operation would consult https://github.com/incf-nidash/nidm-query.git to get the latest versions of queries and thus would be slower.
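The lookup behind such a `version` argument could be sketched roughly as follows. This is only an illustration: `resolve_query`, `bundled` and `fetch_remote` are hypothetical names, the bundled queries are assumed to be dicts with a "version" field, and serving an explicit version locally when it happens to match the shipped one is an extra assumption, not something decided in this thread.

```python
def resolve_query(query_id, bundled, fetch_remote, version=None):
    """Pick the query definition to run.

    bundled      -- dict mapping query ids to the query dicts shipped with the package
    fetch_remote -- callable (query_id, version) that contacts the nidm-query repo
    version      -- None (use the shipped copy), a version string, or "latest"
    """
    local = bundled.get(query_id)
    if version is None:
        # Default path: no network access, use what was installed.
        if local is None:
            raise LookupError(f"query {query_id} is not shipped with this release")
        return local
    if version != "latest" and local is not None and local.get("version") == version:
        # The requested version is the one already on disk.
        return local
    # Anything else needs the remote repository, so it is slower.
    return fetch_remote(query_id, version)
```

Passing the fetcher in as an argument keeps the default path free of any network code and makes the behaviour easy to test.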

How do we do this without causing redundancies? We would need to copy the specified version of the queries (hardcoded in nidm-api) from the nidm-query repository, either during installation or just before uploading to pypi (this can be done on CircleCI).
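As a sketch of the "copy before uploading to pypi" step, assuming a local checkout of nidm-query already sits at the pinned version: the constant, the function name and the `VERSION` marker file are all hypothetical, and obtaining the checkout itself is left out.

```python
import shutil
from pathlib import Path

# Hypothetical pin: the query version this nidm-api release ships with.
PINNED_QUERY_VERSION = "1.2"


def bundle_queries(checkout_dir, package_data_dir):
    """Copy query json files from a nidm-query checkout into the package data.

    Assumes checkout_dir is already checked out at PINNED_QUERY_VERSION and
    that query file names are unique across its subdirectories.
    """
    src = Path(checkout_dir)
    dest = Path(package_data_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for query_file in sorted(src.rglob("*.json")):
        shutil.copy2(query_file, dest / query_file.name)
        copied.append(query_file.name)
    # Record which query version was bundled, for later debugging.
    (dest / "VERSION").write_text(PINNED_QUERY_VERSION + "\n")
    return copied
```

Running this in the release job means the queries live in one place (nidm-query) and only a copy ends up in the published package.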

Let me know what you think.

vsoch commented 8 years ago

I like the idea of installing queries when the package is installed, and not always calling github (unless requested).

nicholsn commented 8 years ago

Ya, I like the idea of not always consulting github. Could it be set up something like homebrew, where you can do periodic updates or check out specific versions of the nidm-query repo? Something like:

`nidm update` or `nidm checkout 1.2`

Also, how are we thinking the versioning works here since each NIDM component (results, experiment, etc) will evolve separately with their own versions, right? Does that mean it can't be tied to the nidm repo?

chrisgorgo commented 8 years ago

On Mon, Jan 11, 2016 at 10:26 AM, Nolan Nichols notifications@github.com wrote:

> ya, I like the idea of not always consulting github. Could it be setup something like homebrew where you can do periodic updates or checkout specific versions of the nidm-query repo? Something like:
>
> `nidm update` or `nidm checkout 1.2`

Technically we could, but I don't think it's a good idea. I would leave updates to pypi and pip. Otherwise it will be confusing for the users.

> Also, how are we thinking the versioning works here since each NIDM component (results, experiment, etc) will evolve separately with their own versions, right? Does that mean it can't be tied to the nidm repo?

I'm not sure what you mean. In this issue we have so far only discussed different versions of queries, not different versions of the NIDM-Results standard.

incf-nidash / nidm-api

Ship fixed versions of queries with each nidm-api release #24