Add nightly tests for read api

dorukozturk commented 6 years ago

Enhancement

Add nightly tests for pynwb.

Problem/Use Case

In order to avoid regressions with the read API, check diverse set of nwb files against pynwb dev branch. I prefer not to slow down our test suite, that is why I am proposing nightly tests.

I am thinking:

1) We upload 5-10 diverse NWB files to a public server (hoping to collect different files from different labs to cover as much as possible) 2) Every night a cron job starts and downloads those files 3) Tests the read api against these diverse sample files 4) Reports back the results (it can be a separate badge or email)

Does this sound good?

bendichter commented 6 years ago

What's the protocol if it reveals a bug in read?

dorukozturk commented 6 years ago

@bendichter I can think of 2 cases and they have different solutions. Both cases require making the tests pass.

a) We unintentionally broke it, that means we have to revert part of the code which breaks the read API for the sake of backwards compatibility.

b) We intentionally change the read API, which means we have to fix the tests to reflect the new read API on tests and bump up the version of pynwb so that people can track breaking changes and pin specific versions in their code so that they don't get surprises.

Is this reasonable?

oruebel commented 6 years ago

Creating support for nightly tests to do longer-running tests sounds good. In addition to read, I think you would also want to have round-trip tests in there, i.e. write a file and read it back.

I'm adding @nclack as they have been looking at doing some of this for the MatNWB MatLab API as well and this issue seems relevant to the round-trip testing between PyNWB and MatNWB to test that the PyNWB can read files written with MatNWB and vice-versa.

@dorukozturk what implementation plan do you have in mind for this?

jcfr commented 6 years ago

nightly tests to do longer-running tests sounds good

This could be setup as TravisCI cron job. Here is an example of such job.

The script on the travis repo, could also trigger a parameterized CircleCi build (example of such trigger here)

dorukozturk commented 6 years ago

@oruebel Adding round trip tests sounds great. That kind of end to end testing is very valuable for stability.

Circleci has a feature for setting up cron jobs (https://circleci.com/docs/2.0/workflows/#scheduling-a-workflow). I am planning to add a new workflow to our current circle config file which will trigger certain tests at midnight. @oruebel How does that sound?

@jcfr do you see any obstacles with this approach?

oruebel commented 6 years ago

Just FYI, here the pull request with the test-suite for the MatLab API: https://github.com/NeurodataWithoutBorders/matnwb/pull/23

I'm mainly adding this to make sure you are aware of what is going on that end. We probably want to run these tests separately for PyNWB as well, but I think these kind of tests may potentially be a good basis for testing compatibility between the APIs.

@jcfr @dorukozturk do you have any particular tests ready or already in mind for this?

dorukozturk commented 6 years ago

I have 3 kinds of tests in mind:

1) Find sample files (hopefully from different labs) that are diverse enough so that I can just do simple .read() on them. 2) Round trip or end to end tests (I might need some help from you guys here) 3) Test code on documentation page. We have nice examples there but time to time they don't work so people cannot follow along. I want to cover those examples too.

I would be more than happy to discuss/add/remove items from the list above. If all of the above sounds good, I will work on the PR.

oruebel commented 6 years ago

Test code on documentation page

@ajtritt is working on making the code as sphinx-galleries which should make this easier https://github.com/NeurodataWithoutBorders/pynwb/tree/enh/sphinx_gallery

Round trip or end to end tests

I think the base setup for this is here https://github.com/NeurodataWithoutBorders/pynwb/tree/dev/tests/integration

Find sample files

Generally speaking, I think most files can be "made" small, i.e., the bulk of the data is going to be in large arrays, e.g., for time series. For most tests, it should not matter whether you are writing a 5x100 or 100x1000000 array so you can just trim down these arrays for testing.

jcfr commented 6 years ago

Circleci has a feature for setting up cron jobs

Neat. :+1:

do you see any obstacles with this approach?

Leveraging CircleCI for this kind of test makes perfect sense . The CircleCI caching mechanism can also be very useful to speedup testing by avoiding downloading of python packages and/or data.

mgrauer commented 6 years ago

Would it make sense to also run these longer tests on merges to master?

bendichter commented 6 years ago

This only makes sense for tests that will take a long time. For quick tests, it would be better to add them to the integration tests that are already in place. Do you have specific tests in mind that take a long time to run?

I have a conversion script for publicly available data from the Buzsaki Lab that takes a few minutes. I'm happy to contribute that as soon as I work out the last few remaining bugs. We could run the conversion script and then run read on it to test round trip.

The code on the documentation page is currently run as an integration test. It failed recently in a way that wasn't tested: the file could be written without error, but produced an error when trying to read it. To fix this, we just need to add a file.read() before we delete it. I think that addition of read would not add much time to the test, and I'd prefer to catch these things at the integration test if we can.

bendichter commented 6 years ago

@mgrauer We currently don't have a master branch, but I would be in favor of creating one and, as you suggest, running comprehensive integration tests on merge

mgrauer commented 6 years ago

Sorry @bendichter, I meant dev branch in this repo. I believe we are currently doing this pre/post merge split of tests, where some CIs (OS platforms) are run on PRs and all are run after merges into dev. These tests (if they are long running) could also be added to the post merge tests.

@dorukozturk Can you confirm or correct me here?

dorukozturk commented 6 years ago

@mgrauer That is correct. We run the macOS tests only after the merge. Some of the tests can go there.

@bendichter I have some NWB files that I downloaded from Allen Brain Atlas. Currently I can't read them and they average in ~300-350 MB range. I really want to add some of those and try to read them in tests. So the test itself won't take time at all, but the download will. I don't know limits of circleci cache. But i definitely agree the more we have in the integration tests side the better it is. I will do my best to keep the tests at the integration and if it takes forever, I will move them to the nightly.

I would really appreciate use cases, so that I can convert them to actual tests. Let me know when you are done with your use case so we don't break pynwb for you in the future.

So, the outcome is I will try to add tests to integration tests as much as possible and when it gets slower I will try to add them to post merge tests, and if that does not work I will do nightlies with bunch of downloads and reads.

oruebel commented 6 years ago

I downloaded from Allen Brain Atlas

note that these are most likely NWB 1.0.x and not NWB 2.0 files, i.e., for that you need to use the legacy read modules in PyNWB not the standard read. While testing the legacy read modules is good too, I think it is important that we cover the main read as that is what most folks will use.

dorukozturk commented 6 years ago

@oruebel Yes, that was the case. I agree we should put more weight on 2.0 files. I am hoping to collect some during the hackathon.

bendichter commented 6 years ago

@dorukozturk That sounds like a good application, and I agree the

integration tests < post merge tests < nightly tests

strategy sounds smart. I'm happy to give you my scripts, which are currently in my to_nwb repo, but aren't 100% working yet. If we add tests for 2.0 functionality, realize that the API does change intentionally fairly often, so we'll have to change the tests too.

bendichter commented 6 years ago

@dorukozturk OK, I have a conversion script ready to be added as a nightly test. The code is here. What's the best way for me to send you the data? If you can use Globus Connect I can give you read permissions to a shared server space with the data.

NeurodataWithoutBorders / pynwb

Add nightly tests for read api #399

Enhancement

Problem/Use Case