scidash / neuronunit

A package for data-driven validation of neuron and ion channel models using SciUnit
http://neuronunit.scidash.org

Make unit tests for all the VmTest subclasses #89

Closed: rgerkin closed this issue 5 years ago

rgerkin commented 7 years ago

These should be based on stored vm traces, rather than ones generated on the fly from one of the model classes. Requires #88.

rgerkin commented 7 years ago

@russelljjarvis I am working on this quite a bit today. I will be avoiding the NEURON backend and just testing using the jNeuroML backend, but it will be easy to change one value to test using the NEURON backend later.

russelljjarvis commented 7 years ago

Great, great.

On this issue, I can't figure out whether the NEURON backend should share exactly the same implementation as jNeuroML. Reading and writing from memory will obviously be faster than reading and writing from disk.

It seems there are only two different types of current injection occurring across the eight tests of the test suite, so the 8 simulations could be collapsed into 2. I assume that's what you are thinking for the jNeuroML case as well, except for the disk versus memory caching of precomputed results.

rgerkin commented 7 years ago

@russelljjarvis Yes, the caching of simulated results is something we will want to make sure we do.

In a serial implementation, this should happen automatically in the run method of LEMSModel shown here. If the run parameters are the same, it should just look up the results in a stored attribute of the model.

So if the same model is called by many tests in a suite, the simulation should only happen as many times as there are unique run-time parameters; otherwise there should just be a lookup on self.results. However, this only works if it is the same model in memory the entire time.

If the model is instantiated multiple times (e.g. on different processors), then it can only use the stored result if the original pickling happened after the first simulation with those run-time parameters had already run, so that the unpickled version would already contain it. But I don't think this happens in the usual parallel implementation, because pickling happens before those simulations are run. Also, this whole scheme only works when the run-time parameters are identical to those of the immediately previous simulation, not just any simulation in the past; so if you ran a model with only two sets of run-time parameters, alternating between them each time, it would never get to exploit any stored results.
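A minimal sketch of that memoization idea, purely for illustration; apart from self.results, the method and attribute names here (run_params, _simulate, _last_params) are assumptions, not the actual LEMSModel API:

class MemoizedModel:
    """Illustrative only; not the real LEMSModel implementation."""

    def __init__(self):
        self._last_params = None  # run-time parameters of the previous run
        self.results = None       # stored results, as described above

    def run(self, **run_params):
        # Re-simulate only if the run-time parameters differ from the
        # immediately previous call; otherwise reuse self.results.
        if self.results is None or run_params != self._last_params:
            self.results = self._simulate(**run_params)
            self._last_params = dict(run_params)
        return self.results

    def _simulate(self, **run_params):
        raise NotImplementedError("backend-specific simulation goes here")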

Future directions include:

The disk cache could just be the DiskBackend, but I'm not sure how to make the test suite switch backends in the middle of all the tests, conditional on there being some stored results to use.

Alternatively, the disk cache could just be a file or directory. To avoid race conditions and locked files, the directory could contain a collection of pickled files whose names are hashes of the run-time parameters. The run method would check whether any file in that directory matched the hash of the current run-time parameters. If one did, it would just read those results; if none did, it would run the simulation and then pickle the results to a file named with that hash. Two processors would probably never try to pickle a file with the same name at the same time, and if they did, it would only be because their run-time parameters were for some reason identical, in which case the files would be redundant anyway. The directory name for those files could be determined when the model was instantiated on the first processor, and created only at that time.
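A rough sketch of that hash-named pickle cache, assuming JSON-serializable run-time parameters and a run() call that returns the results (both illustrative assumptions, not existing NeuronUnit code):

import hashlib
import json
import os
import pickle

def run_with_disk_cache(model, run_params, cache_dir):
    """Illustrative only: look up results by a hash of the run-time parameters."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha1(json.dumps(run_params, sort_keys=True).encode()).hexdigest()
    path = os.path.join(cache_dir, key + '.p')
    if os.path.isfile(path):
        # A previous run with identical parameters already pickled its results.
        with open(path, 'rb') as f:
            return pickle.load(f)
    results = model.run(**run_params)
    with open(path, 'wb') as f:
        pickle.dump(results, f)
    return results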

Another alternative is a database, but I don't want the complexity of MySQL involved, and last time I checked sqlite could not handle concurrent writes very well.

russelljjarvis commented 7 years ago

Yes, as far as the GA is concerned, it shouldn't really sample the same parameter set twice, although technically there is no rule excluding the possibility of a duplicated sample. Additionally, mutation, crossover, and crowding distances act as repulsive forces that promote exploration. In mutation, genes (parameter values) are altered by drawing from pseudo-random processes, each of which generates approximately continuous real values within a bounded interval.

I think it's statistically very unlikely that the GA would sample exactly the same parameters twice; however, it is very likely that the GA would evaluate models with parameter sets that are very close, and approximately equal, to previous values. It might be better to check whether any one gene differs from previously evaluated genes by more than a threshold, and to skip candidates that are not substantially different from ones already evaluated.
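A hypothetical version of that check, assuming each gene set is a flat numeric vector and using a simple per-gene threshold (both assumptions, not existing NeuronUnit or GA code):

import numpy as np

def worth_evaluating(candidate, evaluated, threshold=1e-3):
    """Return False if every gene is within `threshold` of some already-evaluated set."""
    candidate = np.asarray(candidate, dtype=float)
    for previous in evaluated:
        if np.all(np.abs(candidate - np.asarray(previous, dtype=float)) <= threshold):
            return False  # effectively a duplicate; skip the simulation
    return True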

For the GA case, I don't think the benefit of caching the runs to disk outweighs the overhead, but I do think that caching the simulation results in memory within the execution of the same test suite does.

rgerkin commented 7 years ago

@russelljjarvis Back to the original topic: I have created a lot of new tests in unit_test/core_tests.py. They are all passing for me right now. The pattern I am using is:

cd unit_test
python -m unittest -bv core_tests.py

You can also change it to -bvf if you want it to stop after the first error (if there is one). Some tests are currently skipped by design.

If you use the abstract TestsTestCase class in core_tests, you will be able to create even more tests that follow the same design. You will probably want to put these in a different module that isn't called in the .travis.yml file, since your tests will probably be too computationally intensive.
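For example, a separate module of slower tests might look something like the sketch below; the import path and mixin pattern are assumptions for illustration, not the actual core_tests layout:

import unittest

from unit_test.core_tests import TestsTestCase  # assumed import path

class SlowBackendTestCase(TestsTestCase, unittest.TestCase):
    """Computationally intensive tests, deliberately kept out of .travis.yml."""

    def test_rheobase_search(self):
        # Placeholder body; a real test would exercise an expensive search here.
        self.skipTest('Example only')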

russelljjarvis commented 7 years ago

AOK, sounds good. Also, I pulled again from scidash and I can see the new test organization: fi.py, waveform.py, etc.

rgerkin commented 7 years ago

@russelljjarvis See the new modular structure of the unit_test directory here.
From the main neuronunit test directory, you can run e.g. python -m unittest -bv unit_test.core_tests to run all the unittest classes imported in core_tests, or you can run a subset of tests, e.g. python -m unittest -bv unit_test.doc_tests.

We can also now test for coverage:
coverage run --source=neuronunit -m unit_test.core_tests -bv
coverage report

Name                                         Stmts   Miss  Cover
----------------------------------------------------------------
neuronunit/__init__.py                           9      3    67%
neuronunit/aibs.py                              57     36    37%
neuronunit/bbp.py                               73     46    37%
neuronunit/capabilities/__init__.py             52     14    73%
neuronunit/capabilities/channel.py              14      5    64%
neuronunit/capabilities/spike_functions.py      63     10    84%
neuronunit/models/__init__.py                  103     26    75%
neuronunit/models/backends.py                  243    168    31%
neuronunit/models/channel.py                    47     47     0%
neuronunit/models/reduced.py                    32      3    91%
neuronunit/neuroconstruct/__init__.py            9      2    78%
neuronunit/neuroconstruct/capabilities.py        0      0   100%
neuronunit/neuroconstruct/models.py            133    133     0%
neuronunit/neuroelectro.py                     260    142    45%
neuronunit/plottools.py                        794    794     0%
neuronunit/tests/__init__.py                     4      0   100%
neuronunit/tests/base.py                        93     41    56%
neuronunit/tests/channel.py                     85     85     0%
neuronunit/tests/dynamics.py                    69     46    33%
neuronunit/tests/fi.py                         228    202    11%
neuronunit/tests/passive.py                    124      6    95%
neuronunit/tests/waveform.py                    80      4    95%
----------------------------------------------------------------
TOTAL                                         2572   1813    30%

30% is sort of crappy, but better than nothing.

russelljjarvis commented 7 years ago

That's interesting. Is it that it compares the directory full of source code against the methods exercised by the unit tests, and the coverage is how much overlap there is between the two?

rgerkin commented 7 years ago

@russelljjarvis I added Coveralls support for SciUnit, as well as a bunch of unit tests that now cover 83% of it. You can see the report here, which is automatically generated every time Travis runs (and all the unit tests pass). I don't know how high we should aim with NeuronUnit, but I think we can definitely get over 60%.

rgerkin commented 7 years ago

Once #123 is implemented we will be able to run the slow unit tests (e.g. searching for the rheobase) and the ones that require parallel execution instead of just the fast serial ones.

rgerkin commented 5 years ago

For some reason the serial rheobase test is not being run right now. It needs to be added, and it will be during the code coverage checks before release.