Test Suites should be dictionaries not lists.

scidash / neuronunit

A package for data-driven validation of neuron and ion channel models using SciUnit

http://neuronunit.scidash.org

38 stars 24 forks source link

Test Suites should be dictionaries not lists. #233

Closed russelljjarvis closed 4 years ago

russelljjarvis commented 4 years ago

I think it is better to have test suites be dictionaries rather than a list. When using NU tests for optimization, you often want to know how one particular test in the suite is going, you want a way to refer to that test programmatically without keeping track of the list index for that test, which should be arbitrary.

rgerkin commented 4 years ago

This would be a fundamental SciUnit change that I cannot easily make while other libraries depend on it. However, you can always access an element of a list by its attributes, e.g. [test for test in suite.tests if test.some_attribute == some_value], and then access the first value of that new list. Or if it has zero (no such test found) or >=2 elements (no such unique test) you could raise an Exception.

There are several candidates for that attribute. If you name the tests in your suite (with the name keyword on Test instantiation) then you can just use Test.name. Or there are other identifiers to choose from, depending on whether you want a unique object in memory (Test._id) or a hash of its attributes (Test.hash), or a serialization of it (Test.json), or some state defined at the level of the test class (Test.state). You can see many of these in methods here the base class for Models, Tests, etc. Specific implementation changes can be made at the level of the Test class, especially for something like state. But name might be the easiest since you are shuffling and regenerating tests in memory, and there could be attributes lost or added on pickling that those methods are not catching, as they were written with serial applications in mind.

rgerkin commented 4 years ago

Actually, I can make this even easier. I'll just change the __getitem__ of sciunit.TestSuite so that if you try to access a string, e.g. MySuite['smith'] then it will look for the test in that suite with the name "smith" and return it. Then from the user perspective it is as if the TestSuite was just a dictionary. Then the only user obligation is to name the tests, which is no different than giving them keys if you are using an actual dictionary. I'll report here when I've finished this.

rgerkin commented 4 years ago

OK, that was easy. In sciunit dev branch (implemented in scidash/sciunit@deb589dd706567a26481de1b9928a982815ff092), you can now do this:

from sciunit import Test, TestSuite
empty_observation = {}
test1 = Test(empty_observation, name='smith')
test2 = Test(empty_observation, name='jones')
suite = TestSuite([test1, test2])
x = suite['smith']
x == test1, x == test2

Which will print True, False since x refers to test1 (with name 'smith') and not to test2. So this is basically equivalent to the TestSuite being a dictionary, as long as the keys you want to use are set as the test names.

russelljjarvis commented 4 years ago

Okay, can we keep this issue open until I establish if this syntax works? You proposed that optimize should be a method of a sciunit test suite. I am trying to achieve this through inheritance.

Now that you have changed suites internal code, maybe method binding would be a better solution. When TSD inherited from dictionary self.update(tests) would appropriately package tests as key item pairs.

from sciunit import TestSuite
class TSD(TestSuite):
    def __init__(self,tests={},use_rheobase_score=False):
       self.DO = None
       super(TSD,self).__init__(tests)
       self.update(tests)
       if 'name' in self.keys():
           self.cell_name = tests['name']
           self.pop('name',None)
       else:
           self.cell_name = 'simulated data'
       self.use_rheobase_score = use_rheobase_score
       self.elaborate_plots  = elaborate_plots
       self.backend = None
    def optimize(self,param_edges,backend=None,protocol={'allen': False, 'elephant': True},\

rgerkin commented 4 years ago

Sure, I've reopened it. But based on the changes I made above in scidash/sciunit@deb589d, you won't need to call self.update(tests), because you automatically now get key-based access to tests in anything that inherits from TestSuite. And you would access names like you do in a TestSuite itself, by the name attribute (which means the signature of the constructor should probably match that of TestSuite up until the additional kwargs like use_rheobase_score. So it would be something like:

class TSD(TestSuite):
    def __init__(self, *args, **kwargs, use_rheobase_score=False, elaborate_plots=False):
       super(TSD, self).__init__(*args, **kwargs)
       self.DO = None
       self.use_rheobase_score = use_rheobase_score
       self.elaborate_plots = elaborate_plots
    def optimize(self, param_edges, protocol={'allen': False, 'elephant': True}):
      ...

where I took out backend because I think that should be an attribute of the model being tested, not the test or test suite.

russelljjarvis commented 4 years ago

Hey Rick, to properly fit my use case, I may need a bit more developmental support.

My use case is cell 11.

I may need you to help design a derived class TSS. That inherits from TestSuite, but has an optimize method.

You can see my stumbling attempt to do that here.

What I want to do is take a collection of sciunit tests, that are already dictionaries and to construct a test suite out of it (something that behaves like a dictionary). I think the constructor should be able to accept both lists and dictionaries. It should sense if it receives a dictionary type and just converts dict to list if appropriate.

I think what exists now, is a suite, that has key,value behavior (like a dictionary), but its constructed only from lists.

The other thing that would help is if upon construction if I could pass in the user-defined optimization method to the constructor as a constructor method?

That would help to circumvent messy post hoc method binding, that you can see in the notebooks here.

    tests= test_frame['Neocortex pyramidal cell layer 5-6']
    tests['name'] = 'Neocortex pyramidal cell layer 5-6'

    tt = list(tests.values())[0:-1]
    tt = TestSuite(tt)
    TD = TSD(tests)
    tt.optimize = MethodType(TD.optimize,tt)    
    return tt

def optimize_selection(backend,cell_id= None): 
    #code; 
    NGEN = 8
    MU = 4
    """
    tests= self.test_frame['Hippocampus CA1 pyramidal cell']
    tests['name'] = 'Hippocampus CA1 pyramidal cell'
    a = TestSuite(tests)
    a.optimize = MethodType(optimize,a)
    tests = TSD(tests = tests,use_rheobase_score=True)
    """
    tt = process_tests()backend,protocol={'allen': False, 'elephant': True
    param_edges = model_parameters.MODEL_PARAMS[backend]
    out = tt.optimize(param_edges,MU=5,NGEN=5,free_params=param_edges.keys())
    return out  
    #return optimized_models, passive_

rgerkin commented 4 years ago

Addressed in russelljjarvis/neuronunit_opt@3ee0f2db709252dc02e34c8518a74f845238e90e and subsequent commits.