scidash / neuronunit

A package for data-driven validation of neuron and ion channel models using SciUnit
http://neuronunit.scidash.org

Caching simulator runs within tests. #116

Closed russelljjarvis closed 7 years ago

russelljjarvis commented 7 years ago

In the current use case of neuronunit, 8 tests all draw from only 2 types of current injection (2 simulator runs).

The NSGA algorithm is supposed to avoid sampling exactly the same point in parameter space twice, so I do not think it is worth pursuing the memory-intensive option of storing all V_{m} traces in a dictionary whose keys are model attributes. As an alternative to model_attribute:model_results hashing, I just want to define a model backend that stores runs in a dictionary whose only two keys are the two types of current injection used.

To cache runs per current injection type, I created a MemoryBackend that inherits from NEURONBackend and overrides its inject_square_current and local_run methods. See below:

import copy
import re

from neuronunit.models.backends import NEURONBackend


class MemoryBackend(NEURONBackend):
    """A backend that serves pre-computed results from RAM instead of re-running."""

    def init_backend(self, results_path='.'):
        self.model.rerun = True
        self.model.results = None
        self.model.cached_params = {}
        self.model.cached_attrs = {}
        self.current = {}

        super(MemoryBackend, self).init_backend()

    def inject_square_current(self, current):
        self.h = None
        self.neuron = None
        import neuron
        self.reset_neuron(neuron)
        self.set_attrs(**self.attrs)

        c = copy.copy(current)
        if 'injected_square_current' in c:
            c = current['injected_square_current']

        c['delay'] = re.sub(r' ms$', '', str(c['delay']))
        c['duration'] = re.sub(r' ms$', '', str(c['duration']))
        c['amplitude'] = re.sub(r' pA$', '', str(c['amplitude']))
        # TODO: convert between pico- and nano-amps using quantities instead.
        amps = float(c['amplitude']) / 1000.0  # pA -> nA is the right scale here.
        prefix = 'explicitInput_%s%s_pop0.' % (self.current_src_name, self.cell_name)
        self.h(prefix + 'amplitude=%s' % amps)
        self.h(prefix + 'duration=%s' % c['duration'])
        self.h(prefix + 'delay=%s' % c['delay'])

        # Remember the injection as an instance attribute so local_run can key on it.
        self.current = current

    def local_run(self):
        # Only run the simulator if this injection value has not been seen before.
        # str() is needed because a dict is not itself a hashable key.
        key = str(self.current)
        if key not in self.model.cached_params:
            self.h('run()')
            results = {}
            results['vm'] = [float(x) / 1000.0 for x in self.neuron.h.v_v_of0.to_python()]
            results['t'] = [float(x) for x in self.neuron.h.v_time.to_python()]
            results['run_number'] = len(self.model.cached_params) + 1
            self.model.cached_params[key] = results
        return self.model.cached_params[key]

Note that I also think this is the right approach to caching simulation runs per model attribute (something I have no interest in, since I think it would slow down the GA rather than speed it up). Any attribute of the neuronunit model, such as a caching dictionary, can be shifted into a data transport container (dtc) object. A pattern for achieving this would be:

dtc.cached_attrs.update(model.cached_attrs)

There are many opportunities when all the dtc's exist together in a list on CPU0; that list is called dtcpop. To federate all of these dictionaries, so that each dtc knows what its siblings have previously run, you can just do something like:


for i, dtci in enumerate(dtcpop):
    for j, dtcj in enumerate(dtcpop):
        if i != j:
            dtci.cached_attrs.update(dtcj.cached_attrs)
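Since the pairwise loop does O(n²) dictionary updates, the same federation can be done in two linear passes by merging everything into one dictionary first. A sketch (`DTC` here is a hypothetical stand-in for the real data transport container class):

```python
# Hypothetical stand-in for the real data transport container.
class DTC:
    def __init__(self, cached_attrs):
        self.cached_attrs = dict(cached_attrs)

dtcpop = [DTC({'a': 1}), DTC({'b': 2}), DTC({'c': 3})]

merged = {}
for dtc in dtcpop:
    merged.update(dtc.cached_attrs)   # collect every sibling's cache once
for dtc in dtcpop:
    dtc.cached_attrs = dict(merged)   # hand each dtc its own full copy
```

After this, every dtc knows about every attribute set its siblings have run, at the cost of two passes instead of n².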

My feeling is that somewhere in neuronunit there is a list of all the allowed backends, and the only two existing items are jNeuroMLBackend and NEURONBackend.

I wonder if you know of any places I should edit to help neuronunit recognize/permit these DiskBackend and MemoryBackend classes?

The first problem with this approach is getting neuronunit to recognize that these backends are available.

Calling `dir()` on the backends module shows all the relevant classes (e.g. MemoryBackend):
In [9]: backends
Out[9]: 
In [10]: dir(backends)
Out[10]: 
['AnalogSignal',
 'Backend',
 'DiskBackend',
 'HasSegment',
 'MemoryBackend',
 'NEURONBackend',
 'SingleCellModel',
...,
 'jNeuroMLBackend',
...']

If I try to instantiate with DiskBackend:

In [8]:     from neuronunit.models import backends
   ...:     from neuronunit.models.reduced import ReducedModel
   ...:     import quantities as pq
   ...:     import numpy as np
   ...:     from neuronunit.tests import get_neab
   ...:     #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
   ...:     model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='DiskBackend')
   ...:     
   ...: 
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-8-bc2b51b3fc87> in <module>()
      5 from neuronunit.tests import get_neab
      6 #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
----> 7 model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='DiskBackend')
      8 

/home/jovyan/neuronunit/neuronunit/models/reduced.py in __init__(self, LEMS_file_path, name, backend, attrs)
     22         """
     23         super(ReducedModel,self).__init__(LEMS_file_path,name=name,
---> 24                                           backend=backend,attrs=attrs)
     25         self.run_number = 0
     26         self.tstop = None

/home/jovyan/neuronunit/neuronunit/models/__init__.py in __init__(self, LEMS_file_path, name, backend, attrs)
     35         self.skip_run = False
     36         self.rerun = True # Needs to be rerun since it hasn't been run yet!
---> 37         self.set_backend(backend)
     38 
     39     def __new__(cls, *args, **kwargs):

/home/jovyan/neuronunit/neuronunit/models/__init__.py in set_backend(self, backend)
     87         else:
     88             raise Exception("Backend %s not found in backends.py" \
---> 89                             % name)
     90         self._backend.model = self
     91         self._backend.init_backend(*args, **kwargs)

Exception: Backend DiskBackend not found in backends.py

If I try to instantiate with MemoryBackend:

In [1]:     from neuronunit.models import backends
   ...:     from neuronunit.models.reduced import ReducedModel
   ...:     import quantities as pq
   ...:     import numpy as np
   ...:     from neuronunit.tests import get_neab
   ...:     #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
   ...:     model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='MemoryBackend')
   ...: 
Getting Rheobase cached data value for from AIBS dataset 354190013
attempting to recover from pickled file
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-1-4a795ec40e2b> in <module>()
      5 from neuronunit.tests import get_neab
      6 #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
----> 7 model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='MemoryBackend')

/home/jovyan/neuronunit/neuronunit/models/reduced.py in __init__(self, LEMS_file_path, name, backend, attrs)
     22         """
     23         super(ReducedModel,self).__init__(LEMS_file_path,name=name,
---> 24                                           backend=backend,attrs=attrs)
     25         self.run_number = 0
     26         self.tstop = None

/home/jovyan/neuronunit/neuronunit/models/__init__.py in __init__(self, LEMS_file_path, name, backend, attrs)
     28         self.attrs = attrs if attrs else {}
     29         self.orig_lems_file_path = os.path.abspath(LEMS_file_path)
---> 30         assert os.path.isfile(self.orig_lems_file_path)
     31         self.run_defaults = pynml.DEFAULTS
     32         self.run_defaults['nogui'] = True

AssertionError: 

In [2]:         get_neab.LEMS_MODEL_PATH = '/home/jovyan/neuronunit/neuronunit/optimization/NeuroML2/LEMS_2007One.xml'
   ...: 

In [3]: 

In [3]:     from neuronunit.models import backends
   ...:     from neuronunit.models.reduced import ReducedModel
   ...:     import quantities as pq
   ...:     import numpy as np
   ...:     from neuronunit.tests import get_neab
   ...:     #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
   ...:     model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='MemoryBackend')
   ...: 
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-3-4a795ec40e2b> in <module>()
      5 from neuronunit.tests import get_neab
      6 #model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='NEURON')
----> 7 model = ReducedModel(get_neab.LEMS_MODEL_PATH,name=str('vanilla'),backend='MemoryBackend')

/home/jovyan/neuronunit/neuronunit/models/reduced.py in __init__(self, LEMS_file_path, name, backend, attrs)
     22         """
     23         super(ReducedModel,self).__init__(LEMS_file_path,name=name,
---> 24                                           backend=backend,attrs=attrs)
     25         self.run_number = 0
     26         self.tstop = None

/home/jovyan/neuronunit/neuronunit/models/__init__.py in __init__(self, LEMS_file_path, name, backend, attrs)
     35         self.skip_run = False
     36         self.rerun = True # Needs to be rerun since it hasn't been run yet!
---> 37         self.set_backend(backend)
     38 
     39     def __new__(cls, *args, **kwargs):

/home/jovyan/neuronunit/neuronunit/models/__init__.py in set_backend(self, backend)
     87         else:
     88             raise Exception("Backend %s not found in backends.py" \
---> 89                             % name)
     90         self._backend.model = self
     91         self._backend.init_backend(*args, **kwargs)

Exception: Backend MemoryBackend not found in backends.py

rgerkin commented 7 years ago

@russelljjarvis The relevant code is here where the options dictionary is calculated. Note that the options dictionary replaces "Backend" with "", so the name stored in each key would be e.g. "jNeuroML" instead of "jNeuroMLBackend". Similarly, the name you want to use for DiskBackend is "Disk" and for MemoryBackend it is "Memory".
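The suffix-stripping can be sketched roughly like this (the dict-building below is my guess at the idea, not the actual neuronunit code):

```python
# Minimal stand-ins for the classes exported by the backends module.
class Backend: pass
class NEURONBackend(Backend): pass
class jNeuroMLBackend(Backend): pass
class MemoryBackend(NEURONBackend): pass

classes = [NEURONBackend, jNeuroMLBackend, MemoryBackend]

# Build the options dict, stripping the trailing "Backend" from each name.
options = {cls.__name__.replace('Backend', ''): cls for cls in classes}

# set_backend('Memory') would then look up the short name,
# not 'MemoryBackend'.
```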

Also, local_run() in your backends appears to make calls that are also made in the real NEURONBackend (in the case where the cached value is not found). Can you change local_run to make a call to super() in that case, so that you don't have duplicated code?

russelljjarvis commented 7 years ago

Thanks for setting me straight about the Disk and Memory way of instantiating, i.e. the actual "Backend" part of the string is stripped away.

AOK, I think I see what you're saying. I can de-duplicate a lot of code if, inside the overridden local_run method, I call super(MemoryBackend, self).local_run().

It confuses me a bit, because I think of overriding as destructively replacing the parent class's method, but I can sort of see how you could still refer to the parent's method anyway.

    def local_run(self):
        key = str(self.current)
        if key not in self.model.cached_params:
            # Delegate the actual simulation work to the parent backend.
            results = super(MemoryBackend, self).local_run()
            self.model.cached_params[key] = results
        return self.model.cached_params[key]
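A minimal generic example (nothing neuronunit-specific) of how an overridden method can still reach the parent implementation:

```python
class Parent:
    def run(self):
        return 'expensive simulation'

class Child(Parent):
    def __init__(self):
        self.cache = None

    def run(self):
        # run() is overridden, but super() still gives explicit
        # access to Parent's version of the method.
        if self.cache is None:
            self.cache = super(Child, self).run()
        return self.cache

c = Child()
first = c.run()    # delegates to Parent.run()
second = c.run()   # served from the cache
```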
russelljjarvis commented 7 years ago

Also, on the issue of caching results/V_{m} traces per model attribute: I thought I should try to verify my claim that the same model attributes are not re-evaluated frequently enough for the cost of storing the simulation results to outweigh the performance gain from deduplicating the workload.

To verify, I could have a dictionary self.model.cached_attrs that I use to count the number of times a particular parameter set is evaluated.

With self.model.cached_attrs[str(model.attrs)]+=1
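A plain dict would raise KeyError on the first increment for a new key, so a collections.Counter is safer. A sketch of the counting idea (not the actual test code):

```python
from collections import Counter

# Count how often each parameter set is evaluated; str() turns the
# (unhashable) attrs dict into a usable key.
evaluation_counts = Counter()

def record_evaluation(attrs):
    evaluation_counts[str(attrs)] += 1  # no KeyError on the first hit

record_evaluation({'a': 0.02, 'b': 0.2})
record_evaluation({'a': 0.02, 'b': 0.2})
record_evaluation({'a': 0.1, 'b': 0.26})

# Any entry with a count > 1 marks a parameter set caching would have saved.
repeats = {k: n for k, n in evaluation_counts.items() if n > 1}
```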

rgerkin commented 7 years ago

@russelljjarvis I suspect the number will be low but you can certainly check.

russelljjarvis commented 7 years ago

I am also addressing this in unit testing: https://github.com/russelljjarvis/neuronunit/blob/dev/unit_test/testNEURONparallel.py#L91-L111

A set is an iterable that, unlike a list, is composed of only unique elements, so if I cast the list to a set and then take the length, the length should not shrink relative to the list if the parameter values are really unique. I am performing this operation on a list of 10,000 randomly generated parameter sets, and it is passing.

It's probably more meaningful to cast the combined lists pop and ga_history_pop to a set,

unique_check = list(pop) + list(ga_history_pop)  # extend() returns None, so concatenate instead
bfl = len(unique_check)
afl = len(set(unique_check))
self.assertEqual(bfl, afl)

to check that even as the GA population converges on a solution, the models it is testing are sufficiently unique (not redundant).
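One caveat with the set-cast: if the individuals are dicts (or lists) rather than tuples, they are not hashable, so each parameter set needs converting to a hashable form first. A sketch, assuming dict-valued individuals:

```python
def unique_count(population):
    # Sorted item tuples make dict-valued individuals hashable.
    hashable = [tuple(sorted(ind.items())) for ind in population]
    return len(set(hashable))

pop = [{'a': 0.02, 'b': 0.2}, {'a': 0.1, 'b': 0.26}]
ga_history_pop = [{'a': 0.02, 'b': 0.2}]  # duplicates pop[0]

combined = pop + ga_history_pop
before = len(combined)          # 3 individuals in total
after = unique_count(combined)  # the duplicate collapses in the set
```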