Silmathoron opened this issue 7 years ago
@Silmathoron This is an interesting idea. Computational neuroscience would certainly benefit from efficient hardware implementations of `exp()` and `expm1()`... The acceleration you report is indeed amazing: is this for a single neuron or for a large network simulation?
Concerning precision: are deviations from the exact solution evenly distributed around zero? If `fastexp()` were systematically too low or too high over certain ranges of the argument, this could lead to skewed results. In the end, I think we would have to see whether we find any effect on relevant quantities in network simulations (rates, correlations).
Did you develop the `fastexp()` above yourself (and if so, how did you derive it), or do you have a source for it? I assume that there should be publications on fast approximations of the exponential out there.
@heplesser these are good questions:
`std::exp` does represent most of the time spent in the AdExp dynamics... OK, let's make a clean update: I tested four approximations to `exp`:
```python
import struct

import numpy as np


def exp1024(x):
    ''' Compute (1 + x/1024)**1024 via 10 successive squarings. '''
    res = 1. + x / 1024.
    for _ in range(10):
        res *= res
    return res


def fexp(x):
    ''' Using bits conversion (Schraudolph-style bit trick). '''
    # 64-bit ints so the 32-bit shift below cannot overflow
    tmp = np.array(1512775 * x + 1072632447, dtype=np.int64)
    return [struct.unpack('>d', struct.pack('>q', val << 32))[0] for val in tmp]


def corrected_fexp(x):
    '''
    Take advantage of the regularity of fexp to correct the discrepancy with
    precomputed values in `ExpAdjustment` (the 256-entry table from the linked
    SO answer, not reproduced here).
    '''
    tmp = np.array(1512775 * x + 1072632447, dtype=np.int64)
    index = (tmp >> 12) & 0xFF
    return [struct.unpack('>d', struct.pack('>q', val << 32))[0] * ExpAdjustment[i]
            for i, val in zip(index, tmp)]


def taylor_fastexp(x):
    ''' Degree-8 Taylor polynomial in Horner form, rescaled by 1/8!. '''
    return (40320 + x*(40320 + x*(20160 + x*(6720 + x*(1680 + x*(336
            + x*(56 + x*(8 + x)))))))) * 2.4801587301e-5
```
Which leads to the following results:
Note that for `fexp`, the error changes sign periodically, so I took the absolute value.
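To make the sign question from earlier easy to inspect, here is a minimal C++ sketch (not from the thread; the range and step are arbitrary choices) that prints the signed relative error of the uncorrected Schraudolph formula:

```cpp
// Print the signed relative error of the uncorrected bit-trick approximation
// so any systematic bias over ranges of x becomes visible.
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
  for (double x = -5.0; x <= 5.0; x += 0.25) {
    const std::int64_t bits =
        static_cast<std::int64_t>(1512775.0 * x + 1072632447.0) << 32;
    double approx;
    std::memcpy(&approx, &bits, sizeof approx);  // reinterpret bits as double
    std::printf("x = %+5.2f  signed rel. err = %+.3e\n", x,
                (approx - std::exp(x)) / std::exp(x));
  }
  return 0;
}
```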
For the record, the new `(corrected_)fexp` implementations come from this SO post and this reply.
The correction I used was the one posted in the answer, but it might not be perfect, so we can always recompute it.
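One way the recomputation could look (a sketch based on an assumption about how the table is built, not necessarily the SO answer's actual procedure): bucket samples by the same 8-bit index that `corrected_fexp` uses, and average the ratio `exp(x) / fexp(x)` per bucket.

```cpp
// Hypothetical recomputation of the 256-entry ExpAdjustment table.
// Sampling range and sample count are arbitrary choices.
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Uncorrected Schraudolph approximation, mirroring the Python fexp above.
static double schraudolph(double x) {
  const std::int64_t bits =
      static_cast<std::int64_t>(1512775.0 * x + 1072632447.0) << 32;
  double r;
  std::memcpy(&r, &bits, sizeof r);
  return r;
}

std::vector<double> recompute_adjustment(double lo = -5.0, double hi = 5.0,
                                         int n = 1000000) {
  std::vector<double> sum(256, 0.0);
  std::vector<long> count(256, 0);
  for (int k = 0; k < n; ++k) {
    const double x = lo + (hi - lo) * k / (n - 1);
    const std::int64_t tmp =
        static_cast<std::int64_t>(1512775.0 * x + 1072632447.0);
    const int idx = static_cast<int>((tmp >> 12) & 0xFF);
    sum[idx] += std::exp(x) / schraudolph(x);  // per-sample correction factor
    ++count[idx];
  }
  std::vector<double> table(256, 1.0);
  for (int i = 0; i < 256; ++i) {
    if (count[i] > 0) {
      table[i] = sum[i] / count[i];
    }
  }
  return table;
}
```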
[EDIT:] there is also this set of C implementations, which I did not have time to test.
Thanks! The last link you posted not only provides C implementations but also promises a Mathematica notebook with background information. I don't have time to look at it right now.
I found two relevant publications, with the second building on the first:
There may be later publications building on these two.
@ingablundell Could you comment?
This is very interesting; maybe it would be useful to have a generic fast-exp function in the NEST numerics file?
I see no problem in using one of the above approximations for `exp`. I would have thought that `std::exp` is already implemented as a Taylor sum? I would also have suggested checking for C implementations. I can't really comment on computation time.
Should be checked against NESTML.
This issue/code and suggestions could also mutually benefit from the lookup-table version of the exponential function that was recently implemented by @suku248.
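For readers unfamiliar with the approach, here is a generic, illustrative sketch of a lookup-table exponential with linear interpolation; this is an assumption-level example, not @suku248's actual implementation:

```cpp
// Tabulate exp() at equally spaced nodes and interpolate linearly between them.
#include <array>
#include <cmath>
#include <cstddef>

template <std::size_t N = 1024>
class ExpTable {
 public:
  ExpTable(double lo, double hi) : lo_(lo), step_((hi - lo) / (N - 1)) {
    for (std::size_t i = 0; i < N; ++i) {
      table_[i] = std::exp(lo_ + i * step_);
    }
  }

  // Valid for lo <= x <= hi; out-of-range callers should fall back to std::exp.
  double operator()(double x) const {
    const double pos = (x - lo_) / step_;
    const std::size_t i = static_cast<std::size_t>(pos);
    if (i >= N - 1) {
      return table_[N - 1];
    }
    const double frac = pos - static_cast<double>(i);
    return table_[i] + frac * (table_[i + 1] - table_[i]);
  }

 private:
  double lo_;
  double step_;
  std::array<double, N> table_;
};
```

Accuracy is set by the node spacing: linear interpolation of `exp` has a relative error of roughly `step^2 / 8`, so reaching 1e-6 requires a spacing of about 3e-3, i.e., a few hundred nodes per unit of argument range.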
NESTML is probably not the right place to address this. Where does the user decide that they want `nest::fastexp()` rather than `std::exp()`? This probably has to be decided prior to building NEST, in the form of a build parameter (`-Dfast-exp` passed to cmake, or similar).
I would suggest replacing all `std::exp()` calls in all models with `nest::fastexp()` calls, and then making `fastexp()` an inline function that has the necessary preprocessor checks; something like

```cpp
inline double fastexp(const double x) {
#ifdef FAST_EXP
  // ...corrected_fexp() implementation...
#else
  return std::exp(x);
#endif
}
```
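For concreteness, a minimal sketch of what the `FAST_EXP` branch could contain, using the uncorrected Schraudolph constants from the Python code above (the corrected version would additionally multiply by the appropriate `ExpAdjustment` entry):

```cpp
// Sketch only: uncorrected bit-trick approximation in the FAST_EXP branch.
// A corrected version would multiply by ExpAdjustment[(tmp >> 12) & 0xFF],
// as in the Python corrected_fexp above.
#include <cmath>
#include <cstdint>
#include <cstring>

inline double fastexp(const double x) {
#ifdef FAST_EXP
  const std::int64_t tmp =
      static_cast<std::int64_t>(1512775.0 * x + 1072632447.0);
  const std::int64_t bits = tmp << 32;
  double result;
  std::memcpy(&result, &bits, sizeof(result));  // type-pun without aliasing UB
  return result;
#else
  return std::exp(x);
#endif
}
```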
@gtrensch: any comments on this approach?
In the Open NEST Developer Meeting of 6-7-2020, it was decided to:
Issue automatically marked stale!
Currently, the AdExp implementation uses `std::exp` to compute the spike-generation current. From some work I was doing on NNGT, I realized that using the standard exponential is a very bad idea in terms of performance. I am therefore wondering whether it would not be a good idea to replace it with some clever approximation...
But of course, before we do that, the question is: how much precision do we want to ensure? From the previous tests we ran comparing GSL and LSODAR, I think 10e-6 would be more than sufficient.
Thus, if anyone knows a good algorithm to approximate `exp` to that precision, that would be nice; otherwise, I propose to replace it only when the argument is smaller than 1, using the following function:
Relative precision is 1e-10 for x < 0.2, then increases linearly up to 1e-6 for x = 1; this is more than 50 times faster than `std::exp`.
Note that since the large majority of the simulation is spent with `x = (V - V_th) / Delta_T < 1`, this would basically speed up the simulation almost 50 times (the exponential takes up the largest part of the computation time).
All opinions or proposals are welcome! ;)
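For reference, the small-argument approach could look like the degree-8 Taylor variant from the comparison above, written in C++ with a guard on the argument; this is a sketch and not necessarily the exact function referred to:

```cpp
// Degree-8 Taylor polynomial of exp in Horner form (coefficients are 8!/k!,
// rescaled by 1/8! = 2.4801587301e-5), used only for |x| < 1.
#include <cmath>

inline double taylor_fastexp(const double x) {
  if (std::fabs(x) < 1.0) {
    return (40320.0 + x * (40320.0 + x * (20160.0 + x * (6720.0 +
            x * (1680.0 + x * (336.0 + x * (56.0 + x * (8.0 + x)))))))) *
           2.4801587301e-5;
  }
  return std::exp(x);  // fall back to the exact call for large arguments
}
```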