Silmathoron opened this issue 7 years ago
@Silmathoron This is an interesting idea. Computational neuroscience would certainly benefit from efficient hardware implementations of `exp()` and `expm1()`... The acceleration you report is indeed amazing: is this for a single neuron or for a large network simulation?
Concerning precision: are deviations from the exact solution evenly distributed around zero? If `fastexp()` were systematically too low or too high over certain ranges of the argument, this could lead to skewed results. In the end, I think we would have to see whether we find any effect on relevant quantities in network simulations (rates, correlations).
Did you develop the `fastexp()` above yourself (and if so, how did you derive it), or do you have a source for it? I assume that there should be publications on fast approximations of the exponential out there.
@heplesser these are good questions:
`std::exp` does represent most of the time spent in the AdExp dynamics... OK, let's make a clean update: I tested four approximations to `exp`:
```python
import struct

import numpy as np


def exp1024(x):
    ''' Compute (1 + x/1024)**1024 via 10 successive squarings. '''
    res = 1. + x / 1024.
    for _ in range(10):
        res *= res
    return res


def fexp(x):
    ''' Using bits conversion (Schraudolph-style bit trick). '''
    # 64-bit ints so the 32-bit shift below cannot overflow
    tmp = np.array(1512775 * x + 1072632447, dtype=np.int64)
    return [struct.unpack('>d', struct.pack('>q', val << 32))[0] for val in tmp]


def corrected_fexp(x):
    '''
    Take advantage of the regularity of fexp to correct the discrepancy with
    precomputed values in `ExpAdjustment` (the 256-entry table from the linked
    SO answer, not reproduced here).
    '''
    tmp = np.array(1512775 * x + 1072632447, dtype=np.int64)
    index = (tmp >> 12) & 0xFF
    return [struct.unpack('>d', struct.pack('>q', val << 32))[0] * ExpAdjustment[i]
            for i, val in zip(index, tmp)]


def taylor_fastexp(x):
    ''' Degree-8 Taylor polynomial in Horner form, rescaled by 1/8!. '''
    return (40320 + x*(40320 + x*(20160 + x*(6720 + x*(1680 + x*(336
            + x*(56 + x*(8 + x)))))))) * 2.4801587301e-5
```
Which leads to the following results:
Note that for `fexp`, the error changes sign periodically, so I took the absolute value.
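To make the sign question from earlier easy to inspect, here is a minimal C++ sketch (not from the thread; the range and step are arbitrary choices) that prints the signed relative error of the uncorrected Schraudolph formula:

```cpp
// Print the signed relative error of the uncorrected bit-trick approximation
// so any systematic bias over ranges of x becomes visible.
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
  for (double x = -5.0; x <= 5.0; x += 0.25) {
    const std::int64_t bits =
        static_cast<std::int64_t>(1512775.0 * x + 1072632447.0) << 32;
    double approx;
    std::memcpy(&approx, &bits, sizeof approx);  // reinterpret bits as double
    std::printf("x = %+5.2f  signed rel. err = %+.3e\n", x,
                (approx - std::exp(x)) / std::exp(x));
  }
  return 0;
}
```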
For the record, the new `(corrected_)fexp` implementations come from this SO post and this reply.
The correction I used was the one posted in the answer, but it might not be perfect, so we can always recompute it.
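One way the recomputation could look (a sketch based on an assumption about how the table is built, not necessarily the SO answer's actual procedure): bucket samples by the same 8-bit index that `corrected_fexp` uses, and average the ratio `exp(x) / fexp(x)` per bucket.

```cpp
// Hypothetical recomputation of the 256-entry ExpAdjustment table.
// Sampling range and sample count are arbitrary choices.
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Uncorrected Schraudolph approximation, mirroring the Python fexp above.
static double schraudolph(double x) {
  const std::int64_t bits =
      static_cast<std::int64_t>(1512775.0 * x + 1072632447.0) << 32;
  double r;
  std::memcpy(&r, &bits, sizeof r);
  return r;
}

std::vector<double> recompute_adjustment(double lo = -5.0, double hi = 5.0,
                                         int n = 1000000) {
  std::vector<double> sum(256, 0.0);
  std::vector<long> count(256, 0);
  for (int k = 0; k < n; ++k) {
    const double x = lo + (hi - lo) * k / (n - 1);
    const std::int64_t tmp =
        static_cast<std::int64_t>(1512775.0 * x + 1072632447.0);
    const int idx = static_cast<int>((tmp >> 12) & 0xFF);
    sum[idx] += std::exp(x) / schraudolph(x);  // per-sample correction factor
    ++count[idx];
  }
  std::vector<double> table(256, 1.0);
  for (int i = 0; i < 256; ++i) {
    if (count[i] > 0) {
      table[i] = sum[i] / count[i];
    }
  }
  return table;
}
```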
[EDIT:] there is also this set of C implementations, which I did not have time to test.
Thanks! The last link you posted not only provides C implementations but also promises a Mathematica notebook with background information. I don't have time to look at it right now.
I found two relevant publications, with the second building on the first:
There may be later publications building on these two.
@ingablundell Could you comment?
This is very interesting; maybe it would be useful to have a generic fast-exp function in the NEST numerics file?
I see no problem in using one of the above approximations for `exp`. I would have thought that `std::exp` is already implemented as a Taylor sum? I would also have suggested checking for C implementations. I can't really comment on computation time.
Should be checked against NESTML.
This issue/code and suggestions could also mutually benefit from the lookup-table version of the exponential function that was recently implemented by @suku248.
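For readers unfamiliar with the approach, here is a generic, illustrative sketch of a lookup-table exponential with linear interpolation; this is an assumption-level example, not @suku248's actual implementation:

```cpp
// Tabulate exp() at equally spaced nodes and interpolate linearly between them.
#include <array>
#include <cmath>
#include <cstddef>

template <std::size_t N = 1024>
class ExpTable {
 public:
  ExpTable(double lo, double hi) : lo_(lo), step_((hi - lo) / (N - 1)) {
    for (std::size_t i = 0; i < N; ++i) {
      table_[i] = std::exp(lo_ + i * step_);
    }
  }

  // Valid for lo <= x <= hi; out-of-range callers should fall back to std::exp.
  double operator()(double x) const {
    const double pos = (x - lo_) / step_;
    const std::size_t i = static_cast<std::size_t>(pos);
    if (i >= N - 1) {
      return table_[N - 1];
    }
    const double frac = pos - static_cast<double>(i);
    return table_[i] + frac * (table_[i + 1] - table_[i]);
  }

 private:
  double lo_;
  double step_;
  std::array<double, N> table_;
};
```

Accuracy is set by the node spacing: linear interpolation of `exp` has a relative error of roughly `step^2 / 8`, so reaching 1e-6 requires a spacing of about 3e-3, i.e., a few hundred nodes per unit of argument range.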
NESTML is probably not the right place to address this. Where does the user decide that they want `nest::fastexp()` rather than `std::exp()`? This probably has to be decided prior to building NEST, in the form of a build parameter (`-Dfast-exp` passed to cmake, or similar).
I would suggest replacing all `std::exp()` calls in all models with `nest::fastexp()` calls, and then making `fastexp()` an inline function that has the necessary preprocessor checks; something like

```cpp
inline double fastexp(const double x) {
#ifdef FAST_EXP
  // ...corrected_fexp() implementation...
#else
  return std::exp(x);
#endif
}
```
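For concreteness, a minimal sketch of what the `FAST_EXP` branch could contain, using the uncorrected Schraudolph constants from the Python code above (the corrected version would additionally multiply by the appropriate `ExpAdjustment` entry):

```cpp
// Sketch only: uncorrected bit-trick approximation in the FAST_EXP branch.
// A corrected version would multiply by ExpAdjustment[(tmp >> 12) & 0xFF],
// as in the Python corrected_fexp above.
#include <cmath>
#include <cstdint>
#include <cstring>

inline double fastexp(const double x) {
#ifdef FAST_EXP
  const std::int64_t tmp =
      static_cast<std::int64_t>(1512775.0 * x + 1072632447.0);
  const std::int64_t bits = tmp << 32;
  double result;
  std::memcpy(&result, &bits, sizeof(result));  // type-pun without aliasing UB
  return result;
#else
  return std::exp(x);
#endif
}
```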
@gtrensch: any comments on this approach?
In the Open NEST Developer Meeting of 6-7-2020, it was decided to:
Issue automatically marked stale!
Currently, the AdExp implementation uses `std::exp` to compute the spike-generation current. From some work I was doing on NNGT, I realized that using the standard exponential is a very bad idea in terms of performance. I am therefore wondering whether it would not be a good idea to replace it with some clever approximation...
But of course, before we do that, the question is: how much precision do we want to ensure? From the previous tests we ran comparing GSL and LSODAR, I think 10e-6 would be more than sufficient.
Thus, if anyone knows a good algorithm to approximate `exp` to that precision, that would be nice; otherwise, I propose to replace it only when the argument is smaller than 1, using the following function:
Relative precision is 1e-10 for x < 0.2, then increases linearly up to 1e-6 for x = 1; this is more than 50 times faster than `std::exp`.
Note that since the large majority of the simulation is spent with `x = (V - V_th) / Delta_T < 1`, this would basically speed up the simulation almost 50 times (the exponential takes up the largest part of the computation time).
All opinions or proposals are welcome! ;)
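For reference, the small-argument approach could look like the degree-8 Taylor variant from the comparison above, written in C++ with a guard on the argument; this is a sketch and not necessarily the exact function referred to:

```cpp
// Degree-8 Taylor polynomial of exp in Horner form (coefficients are 8!/k!,
// rescaled by 1/8! = 2.4801587301e-5), used only for |x| < 1.
#include <cmath>

inline double taylor_fastexp(const double x) {
  if (std::fabs(x) < 1.0) {
    return (40320.0 + x * (40320.0 + x * (20160.0 + x * (6720.0 +
            x * (1680.0 + x * (336.0 + x * (56.0 + x * (8.0 + x)))))))) *
           2.4801587301e-5;
  }
  return std::exp(x);  // fall back to the exact call for large arguments
}
```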