Avoid use of Numpy nan statistics if possible

toni-neurosc commented 10 months ago

Simple change to check for NaN's before using Numpy's nanmean, nanmedian and nanstd functions, and use their non-nanproof versions when possible. In my test is almost twice as fast, but of course this could be worse performance wise in the case that most/all of the batches contain NaNs, but the cost of checking for NaN's is low so it's probably faster in most cases, although this should be tested with real data containing NaN's.

I also tested different ways to check for NaN's and the fastest was given by this StackOverflow comment

Minor change to add a match/case block to check normalization method, although the speed improvement is not relevant here it fits better imho.

Comparison of time spent on _normalize_and_clip function. Parameters are: NUM_CHANNELS = 5 NUM_DATA = 100000 sfreq = 1000 # Hz feature_freq = 3 # Hz n = 3 # Repeats

With Numpy nan functions:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 2970    0.113    0.000   30.866    0.010 nm_normalization.py:141(_normalize_and_clip)
 2970    0.047    0.000   20.850    0.007 nanfunctions.py:1778(nanstd)
 2970    9.525    0.003   20.803    0.007 nanfunctions.py:1617(nanvar)
 5940    4.958    0.001   10.295    0.002 nanfunctions.py:68(_replace_nan)
 2970    0.406    0.000    9.605    0.003 nanfunctions.py:952(nanmean)
14850    0.064    0.000    9.044    0.001 fromnumeric.py:2177(sum)
17820    0.115    0.000    8.999    0.001 fromnumeric.py:71(_wrapreduction)
20790    8.874    0.000    8.874    0.000 {method 'reduce' of 'numpy.ufunc' objects}
 8910    5.337    0.001    5.337    0.001 {built-in method numpy.array}

Checking for NaN's first (here nan_std, nan_mean are my custom functions):

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 2970    0.085    0.000   17.506    0.006 nm_normalization.py:153(_normalize_and_clip)
 2970    0.052    0.000   12.598    0.004 nm_normalization.py:147(nan_std)
 2970    0.015    0.000   10.342    0.003 fromnumeric.py:3513(std)
 2970    0.063    0.000   10.327    0.003 _methods.py:204(_std)
 2970    6.423    0.002   10.264    0.003 _methods.py:135(_var)
14850    5.403    0.000    5.403    0.000 {method 'reduce' of 'numpy.ufunc' objects}
 5940    4.589    0.001    4.589    0.001 {built-in method builtins.sum}
 2970    0.069    0.000    4.475    0.002 nm_normalization.py:144(nan_mean)
 2970    0.014    0.000    1.876    0.001 fromnumeric.py:3385(mean)
 2970    0.091    0.000    1.862    0.001 _methods.py:101(_mean)
 2970    0.031    0.000    0.332    0.000 nm_normalization.py:242(_clip)
 2970    0.108    0.000    0.262    0.000 type_check.py:403(nan_to_num)
 5940    0.024    0.000    0.141    0.000 fromnumeric.py:2322(any)
 5940    0.047    0.000    0.117    0.000 fromnumeric.py:71(_wrapreduction)

toni-neurosc commented 10 months ago

I have added an additional commit to simplify the code for the _normalize_and_clip function and remove some redundant code. Feel free to reject the change if you don't like this way of doing things.

The change was just a little exercise for me to attempt to make the code branch-less, but this is not a case where this would make a difference in performance anyway, so in the end is just making the code a bit less verbose. The code is branchless nonetheless so that's neat I guess?

Btw, I had to change normalization method to "quantile" in order to test that the code was working correctly:

settings = nm.nm_settings.get_default_settings()
settings["feature_normalization_settings"]["normalization_method"] = "quantile"
settings["raw_normalization_settings"]["normalization_method"] = "quantile"
settings["preprocessing"][0] = "raw_normalization"

and in doing so I noticed that sklearn is calling np.nanpercentile, which is eating 76% of the function execution time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      534    0.046    0.000  175.367    0.328 nm_normalization.py:152(_normalize_and_clip)
      534    0.007    0.000  138.904    0.260 base.py:1135(wrapper)
      534    0.007    0.000  138.743    0.260 _data.py:2632(fit)
      534    2.326    0.004  138.602    0.260 _data.py:2555(_dense_fit)
  1392233    4.133    0.000  135.015    0.000 nanfunctions.py:1228(nanpercentile)

This is also true for the "robust" normalization method:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      540    0.039    0.000  122.450    0.227 nm_normalization.py:152(_normalize_and_clip)
      540    0.006    0.000  121.982    0.226 base.py:1135(wrapper)
      540    3.079    0.006  121.843    0.226 _data.py:1513(fit)
  1412100    6.182    0.000  113.983    0.000 nanfunctions.py:1228(nanpercentile)

nm_bursts is also affected by the high computational cost of calculating percentiles, and I have been looking at optimizing that without success so far

timonmerk commented 9 months ago

That's a great addition @toni-neurosc! I wasn't aware that the numpy nan_mean function is slower than the simple mean function.

Realistically spoken, raw data will have rarely None's. That depends a bit on the used amplifier, but it's quite rare that samples are missed. That might change however if different streams modalities are merged (e.g. movement trace, video recordings etc.) So it's good to check that nevertheless.

For the feature normalization, that's a bit different. I recognized that around 5% of the FOOOF fits fail, resulting in None features.

Also I like the branchless design! I still remember when Python introduced that, so it's a bit of a relict of the <3.10 Python version time :)

One other point, I also realized is that the quantile and robust normalizations are quite slow. It is however something that would be great to adress. Maybe a simpler "percentile" transformation would be more optimal.. Commonly we face the problem that there are movement/cable noise artifacts that exceed many standard deviations of the data. So we would still do the z-score / mean transformation; but take only the 25th till 75th percentile s.t. outliers are excluded.

neuromodulation / py_neuromodulation

Avoid use of Numpy nan statistics if possible #285