Fixes #142: Optimized batch norm evaluation

Results from running the provided benchmark tests:

Before patch

------------------------------------------------------------------------- benchmark 'evaluate-splineobject': 6 tests -------------------------------------------------------------------------
Name (time in ms)              Min                 Max                Mean            StdDev              Median               IQR            Outliers       OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_eval                   2.1336 (1.0)        3.6244 (1.12)       2.2080 (1.0)      0.1554 (1.91)       2.1655 (1.0)      0.0470 (1.0)         22;37  452.9032 (1.0)         394           1
test_deriv                  2.1448 (1.01)       3.3083 (1.02)       2.2277 (1.01)     0.1209 (1.49)       2.1773 (1.01)     0.0896 (1.91)        53;27  448.9014 (0.99)        399           1
test_eval_rational          2.4706 (1.16)       3.2279 (1.0)        2.5387 (1.15)     0.0811 (1.0)        2.5080 (1.16)     0.0597 (1.27)        42;34  393.9039 (0.87)        343           1
test_deriv_rational         4.4386 (2.08)       6.0427 (1.87)       4.5903 (2.08)     0.2058 (2.54)       4.5086 (2.08)     0.1949 (4.15)         11;6  217.8485 (0.48)        195           1
test_tangent              180.6323 (84.66)    181.4866 (56.22)    181.0548 (82.00)    0.3635 (4.48)     181.0585 (83.61)    0.7582 (16.14)         4;0    5.5232 (0.01)          6           1
test_tangent_rational     188.8721 (88.52)    190.3722 (58.98)    189.3589 (85.76)    0.6047 (7.45)     189.0927 (87.32)    0.9195 (19.57)         1;0    5.2810 (0.01)          6           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After patch

----------------------------------------------------------------------- benchmark 'evaluate-splineobject': 6 tests -----------------------------------------------------------------------
Name (time in ms)             Min                Max               Mean            StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_eval                  2.1555 (1.0)       6.7014 (1.81)      2.2570 (1.00)     0.2943 (1.96)      2.1979 (1.0)      0.0671 (1.16)         9;34  443.0643 (1.00)        337           1
test_deriv                 2.1653 (1.00)      3.7060 (1.0)       2.2549 (1.0)      0.1503 (1.0)       2.2156 (1.01)     0.0580 (1.0)         18;25  443.4744 (1.0)         336           1
test_eval_rational         2.4703 (1.15)      6.2840 (1.70)      2.5772 (1.14)     0.2434 (1.62)      2.5287 (1.15)     0.0668 (1.15)        15;34  388.0203 (0.87)        330           1
test_deriv_rational        4.4355 (2.06)     28.8044 (7.77)      4.7678 (2.11)     1.8477 (12.29)     4.5338 (2.06)     0.1082 (1.87)         3;20  209.7384 (0.47)        202           1
test_tangent               6.8351 (3.17)     17.9373 (4.84)      7.1267 (3.16)     0.9794 (6.52)      6.9756 (3.17)     0.1670 (2.88)          4;9  140.3168 (0.32)        133           1
test_tangent_rational     13.8013 (6.40)     15.3544 (4.14)     14.1151 (6.26)     0.2748 (1.83)     14.0532 (6.39)     0.2666 (4.60)         12;3   70.8461 (0.16)         70           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

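Results: roughly 25 times speedup on test_tangent (mean 181.05 ms to 7.13 ms) and roughly 13 times on test_tangent_rational (mean 189.36 ms to 14.12 ms). The other four benchmarks are unchanged within noise.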
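
The speedup can be checked directly from the Mean column of the two tables. The snippet below is a minimal sketch of that arithmetic and is not part of the patch. The dictionary names and the `speedups` helper are made up for illustration; the numbers are copied from the benchmark output above.

```python
# Mean times in ms, copied from the "Before patch" and "After patch" tables.
before = {
    "test_eval": 2.2080,
    "test_deriv": 2.2277,
    "test_eval_rational": 2.5387,
    "test_deriv_rational": 4.5903,
    "test_tangent": 181.0548,
    "test_tangent_rational": 189.3589,
}
after = {
    "test_eval": 2.2570,
    "test_deriv": 2.2549,
    "test_eval_rational": 2.5772,
    "test_deriv_rational": 4.7678,
    "test_tangent": 7.1267,
    "test_tangent_rational": 14.1151,
}


def speedups(before, after):
    """Return the before/after mean-time ratio per benchmark (>1 means faster)."""
    return {name: before[name] / after[name] for name in before}


ratios = speedups(before, after)
for name, ratio in ratios.items():
    print(f"{name:24s} {ratio:6.2f}x")
```

Using the mean rather than the minimum keeps the comparison on the same column for every test. The Max and StdDev columns in the "After patch" run contain a few outliers (for example test_deriv_rational), so the median would be a reasonable alternative.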

SINTEF / Splipy

Fixes #142: Optimized batch norm evaluation #144
