Open domna opened 2 months ago
I will take a look, I should have a profiling notebook somewhere.
Interestingly if i run the benchmark locally (Fedora, Python 3.10 and 3.12) I don't get any performance regression. Maybe it's a Problem with the runner.
------------------------------------------------------------
Name (time in ms) OPS(2.0.0) OPS(1.26.4)
------------------------------------------------------------
test_solver2x2 513.6902 (1.0) 422.0736 (1.0)
test_solver4x4_linear 379.5551 (0.74) 292.9602 (0.69)
test_solver4x4_eig 65.4104 (0.13) 49.6542 (0.12)
test_solver4x4_expm 3.5880 (0.01) 3.5572 (0.01)
------------------------------------------------------------
I see issues locally, too
------------------------------------------------------------------------------------------ benchmark: 6 tests -----------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_solver2x2 3.3913 (1.0) 11.2985 (1.30) 4.6075 (1.0) 2.4295 (3.04) 3.6041 (1.0) 1.4018 (4.11) 1;1 217.0365 (1.0) 10 1
test_solver4x4_linear 3.7497 (1.11) 8.9823 (1.04) 4.7736 (1.04) 1.6175 (2.02) 4.0946 (1.14) 0.9789 (2.87) 1;1 209.4871 (0.97) 10 1
test_formula_solver2x2 5.9750 (1.76) 8.6650 (1.0) 6.4390 (1.40) 0.8001 (1.0) 6.2300 (1.73) 0.3410 (1.0) 1;1 155.3033 (0.72) 10 1
test_solver4x4_eig 42.4544 (12.52) 67.9676 (7.84) 47.8688 (10.39) 8.3157 (10.39) 44.0447 (12.22) 6.4657 (18.96) 2;1 20.8905 (0.10) 10 1
test_solver4x4_expm 449.6212 (132.58) 540.5948 (62.39) 471.5749 (102.35) 28.4518 (35.56) 462.2354 (128.25) 20.4840 (60.07) 1;1 2.1206 (0.01) 10 1
test_formula_solver4x4_expm 457.4233 (134.88) 800.8481 (92.42) 506.0184 (109.82) 104.2926 (130.35) 477.2739 (132.43) 24.4018 (71.56) 1;1 1.9762 (0.01) 10 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------- benchmark: 6 tests ----------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_solver2x2 3.7060 (1.0) 5.8486 (1.0) 4.0890 (1.0) 0.6535 (1.06) 3.8295 (1.0) 0.4599 (1.05) 1;1 244.5583 (1.0) 10 1
test_solver4x4_linear 4.3303 (1.17) 5.9519 (1.02) 4.9078 (1.20) 0.6175 (1.0) 4.6249 (1.21) 1.2390 (2.83) 3;0 203.7588 (0.83) 10 1
test_formula_solver2x2 5.9571 (1.61) 8.4977 (1.45) 6.4565 (1.58) 0.7682 (1.24) 6.1007 (1.59) 0.4382 (1.0) 1;1 154.8831 (0.63) 10 1
test_solver4x4_eig 114.9540 (31.02) 132.4588 (22.65) 118.5942 (29.00) 6.1394 (9.94) 115.7738 (30.23) 1.1952 (2.73) 2;2 8.4321 (0.03) 10 1
test_solver4x4_expm 393.4410 (106.16) 413.0689 (70.63) 403.1043 (98.58) 5.6653 (9.17) 402.3795 (105.07) 8.5332 (19.47) 2;0 2.4807 (0.01) 10 1
test_formula_solver4x4_expm 397.5010 (107.26) 410.9472 (70.26) 402.0846 (98.33) 4.2841 (6.94) 401.0340 (104.72) 3.6075 (8.23) 3;1 2.4870 (0.01) 10 1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
But mainly for the eigen value solver
The eigenvalue solver is getting faster! In the CI benchmark, the Expm broke down by a factor of 4, which we both can not reproduce.
Edit: The culprit is SciPy. The regression was introduced in between 1.12 and 1.13.
Oopsie, true. Sorry, just quickly checked on the side today and I just saw that something changed. I can confirm that scipy==0.12
is faster by a factor of 4 for me, too.
Here are the stats with numpy-1.26.4
and scipy-1.12.0
:
----------------------------------------------------------------------------------------- benchmark: 6 tests -----------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_solver2x2 3.6657 (1.0) 4.1063 (1.0) 3.7916 (1.0) 0.1357 (1.0) 3.7635 (1.0) 0.1749 (1.18) 1;0 263.7438 (1.0) 10 1
test_solver4x4_linear 4.4456 (1.21) 5.0880 (1.24) 4.6169 (1.22) 0.1938 (1.43) 4.5882 (1.22) 0.1486 (1.0) 1;1 216.5961 (0.82) 10 1
test_formula_solver2x2 5.8515 (1.60) 9.1948 (2.24) 6.4353 (1.70) 1.0004 (7.37) 6.0991 (1.62) 0.4927 (3.32) 1;1 155.3918 (0.59) 10 1
test_solver4x4_expm 81.0214 (22.10) 99.6861 (24.28) 87.8797 (23.18) 6.2004 (45.69) 84.9936 (22.58) 8.0749 (54.35) 3;0 11.3792 (0.04) 10 1
test_formula_solver4x4_expm 82.5994 (22.53) 197.2490 (48.04) 105.7262 (27.88) 34.1210 (251.45) 94.7500 (25.18) 28.3755 (190.97) 1;1 9.4584 (0.04) 10 1
test_solver4x4_eig 115.4885 (31.51) 126.0902 (30.71) 117.8930 (31.09) 3.9314 (28.97) 116.0587 (30.84) 0.9391 (6.32) 2;2 8.4823 (0.03) 10 1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I did not really find anything on a major change of expm
in scipy 1.13
but there are some bugfixes on it in the release notes. Maybe we should build a minimal example just for expm and open an issue at scipy with it.
Indeed when I run this
import time
import numpy as np
from scipy.linalg import expm
mat = np.array(
[
[0.0 - 0.0j, 0.0 - 0.0j, 0.0 - 0.0j, 0.0 - 2.5774387j],
[0.0 - 0.0j, 0.0 - 0.0j, 0.0 + 4.3402231j, 0.0 - 0.0j],
[0.0 - 0.0j, 0.0 + 5.60367362j, 0.0 - 0.0j, 0.0 - 0.0j],
[0.0 - 9.43618706j, 0.0 - 0.0j, 0.0 - 0.0j, 0.0 - 0.0j],
]
)
start_time = time.time()
for _ in range(10000):
expm(mat)
print(f"{(time.time() - start_time) * 1000} ms")
I get ~360 ms
for scipy < 1.13 and 1800 ms
for scipy >=1.13.
hmm... I had a look at the changes of expm
between v1.13.1
and v1.12.0
and the only thing that changes was that there was an explicit formula for 2x2 matrices which has been removed. So there must be a change in some other function expm calls which slows it down.
Yeah, it's not directly obvious what they changed. We can try to report the regression.
On the other hand, I did try out PyTorch again. They have improved performance massively. For me, it is comparable to the eigenvalue solver. Maybe we should make it an alternative again?
-------------------------------------------------
Name (time in ms) OPS
-------------------------------------------------
test_solver2x2 523.0784 (1.0)
test_solver4x4_linear 476.9422 (0.91)
test_formula_solver2x2 279.7011 (0.53)
test_solver4x4_torch_expm 63.3393 (0.12)
test_solver4x4_eig 67.1695 (0.13)
test_solver4x4_expm 4.3523 (0.01)
test_formula_solver4x4_expm 5.7128 (0.01)
-------------------------------------------------
Yeah, it's not directly obvious what they changed.
On the other hand, I did try out PyTorch again. They have improved performance massively. For me, it is comparable to the eigenvalue solver. Maybe we should make it an alternative again?
Yes, sounds good to have an alternative. However, I also tried to use torch
and ran the above example and got even worse performance (~2500 ms I think it was). Can you share your code and libs? Then I'll try to reproduce it.
I would also open an issue at scipy to start a discussion there at least. Maybe someone there might just know what's going wrong.
I have prepared a branch with Pytorch at #181
With ~
numpy==2.0.0
~scipy==1.13.1
the performance of the solver drops tremendously. We should investigate where this is coming from and see if we can fix this by replacing to the proper (quicker) functions.