VitorSouzaLNLS closed 5 months ago
I ran some benchmarks to test the optimisation. They consist of two tests: 1. a tracking comparison with and without rotation errors in dipoles (which modify the polynomials A and B), and 2. a direct "get and set" access comparison.
The tests were run in separate conda envs: sirius and static-sirius. The sirius env has pyaccel @ lnls-fac/pyaccel#134 and trackcpp @ lnls-fac/trackcpp#70. The static-sirius env keeps pyaccel and trackcpp at master.
Master packages:
seed = 123
time to set R errors: 0.0
time to track 1000 turns: 2.1539626121520996
time to remove R errors: 0.0
particle out:
+5.090343983979611584551709657642248885167646221816539764404297e-06
-2.672436247254870412272060914340293669155812494864221662282944e-09
-1.383764633161988879263114152542990531458144687348976731300354e-07
-7.187308913126152246043584770340828526968834921717643737792969e-08
+4.351952359451681170976478085776761872693896293640136718750000e-03
-1.914367496884084723918206805137742776423692703247070312500000e-02
time to set R errors: 0.11367011070251465
time to track 1000 turns: 2.1486620903015137
time to remove R errors: 0.11101984977722168
particle out:
+3.240388223648277683753768374508297256397781893610954284667969e-05
-4.024638141951858937277880512439764970622491091489791870117188e-06
+1.105937292938897808852391491107880483468761667609214782714844e-04
-3.184947121240178660945066568821459895843872800469398498535156e-05
+4.351546717365082674044973742866204702295362949371337890625000e-03
-1.915489956647268426914720862441754434257745742797851562500000e-02
The modified packages:
seed = 123
time to set R errors: 0.0
time to track 1000 turns: 2.1565747261047363
time to remove R errors: 0.0
particle out:
+5.090343983979611584551709657642248885167646221816539764404297e-06
-2.672436247254870412272060914340293669155812494864221662282944e-09
-1.383764633161988879263114152542990531458144687348976731300354e-07
-7.187308913126152246043584770340828526968834921717643737792969e-08
+4.351952359451681170976478085776761872693896293640136718750000e-03
-1.914367496884084723918206805137742776423692703247070312500000e-02
time to set R errors: 0.050624847412109375
time to track 1000 turns: 2.1532840728759766
time to remove R errors: 0.04798269271850586
particle out:
+3.240388223648277683753768374508297256397781893610954284667969e-05
-4.024638141951858937277880512439764970622491091489791870117188e-06
+1.105937292938897808852391491107880483468761667609214782714844e-04
-3.184947121240178660945066568821459895843872800469398498535156e-05
+4.351546717365082674044973742866204702295362949371337890625000e-03
-1.915489956647268426914720862441754434257745742797851562500000e-02
As seen, the tracking itself is preserved (both performance and tracked result), while setting/adding and removing rotation errors (which requires access to polynom_a and polynom_b for dipoles) is sped up by a bit more than 2x (about 2.2x for setting and 2.3x for removing).
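The speed-up can be checked directly from the timings reported above. The snippet below is only a small arithmetic sketch: the numbers are copied from the runs with rotation errors, and the dictionary names are made up for illustration.

```python
# Timings in seconds, copied from the benchmark output above
# (second run of each environment, i.e. the one with rotation errors).
master = {
    "set": 0.11367011070251465,
    "track": 2.1486620903015137,
    "remove": 0.11101984977722168,
}
modified = {
    "set": 0.050624847412109375,
    "track": 2.1532840728759766,
    "remove": 0.04798269271850586,
}

# Ratio > 1 means the modified packages are faster for that step.
speedup = {step: master[step] / modified[step] for step in master}

for step, ratio in speedup.items():
    print(f"{step:>6}: {ratio:.2f}x")
```

These are single wall-clock measurements, so the ratios should be read as rough indications rather than precise figures.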
(env) $ ipython
In [1]: import pyaccel
In [2]: elem = pyaccel.elements.Element()
Master packages:
In [3]: %timeit elem.polynom_b
8.09 µs ± 68.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [4]: %timeit elem.polynom_b = [1,2,3,4,5,6]
1.52 µs ± 37.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [5]: %timeit elem.polynom_b[0] = 3.14
9.96 µs ± 358 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
The modified packages:
In [3]: %timeit elem.polynom_b
1.82 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [4]: %timeit elem.polynom_b = [1,2,3,4,5,6]
1.43 µs ± 108 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [5]: %timeit elem.polynom_b[0] = 3.14
1.81 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
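As seen, the polynomial changes in lnls-fac/trackcpp#70 together with the "thisown" handling in lnls-fac/pyaccel#134 speed up access to the polynomials of the Elements by exposing the underlying std::vector as a non-copying numpy array. Reading elem.polynom_b goes from 8.09 µs to 1.82 µs (about 4.4x) and item assignment from 9.96 µs to 1.81 µs (about 5.5x), while assigning a whole list is essentially unchanged (1.52 µs vs 1.43 µs).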
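To illustrate the difference between a copying and a non-copying accessor, here is a minimal pure-Python sketch. It is not the pyaccel or trackcpp implementation: the class names are hypothetical, array.array stands in for the C++ std::vector, and memoryview stands in for the non-copying numpy array.

```python
from array import array


class CopyingElement:
    """Hypothetical element whose getter copies the underlying buffer."""

    def __init__(self, coeffs):
        # Stand-in for the C++ std::vector<double> owned by the element.
        self._buf = array("d", coeffs)

    @property
    def polynom_b(self):
        # A new list is built on every access: the cost grows with the
        # length, and writes to the result never reach the buffer.
        return list(self._buf)


class ViewElement:
    """Hypothetical element whose getter exposes the buffer without copying."""

    def __init__(self, coeffs):
        self._buf = array("d", coeffs)

    @property
    def polynom_b(self):
        # memoryview shares memory with the buffer (stand-in for a
        # non-copying numpy array), so item assignment writes through.
        return memoryview(self._buf)


copying = CopyingElement([0.0, 1.0, 2.0])
copying.polynom_b[0] = 3.14  # modifies a temporary copy only

view = ViewElement([0.0, 1.0, 2.0])
view.polynom_b[0] = 3.14     # modifies the underlying buffer

print(copying._buf[0], view._buf[0])
```

In this sketch the view-based getter does a constant amount of work per access, which is the kind of saving the timings above suggest for the real accessors.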
The image below contains the script for benchmark no. 1.
trackcpp must be updated (following lnls-fac/trackcpp#70) for the current changes to work.