lnls-fac / pyaccel

Python module for beam dynamics tracking and optics calculations
MIT License

rework: non-copying polynomials #134

Closed VitorSouzaLNLS closed 5 months ago

VitorSouzaLNLS commented 6 months ago

trackcpp must be updated (following lnls-fac/trackcpp#70) for these changes to work.

VitorSouzaLNLS commented 6 months ago

I ran some benchmarks to test the optimisation. They consist of two tests: 1. a tracking comparison with and without rotation errors in the dipoles (which modify polynomials A and B), and 2. a direct "get and set" access comparison.

The tests were run in separate conda envs: sirius and static-sirius. The sirius env has pyaccel @ lnls-fac/pyaccel#134 and trackcpp @ lnls-fac/trackcpp#70. The static-sirius env keeps pyaccel and trackcpp at master.

  1. Tracking comparison. The script can be found in the image at the bottom of this comment.

Master packages:

seed = 123
time to set R errors: 0.0
time to track 1000 turns: 2.1539626121520996
time to remove R errors: 0.0
particle out:
        +5.090343983979611584551709657642248885167646221816539764404297e-06
        -2.672436247254870412272060914340293669155812494864221662282944e-09
        -1.383764633161988879263114152542990531458144687348976731300354e-07
        -7.187308913126152246043584770340828526968834921717643737792969e-08
        +4.351952359451681170976478085776761872693896293640136718750000e-03
        -1.914367496884084723918206805137742776423692703247070312500000e-02

time to set R errors: 0.11367011070251465
time to track 1000 turns: 2.1486620903015137
time to remove R errors: 0.11101984977722168
particle out:
        +3.240388223648277683753768374508297256397781893610954284667969e-05
        -4.024638141951858937277880512439764970622491091489791870117188e-06
        +1.105937292938897808852391491107880483468761667609214782714844e-04
        -3.184947121240178660945066568821459895843872800469398498535156e-05
        +4.351546717365082674044973742866204702295362949371337890625000e-03
        -1.915489956647268426914720862441754434257745742797851562500000e-02

The modified packages:

seed = 123
time to set R errors: 0.0
time to track 1000 turns: 2.1565747261047363
time to remove R errors: 0.0
particle out:
        +5.090343983979611584551709657642248885167646221816539764404297e-06
        -2.672436247254870412272060914340293669155812494864221662282944e-09
        -1.383764633161988879263114152542990531458144687348976731300354e-07
        -7.187308913126152246043584770340828526968834921717643737792969e-08
        +4.351952359451681170976478085776761872693896293640136718750000e-03
        -1.914367496884084723918206805137742776423692703247070312500000e-02

time to set R errors: 0.050624847412109375
time to track 1000 turns: 2.1532840728759766
time to remove R errors: 0.04798269271850586
particle out:
        +3.240388223648277683753768374508297256397781893610954284667969e-05
        -4.024638141951858937277880512439764970622491091489791870117188e-06
        +1.105937292938897808852391491107880483468761667609214782714844e-04
        -3.184947121240178660945066568821459895843872800469398498535156e-05
        +4.351546717365082674044973742866204702295362949371337890625000e-03
        -1.915489956647268426914720862441754434257745742797851562500000e-02

As seen, the tracking itself was preserved (both performance and tracked result), while setting and removing rotation errors (which requires access to polynom_a and polynom_b for the dipoles) became roughly 2.2 to 2.3 times faster.
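To make the comparison concrete, the speed-up factors can be computed directly from the timings reported above. This is only a small arithmetic sketch; the dictionary names are mine, and the numbers are copied from the runs with rotation errors applied.

```python
# Timings in seconds, copied from the benchmark output above
# (runs with rotation errors applied).
master = {
    "set": 0.11367011070251465,
    "track": 2.1486620903015137,
    "remove": 0.11101984977722168,
}
modified = {
    "set": 0.050624847412109375,
    "track": 2.1532840728759766,
    "remove": 0.04798269271850586,
}

# Ratio > 1 means the modified packages are faster for that step.
speedup = {step: master[step] / modified[step] for step in master}

for step, ratio in speedup.items():
    print(f"{step:>6}: {ratio:.2f}x")
```

The set and remove steps come out a little above 2x, while the tracking ratio stays within about 1% of 1, which is consistent with the tracking code path being untouched.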

  2. Direct access. For both environments, the following lines were the same:
    (env) $ ipython
    In [1]: import pyaccel
    In [2]: elem = pyaccel.elements.Element()

Master packages:

In [3]: %timeit elem.polynom_b
8.09 µs ± 68.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [4]: %timeit elem.polynom_b = [1,2,3,4,5,6]
1.52 µs ± 37.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [5]: %timeit elem.polynom_b[0] = 3.14
9.96 µs ± 358 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The modified packages:

In [3]: %timeit elem.polynom_b
1.82 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [4]: %timeit elem.polynom_b = [1,2,3,4,5,6]
1.43 µs ± 108 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [5]: %timeit elem.polynom_b[0] = 3.14
1.81 µs ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

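As seen, the polynomial changes in lnls-fac/trackcpp#70 together with the "thisown" change in lnls-fac/pyaccel#134 speed up access to the Elements' polynomials: pyaccel now exposes the std::vector polynomials from trackcpp as non-copying numpy arrays. Reading polynom_b and assigning to a single coefficient each drop from about 8 to 10 µs to about 1.8 µs, while assigning a whole list is essentially unchanged (1.52 µs versus 1.43 µs).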
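For readers unfamiliar with the copying versus non-copying distinction, here is a minimal sketch using only the standard library. `ToyElement` and its two accessors are hypothetical and are not the pyaccel or trackcpp API; an `array.array` stands in for the C++ std::vector, and a `memoryview` stands in for the non-copying numpy array. It is not a claim about how master behaved, only an illustration of the general idea.

```python
from array import array


class ToyElement:
    """Hypothetical stand-in for an element; not the pyaccel class."""

    def __init__(self, coeffs):
        # Backing storage, playing the role of the C++ std::vector<double>.
        self._polynom_b = array("d", coeffs)

    @property
    def polynom_b_copy(self):
        # Copying accessor: every read allocates and fills a new list.
        return list(self._polynom_b)

    @property
    def polynom_b_view(self):
        # Non-copying accessor: a writable window onto the same memory.
        return memoryview(self._polynom_b)


elem = ToyElement([0.0, 0.0, 0.0])

# Writing into the copy only changes a temporary object.
elem.polynom_b_copy[0] = 3.14
after_copy_write = elem.polynom_b_copy[0]

# Writing into the view changes the backing storage itself.
elem.polynom_b_view[0] = 3.14
after_view_write = elem.polynom_b_copy[0]

print(after_copy_write, after_view_write)  # prints "0.0 3.14"
```

A view avoids both the allocation and the element-by-element copy on each access, which is the kind of saving the `%timeit` numbers above point to.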

The image below contains the script for benchmark no. 1.

image