ITA-Solar / helita

A Python library for solar physics from the Institute of Theoretical Astrophysics, University of Oslo
BSD 3-Clause "New" or "Revised" License
9 stars 17 forks source link

Get rid of cstagger, use python module with numba #31

Closed tiagopereira closed 3 years ago

tiagopereira commented 3 years ago

This will get rid of all cstagger machinery, cython compilation, etc. and enable simpler functions for the stagger up/down operations.

tiagopereira commented 3 years ago

Interesting you point out about the speed. In my first tests, it was faster.

Here are some benchmarks run on eagle4, for two simulations: one 720x720x1116 and the other 504x504x496:

from helita.sim import bifrost, cstagger, stagger
bb = bifrost.BifrostData("nw072100", snap=525, fdir=".")
bb.nx, bb.ny, bb.nz
(720, 720, 1116)

# Old cstagger:
%timeit cstagger.do(bb.px, 'xup')
1.49 s ± 1.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit  cstagger.do(bb.py, 'yup')
1.59 s ± 64.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit cstagger.do(bb.pz, 'zup')
3.4 s ± 18.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# New stagger:
%timeit stagger.do(bb.px, 'xup')
1.58 s ± 76.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit stagger.do(bb.py, 'yup')
1.54 s ± 66.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit stagger.do(bb.pz, 'zup')
2.4 s ± 228 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

For a 504x504x496 sim:

bb = bifrost.BifrostData("cb24bih", snap=525, fdir=".")
bb.nx, bb.ny, bb.nz
(504, 504, 496)

# Old cstagger:
%timeit -n 5 cstagger.do(bb.px, 'xup')
255 ms ± 69.5 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
%timeit -n 5 cstagger.do(bb.py, 'yup')
181 ms ± 993 µs per loop (mean ± std. dev. of 7 runs, 5 loops each)
%timeit -n 5 cstagger.do(bb.pz, 'zup')
602 ms ± 58.8 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)

# New stagger:
%timeit -n 5 stagger.do(bb.px, 'xup')
373 ms ± 17.9 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
%timeit -n 5 stagger.do(bb.py, 'yup')
332 ms ± 18.8 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)
%timeit -n 5 stagger.do(bb.pz, 'zup')
500 ms ± 25.2 ms per loop (mean ± std. dev. of 7 runs, 5 loops each)

For the up operations, the new stagger is a bit slower for the x and y dimensions, but a bit faster for the z direction. However, it seems that cstagger is slower for the down operations, while stagger takes about the same, so they are more comparable there.

tiagopereira commented 3 years ago

In some cases for a smaller (256, 256, 512) simulation the stagger version got speedups of about 3x for some of the operations, but this doesn't seem to happen for all machines. In any case, I am not seeing it much slower, so maybe the fault lies somewhere else?

M1kol4j commented 3 years ago

I am start thinking that my entire python is slow now, but since I am using more and more Julia now I can simply re-install anaconda and check again. Results you have shown looks awesome. You should not wait with pull request, this speed with and fix for double precision are great update to current version and I will switch to it as soon as possible.

On May 4 2021, at 4:41 pm, Tiago M. D. Pereira @.***> wrote:

In some cases for a smaller (256, 256, 512) simulation the stagger version got speedups of about 3x for some of the operations, but this doesn't seem to happen for all machines. In any case, I am not seeing it much slower, so maybe the fault lies somewhere else? — You are receiving this because your review was requested. Reply to this email directly, view it on GitHub (https://github.com/ITA-Solar/helita/pull/31#issuecomment-831995506), or unsubscribe (https://github.com/notifications/unsubscribe-auth/ACAVX37NYKBRATGNKJ6BLHTTMABTPANCNFSM434CYEVA).

tiagopereira commented 3 years ago

Sorry, this was not ready to be merged yet (need to remove the cstagger part).