paudetseis / RfPy

Teleseismic receiver function calculation and post-processing
https://paudetseis.github.io/RfPy/
MIT License
39 stars 28 forks source link

Optimize harmonic processing speed with numba #36

Open anangsahroni opened 2 years ago

anangsahroni commented 2 years ago

Hi! Thank you very much for the code, I use it heavily on my thesis.

As mentioned in #7 which one of the todo list items is to convert the code to low level language to increase the performance, here I provide a numba-optimized code for rfpy.harmonics. numba translates Python functions to optimized machine code at runtime, it can be an alternative or temporary solution to performance optimization without rewriting the code to lower level language.

I optimized rfpy_harmonics and created a simple performance comparison by running the script for station MMPY for 9 different scenarios depending on the number of receiver functions (nrf), each of the scenario is executed based on two cases: with and without numba.

The device and environment for testing:

ROG Zephyrus G14 GA401QH_GA401QH AMD Ryzen 7 5800HS with Radeon Graphics (16 CPUs), ~3.2GHz 8192MB RAM Ubuntu 18.04 WSL2, Windows 11 Python 3.8.13

The input parameters and the result:

Table:

station start end trange nrf time_without_numba time_with_numba
MMPY 2017-01-01 2018-03-30 2-10 137 691.202392578125 47.04966640472412
MMPY 2017-01-01 2017-11-06 2-10 120 630.3968324661255 42.97214984893799
MMPY 2017-01-01 2017-09-23 2-10 100 537.779358625412 33.96166014671326
MMPY 2017-01-01 2017-08-10 2-10 80 420.6219515800476 25.681000471115112
MMPY 2017-01-01 2017-07-14 2-10 60 314.8274037837982 21.694521188735962
MMPY 2017-01-01 2017-06-15 2-10 41 91.12138342857361 8.571951150894165
MMPY 2017-01-01 2017-05-15 2-10 30 68.21855211257935 6.957925796508789
MMPY 2017-01-01 2017-04-09 2-10 21 52.25753951072693 5.868242502212524
MMPY 2017-01-01 2017-02-23 2-10 11 32.92144703865051 4.929212331771851

Graph:

test_numba_result

As shown on the table and graph, numba can significantly increase the performance.

Changes

  1. Added _dcomp_calculate_harmonics_isolated static method to isolate harmonics calculation based on numba, obspy.Stream object in this function is converted to numpy.array to accomodate numba
  2. Added use_numba as argument for rfpy.harmonics.Harmonic.dcomp_find_azim, when this argument is True then the processing will be executed by numba
  3. Added progressbar when processing is running using tqdm and numba_progressbar
  4. Added --use-numba argument for rfpy_harmonics
  5. Added numba and numba_progressbar in setup.py
  6. Added documentation in rfpy.rst and tutorial.rst

Notes

  1. Because of the current behavior of numba as mentioned here, canceling processing with CTRL+C is currently not supported. Actually I have managed to successfully solve this problem on branch harmonics-numba-sigint but it requires GCC (Linux) or MSVC (Windows) to work.
  2. Sometimes the returned azimuth can be 180 degrees shifted compared to no numba because of the floating point behavior of numba as mentioned here and here. For example, when running the first scenario on the table (137 nrf), the variances of C1 for index 83 and 173, corresponding to azimuth 166 and 346 respectively:
    • with numba: 0.0008908156080249394 and 0.0008908156080249397
    • without numba: 0.0008908156080249395 and 0.0008908156080249395
  3. numba will be a little bit slower in the first run because of compilation

Screenshots

linux_ss windows_ss