tardis-sn / tardis

TARDIS - Temperature And Radiative Diffusion In Supernovae
https://tardis-sn.github.io/tardis

Run the loop in parallel #2769

Closed Sumit112192 closed 3 months ago

Sumit112192 commented 3 months ago

:pencil: Description

Type: :roller_coaster: infrastructure

Run the `finalize_array` loop in `montecarlo_main_loop` in parallel.

tardis-bot commented 3 months ago

*beep* *bop* Hi human, I ran ruff on the latest commit (1a47b9bea931c7e7f8b3cfcb2028c58f8e13759c). Here are the outputs produced. Results can also be downloaded as artifacts here.

Summarised output:

```diff
```

Complete output (might be large):

```diff
```

Sumit112192 commented 3 months ago

@andrewfullard Are there any disadvantages to running the parallel loop that I made?

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 69.25%. Comparing base (7231707) to head (1a47b9b). Report is 4 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...ardis/transport/montecarlo/montecarlo_main_loop.py | 0.00% | 2 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #2769      +/-   ##
==========================================
- Coverage   69.86%   69.25%   -0.61%
==========================================
  Files         192      196       +4
  Lines       15048    15002      -46
==========================================
- Hits        10513    10390     -123
- Misses       4535     4612      +77
```


tardis-bot commented 3 months ago

*beep* *bop*

Hi, human.

The docs workflow has succeeded :heavy_check_mark:


Sumit112192 commented 3 months ago

Benchmark the change to check whether threading overhead outweighs the gain from parallel execution and ends up increasing the run time.
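A benchmark of this kind is easier to interpret when the setup and the loop under test are timed separately, so the share of the run that could benefit from parallelism is visible. The following is only a minimal pure-Python sketch of that idea: `time_phases`, `build_buffers`, and `trim_buffers` are hypothetical names invented for illustration, and the list operations merely stand in for constructing trackers and finalizing their arrays.

```python
import time


def time_phases(setup, work, repeats=5):
    """Return the best-of-`repeats` wall time (seconds) for each phase."""
    best_setup = float("inf")
    best_work = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        state = setup()          # phase 1: build the data (not under test)
        t1 = time.perf_counter()
        work(state)              # phase 2: the loop being evaluated
        t2 = time.perf_counter()
        best_setup = min(best_setup, t1 - t0)
        best_work = min(best_work, t2 - t1)
    return {"setup": best_setup, "work": best_work}


# Hypothetical stand-ins: allocate fixed-length buffers, then trim each one.
def build_buffers(n=10_000, length=100):
    return [[0.0] * length for _ in range(n)]


def trim_buffers(buffers, keep=10):
    for i in range(len(buffers)):
        buffers[i] = buffers[i][:keep]


timings = time_phases(build_buffers, trim_buffers)
share = timings["work"] / (timings["setup"] + timings["work"])
print(f"setup: {timings['setup']:.4f} s, work: {timings['work']:.4f} s")
print(f"share of run time spent in the loop under test: {share:.0%}")
```

When the functions being timed are Numba-compiled, calling each once before measuring keeps JIT compilation time out of the numbers.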

Sumit112192 commented 3 months ago

Without Parallel

```python
from numba import njit, prange
from numba.typed import List
import numpy as np

from tardis.transport.montecarlo.packet_trackers import RPacketTracker


@njit()
def aNumbaFuncWithoutParallel(no_of_packets):
    length = 100  # initial array length passed to each tracker
    rpacket_trackers = List()
    for i in range(no_of_packets):  # serial: build one tracker per packet
        rpacket_trackers.append(RPacketTracker(length))

    for i in range(no_of_packets):  # serial: assign a random interaction count
        random_num_interaction = np.random.randint(2, length)
        rpacket_trackers[i].num_interactions = random_num_interaction
    for rpacket_tracker in rpacket_trackers:  # serial: finalize each tracker
        rpacket_tracker.finalize_array()
```

```
%timeit aNumbaFuncWithoutParallel(40000)
308 ms ± 6.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithoutParallel(100000)
795 ms ± 56.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithoutParallel(200000)
1.52 s ± 70.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithoutParallel(400000)
3.04 s ± 88.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

Sumit112192 commented 3 months ago

With Parallel

```python
from numba import njit, prange
from numba.typed import List
import numpy as np

from tardis.transport.montecarlo.packet_trackers import RPacketTracker


@njit(parallel=True)
def aNumbaFuncWithParallel(no_of_packets):
    length = 100  # initial array length passed to each tracker
    rpacket_tracker = List()
    for i in range(no_of_packets):  # serial: build one tracker per packet
        rpacket_tracker.append(RPacketTracker(length))

    for i in range(no_of_packets):  # serial: assign a random interaction count
        random_num_interaction = np.random.randint(1, length)
        rpacket_tracker[i].num_interactions = random_num_interaction
    for i in prange(no_of_packets):  # parallel: finalize each tracker
        rpacket_tracker[i].finalize_array()
```

```
%timeit aNumbaFuncWithParallel(40000)
/tmp/ipykernel_34215/2781711506.py:12: NumbaTypeSafetyWarning: unsafe cast from uint64 to int64. Precision may be lost.
  rpacket_tracker[i].finalize_array()

346 ms ± 34.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithParallel(100000)
783 ms ± 63.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithParallel(200000)
1.51 s ± 35.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFuncWithParallel(400000)
3.01 s ± 64.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

Sumit112192 commented 3 months ago

Since the runtime is nearly the same even with parallel execution, I am closing this PR.
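
One possible reading of these timings, offered as an assumption rather than something measured in this thread: both benchmark functions also build the tracker list and assign interaction counts serially, so if the `finalize_array` loop is only a small share of the total run time, Amdahl's law caps the achievable speedup close to 1. The sketch below evaluates that bound; the parallel fractions and the worker count of 8 are illustrative values, not measurements.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on overall speedup when only part of a run is parallelized.

    parallel_fraction: share of the serial run time spent in the parallel part.
    workers: number of threads applied to that part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative fractions only: how much an 8-thread loop could help overall.
for fraction in (0.05, 0.5, 0.95):
    bound = amdahl_speedup(fraction, workers=8)
    print(f"parallel share {fraction:.0%}: at most {bound:.2f}x faster")
```

Under this reading, a loop that accounts for about 5% of the run time can improve the total by only about 5%, even with many threads, which would be hard to distinguish from the run-to-run spread reported above.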