AMReX-Astro / Microphysics

common astrophysical microphysics routines with interfaces for the different AMReX codes
https://amrex-astro.github.io/Microphysics

Use autodiff in screening routines #1588

Closed yut23 closed 2 months ago

yut23 commented 2 months ago

This PR gives the same results for test_screening to within roundoff, and performance is about the same or slightly better in most cases. It also adds a page to the docs on how to use the autodiff library.

One notable change is that the templated networks no longer calculate the derivative terms when screening is called from RHS::rhs() (this matches how the pynucastro networks behave). Previously, these terms were calculated unnecessarily whenever integrator.jacobian was set to 1.
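
To illustrate the idea, here is a minimal sketch of forward-mode autodiff driven by a template parameter. It is not the code in this PR and does not use the autodiff library's API: `Dual`, `toy_screening_factor`, and `self_check` are hypothetical names, and the formula is a made-up stand-in for a real screening factor. Calling the routine with a plain `double` does no derivative work, while calling it with a dual number carries the derivative along with the value.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal dual number: a value plus one derivative.
// This is NOT the autodiff library's type; it only illustrates the idea.
struct Dual {
    double val;  // f(x)
    double der;  // df/dx
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
inline Dual operator*(Dual a, Dual b) {
    return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
inline Dual operator*(double c, Dual a) { return {c * a.val, c * a.der}; }
inline Dual exp(Dual a) {
    const double e = std::exp(a.val);
    return {e, e * a.der};  // chain rule
}

// Hypothetical stand-in for a screening factor: f(T) = exp(T^2/2 + T).
// With number_t = double only the value is computed; with number_t = Dual
// the same source also propagates df/dT.
template <typename number_t>
number_t toy_screening_factor(number_t T) {
    using std::exp;  // doubles use std::exp, Dual is found by ADL
    number_t h = 0.5 * T * T + T;
    return exp(h);
}

// Compare the propagated derivative with the hand-derived one,
// df/dT = (T + 1) * exp(T^2/2 + T), at a sample point.
inline void self_check() {
    const double T = 2.0;
    const double value_only = toy_screening_factor(T);         // RHS-style call
    const Dual with_der = toy_screening_factor(Dual{T, 1.0});  // Jacobian-style call
    const double analytic = (T + 1.0) * std::exp(0.5 * T * T + T);
    assert(std::fabs(with_der.val - value_only) < 1e-12 * value_only);
    assert(std::fabs(with_der.der - analytic) < 1e-12 * analytic);
}
```

Seeding the input with `der = 1.0` selects the variable to differentiate with respect to. Picking the number type at the call site is one way to skip derivative work on the RHS path without maintaining a second copy of the formula.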

yut23 commented 2 months ago

Performance comparisons, using test_screening_templated with aprox21, best of 10 runs.

CPU, with derivatives (n_cell=128, loops=1):

- screen5:        3.43s ->  3.37s (-0.05s,  -2%)
- chugunov2007:   9.47s ->  9.34s (-0.13s,  -1%)
- chugunov2009:  18.53s -> 16.58s (-1.95s, -11%)
- chabrier1998:   5.70s ->  5.15s (-0.55s, -10%)
  with q.c.:      6.49s ->  5.82s (-0.67s, -10%)

CPU, without derivatives (n_cell=128, loops=1):

- screen5:        2.71s ->  2.83s (+0.12s, +4%)
- chugunov2007:   8.47s ->  8.50s (+0.03s, +0%)
- chugunov2009:  16.41s -> 15.02s (-1.39s, -8%)
- chabrier1998:   4.43s ->  4.44s (+0.02s, +0%)
  with q.c.:      4.63s ->  4.59s (-0.04s, -1%)

CUDA, with derivatives (CUDA_LTO=TRUE, n_cell=64, loops=100):

- screen5:        1.70s ->  1.77s (+0.07s,  +4%)
- chugunov2007:   3.33s ->  2.62s (-0.71s, -21%)
- chugunov2009:   6.98s ->  7.84s (+0.86s, +12%)
- chabrier1998:   3.51s ->  3.00s (-0.51s, -15%)
  with q.c.:      4.11s ->  3.42s (-0.69s, -17%)

CUDA, without derivatives (CUDA_LTO=TRUE, n_cell=64, loops=100):

- screen5:        1.36s ->  1.38s (+0.02s, +1%)
- chugunov2007:   2.17s ->  2.15s (-0.02s, -1%)
- chugunov2009:   5.21s ->  5.21s (+0.00s, +0%)
- chabrier1998:   2.12s ->  2.12s (-0.00s, -0%)
  with q.c.:      2.34s ->  2.34s (+0.00s, +0%)
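
For reference, the parenthesized figures in the timing lists are the absolute change and the change relative to the "before" time. A small sketch of that arithmetic, with `Change` and `timing_change` as hypothetical helper names that are not part of the test harness:

```cpp
#include <cassert>
#include <cmath>

// Absolute and relative change between two wall-clock timings.
struct Change {
    double delta;    // after - before, in seconds (negative means faster)
    double percent;  // delta as a percentage of the "before" time
};

inline Change timing_change(double before, double after) {
    assert(before > 0.0);  // a relative change needs a positive baseline
    const double delta = after - before;
    return {delta, 100.0 * delta / before};
}
```

For example, `timing_change(18.53, 16.58)` gives a delta of about -1.95 s and a relative change of about -10.5%, which rounds to the -11% listed for chugunov2009 on CPU with derivatives.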

yut23 commented 2 months ago

Needs #1593

zingale commented 2 months ago

test suite run: http://groot.astro.sunysb.edu/Microphysics/test-suite/gfortran/2024-06-27-001/index.html

zingale commented 2 months ago

All the Jacobian diffs in the test suite are at roundoff level, so this seems to be working well.