E3SM-Project / HOMMEXX

Clone of ACME for CMDV-SE project to convert HOMME to C++

Use Kokkos::parallel_scan in PPM Vertical Remap instead of serial scans #280

mfdeakin-sandia closed 6 years ago

mfdeakin-sandia commented 6 years ago

This isn't quite done: it still needs a way to switch between the serial and parallel versions when comparing against the baselines/Fortran. Performance on the P100 with nmax = 100, ne = 8, qsize = 40 looks great though:

HommeTime_stats.no_scan.1
prim_main_loop                     6.073
tl-ae advance_hypervis_dp          0.710
hvf-bhwk                           0.504
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.067
tl-sc vertical_remap               0.377
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.016
Remap Compute Remap Functor        0.338
Remap Rescale States Functor       0.003

HommeTime_stats.no_scan.2
prim_main_loop                     6.084
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.063
tl-sc vertical_remap               0.377
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.016
Remap Compute Remap Functor        0.338
Remap Rescale States Functor       0.003

HommeTime_stats.no_scan.3
prim_main_loop                     6.100
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.066
tl-sc vertical_remap               0.378
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.016
Remap Compute Remap Functor        0.338
Remap Rescale States Functor       0.003

HommeTime_stats.no_scan.4
prim_main_loop                     6.095
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.064
tl-sc vertical_remap               0.377
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.016
Remap Compute Remap Functor        0.338
Remap Rescale States Functor       0.003

HommeTime_stats.vr_scan.1
prim_main_loop                     5.983
tl-ae advance_hypervis_dp          0.710
hvf-bhwk                           0.504
hvf-bexch                          0.188
tl-at prim_advec_tracers_remap_RK2 4.069
tl-sc vertical_remap               0.256
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.008
Remap Compute Remap Functor        0.224
Remap Rescale States Functor       0.003

HommeTime_stats.vr_scan.2
prim_main_loop                     5.972
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.065
tl-sc vertical_remap               0.255
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.008
Remap Compute Remap Functor        0.224
Remap Rescale States Functor       0.003

HommeTime_stats.vr_scan.3
prim_main_loop                     5.965
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.065
tl-sc vertical_remap               0.255
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.008
Remap Compute Remap Functor        0.224
Remap Rescale States Functor       0.003

HommeTime_stats.vr_scan.4
prim_main_loop                     5.974
tl-ae advance_hypervis_dp          0.709
hvf-bhwk                           0.503
hvf-bexch                          0.187
tl-at prim_advec_tracers_remap_RK2 4.065
tl-sc vertical_remap               0.255
Remap Thickness Functor            0.008
Remap Scale States Functor         0.003
Remap Compute Grids Functor        0.008
Remap Compute Remap Functor        0.224
Remap Rescale States Functor       0.003
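
To put a number on the tables above, here is a minimal sketch that averages the four runs of each configuration and takes the ratio. The helper names (`mean`, `speedup`) and the array names are made up for illustration; the values are copied from the `tl-sc vertical_remap` and `prim_main_loop` rows.

```cpp
#include <cassert>
#include <vector>

// Arithmetic mean of a set of timer readings.
double mean(const std::vector<double>& xs) {
  assert(!xs.empty());
  double sum = 0.0;
  for (double x : xs) sum += x;
  return sum / static_cast<double>(xs.size());
}

// Ratio of the mean time before the change to the mean time after it.
double speedup(const std::vector<double>& before,
               const std::vector<double>& after) {
  return mean(before) / mean(after);
}

// "tl-sc vertical_remap" rows, runs 1-4 (copied from the tables above).
const std::vector<double> vr_no_scan = {0.377, 0.377, 0.378, 0.377};
const std::vector<double> vr_scan    = {0.256, 0.255, 0.255, 0.255};

// "prim_main_loop" rows, runs 1-4 (copied from the tables above).
const std::vector<double> main_no_scan = {6.073, 6.084, 6.100, 6.095};
const std::vector<double> main_scan    = {5.983, 5.972, 5.965, 5.974};
```

`speedup(vr_no_scan, vr_scan)` comes out at roughly 1.48, and `speedup(main_no_scan, main_scan)` at roughly 1.02. Within the remap, the Compute Grids functor halves (0.016 to 0.008) and the Compute Remap functor drops from 0.338 to 0.224; the other rows are unchanged to three decimals.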

Next up will be CAAR

ambrad commented 6 years ago

In this or another PR, it would be good to put the parallel (||) scan into CAAR.
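
For readers unfamiliar with the pattern being discussed, here is a rough plain-C++ sketch of a serial running sum next to a blocked scan. It is not the code from this PR and does not use Kokkos: `blocked_scan` and `scan_via_blocks` are hypothetical stand-ins, and the blocks are visited one after another rather than concurrently. The body passed to `blocked_scan` is written in the `(i, update, final)` shape that a `Kokkos::parallel_scan` functor uses, to show why such a body may be invoked once to accumulate and once more to write results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Serial inclusive scan: out[i] = in[0] + ... + in[i].
std::vector<double> serial_scan(const std::vector<double>& in) {
  std::vector<double> out(in.size());
  double running = 0.0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    running += in[i];
    out[i] = running;
  }
  return out;
}

// Hypothetical driver illustrating a blocked scan. On a real parallel
// backend the blocks in pass 1 and pass 3 could run concurrently; here
// they run in order so the sketch stays stdlib-only.
template <class Body>
void blocked_scan(std::size_t n, std::size_t block, Body body) {
  assert(block > 0);
  const std::size_t nblocks = (n + block - 1) / block;
  std::vector<double> block_offset(nblocks, 0.0);

  // Pass 1: each block accumulates its own total (is_final == false).
  for (std::size_t b = 0; b < nblocks; ++b) {
    double update = 0.0;
    const std::size_t end = std::min(n, (b + 1) * block);
    for (std::size_t i = b * block; i < end; ++i) body(i, update, false);
    block_offset[b] = update;
  }

  // Pass 2: exclusive scan of the block totals gives each block's offset.
  double offset = 0.0;
  for (std::size_t b = 0; b < nblocks; ++b) {
    const double total = block_offset[b];
    block_offset[b] = offset;
    offset += total;
  }

  // Pass 3: rerun each block from its offset and let the body write
  // its results (is_final == true).
  for (std::size_t b = 0; b < nblocks; ++b) {
    double update = block_offset[b];
    const std::size_t end = std::min(n, (b + 1) * block);
    for (std::size_t i = b * block; i < end; ++i) body(i, update, true);
  }
}

// Inclusive scan expressed through the blocked driver.
std::vector<double> scan_via_blocks(const std::vector<double>& in,
                                    std::size_t block) {
  std::vector<double> out(in.size());
  blocked_scan(in.size(), block,
               [&](std::size_t i, double& update, bool is_final) {
                 update += in[i];
                 if (is_final) out[i] = update;
               });
  return out;
}
```

For an input such as `{1, 2, 3, 4, 5, 6, 7}` with a block size of 3, both functions return the same running sums. With general floating-point data the blocked version adds terms in a different order, so the last bits can differ from the serial result; that may be one reason to keep a serial path available for bit-for-bit baseline comparisons.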