MFlowCode / MFC

Exascale simulation of multiphase/physics fluid dynamics
https://mflowcode.github.io
MIT License
132 stars 56 forks source link

Fix Benchmarking #423

Closed wilfonba closed 1 month ago

wilfonba commented 1 month ago

Description

This PR fixes several issues related to benchmarking. It fixes the cases such that all cases run, it records runtime using floating point numbers, and it now fails gracefully if any run time is non-positive.

Fixes #419 #415 #394 #393

Type of change

Scope

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Ran the benchmarking suite on Wingtip two times with ./mfc.sh bench -o asdf[hjkl] and performed a ./mfc.sh bench asdf hjkl to verify the comparison. I also intentionally added a zero runtime to each output individually to ensure that the graceful failure performs as expected.

Checklist

sbryngelson commented 1 month ago

Still showing

 Comparing Bencharks: master/bench-cpu.yaml is x times slower than pr/bench-cpu.yaml.

  Case                     Pre Process   Simulation   Post Process  
 ────────────────────────────────────────────────────────────────── 
  viscous_weno5_sgb_mono         0.69x          N/A            N/A  
  hypo_hll                       0.08x          N/A            N/A  
  ibm                            1.29x          N/A            N/A  
  5eq_rk3_weno3_hllc             1.09x          N/A            N/A                                                                

in CI (CPU).

sbryngelson commented 1 month ago

logs look good to me. we won't know that bench diff is working perfectly until merge i suppose

wilfonba commented 1 month ago

I realized there was a small inaccuracy in how I was handling missing targets. That's been fixed and this should be ready now.

henryleberre commented 1 month ago

@sbryngelson It looks like the Frontier CI is failing here because of an issue I am fixing in #422. I think we can ignore this failure. Conversely, Benchmarking CI also fails there.

henryleberre commented 1 month ago

Also, I do not understand why #422 invokes ./mfc.sh bench_diff using the code from this PR: https://github.com/MFlowCode/MFC/actions/runs/9141197606/job/25135331754.

sbryngelson commented 1 month ago

Not sure I understand the status of this PR

wilfonba commented 1 month ago

It's weird that @henryleberre PR used the bench_diff from this PR. I think if you rerun the Frontier test suite and it passes everything should be good to merge.

henryleberre commented 1 month ago

@wilfonba If you rebase on master, it should pass Frontier CI.

sbryngelson commented 1 month ago

@wilfonba If you rebase on master, it should pass Frontier CI.

@henryleberre I think when you posted this it had already passed Fronter CI 🤔

sbryngelson commented 1 month ago

Merging pending a small comment from @henryleberre:

FYI there is a typo in the print for the if target.name not in rhs_summary: condition.

All tests pass except benchmarking (naturally and a good sign).

The next step is creating a “dummy” PR that introduces a performance degradation or a NaN and marking sure benchmarking works as expected (indicating either slowdown or that the case didn’t run).