MFlowCode / MFC

Exascale simulation of multiphase/physics fluid dynamics
https://mflowcode.github.io
MIT License

[website] Expected performance results on website are mostly wrong #520

Closed sbryngelson closed 3 weeks ago

sbryngelson commented 1 month ago

https://mflowcode.github.io/documentation/md_expectedPerformance.html

Most of these numbers are incorrect. It's unclear where things went wrong. @wilfonba and I already confirmed that the A100 numbers are incorrect.

This page should also include an example of exactly how to run the performance test locally. For example, the command ./mfc.sh run -n 8 -j 8 ./examples/3D_performance_test/case.py --case-optimization -t pre_process simulation (or something similar) for CPU runs, with --gpu added for GPU cases.
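A rough sketch of how that command could be assembled for CPU and GPU runs. The flags are copied verbatim from the comment above and have not been re-checked against the current mfc.sh; the helper name perf_test_command is hypothetical and not part of MFC.

```python
import shlex


def perf_test_command(ranks=8, jobs=8, gpu=False):
    """Build the argv list for the performance-test run suggested above.

    All flags are taken from the issue comment; treat them as unverified.
    """
    cmd = [
        "./mfc.sh", "run",
        "-n", str(ranks),            # value passed to -n, as in the comment
        "-j", str(jobs),             # value passed to -j, as in the comment
        "./examples/3D_performance_test/case.py",
        "--case-optimization",
        "-t", "pre_process", "simulation",
    ]
    if gpu:
        cmd.append("--gpu")          # the comment suggests adding this for GPU cases
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them (dry run).
    print(shlex.join(perf_test_command()))
    print(shlex.join(perf_test_command(gpu=True)))
```

Printing rather than executing keeps the sketch safe to run anywhere; passing the list to subprocess.run from the MFC root directory would launch the actual test.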

I ran the 3D_performance_test example with 4M and 8M grid points on my M1 Max on 8 Cores, gfortran 14.1.0 and got:

That is 5x faster than what's on the website for the M2 chip. I know the M1 Max is probably faster than the M2 for this workload, but not 5x faster. Again, @wilfonba replicated this problem on NVIDIA A100s as well. These results should all be updated.

We can remove the Summit performance results in favor of generic V100 test results. We also don't need to have 1, 4, and 8M grid point cases; the numbers are very similar regardless. I think we should just converge on 8M grid points (a 200^3 simulation) for all performance tests, which is big enough to be meaningful but not so big that it overwhelms the memory of any real device.
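For reference, a quick check of the 200^3 figure. The per-field memory number assumes one 8-byte double-precision value per grid point; how many such fields MFC actually stores per point is not stated here, so the total footprint is left out.

```python
def grid_points(n_per_side):
    """Total number of points in a cubic grid with n_per_side points per axis."""
    return n_per_side ** 3


def bytes_per_scalar_field(n_points, bytes_per_value=8):
    """Memory for a single scalar field (assumption: 8-byte doubles)."""
    return n_points * bytes_per_value


if __name__ == "__main__":
    n = grid_points(200)
    print(n)                                # 8000000
    print(bytes_per_scalar_field(n) / 1e6)  # 64.0 (MB per double-precision field)
```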

Open to other suggestions!

sbryngelson commented 1 month ago

I'm gathering some more info, all using 8M grid points. This is everything I have. I didn't run a test on Frontier, but we should also update that number.

Intel Xeon Gold 6226 CPU (Cascade Lake) @ 2.70GHz (on Phoenix), 12 core CPU, best performance using 12 cores, Intel oneAPI 2022.1.0

AMD EPYC 7713 (Milan) 64-core CPU, best performance using 32 cores, GCC 12.1.0

M1 Max, 8 cores, GCC 14.1

RTX6000 (single-precision GPU upconverting to DP in software) @ Phoenix, NVHPC 22.11

A40 (single-precision GPU upconverting to DP in software) @ NCSA Delta, NVHPC 22.11

MI250X (1 GCD), CCE 16.0.1

A30 @ RG, NVHPC 24.1

V100-32GB @ Phoenix, NVHPC 24.5

A100-80GB @ Phoenix, NVHPC 22.11

H100 80GB PCIe @ Rogues Gallery, NVHPC 24.5

GH200 @ Rogues Gallery, NVHPC 24.1, (only the GPU is used)

sbryngelson commented 3 weeks ago

I want to add A40 and RTX____ to this list (single-precision GPUs that convert to DP in software).

Update: Added. I want to add MI100 and MI210 if possible. Working on it.