tud-zih-energy / FIRESTARTER

FIRESTARTER: A Processor Stress Test Utility. This repository contains the source code generator. Our releases (including the generated source code and precompiled binaries) are available at https://tu-dresden.de/zih/firestarter/.
GNU General Public License v3.0

FLOPs and Bandwidth report for GPUs #50

Open amir-raoofy opened 1 year ago

amir-raoofy commented 1 year ago

I noticed that running FIRESTARTER on GPUs (NVIDIA GPUs, to be precise) with the "-r" flag only reports the estimated CPU FLOPs and bandwidth. Is there a way to get the estimated FLOPs and bandwidth for the GPU? If not, is there a plan to support this?

Another point is that the log does not explicitly state that the reported FLOPs and bandwidth refer to the CPU. I find it a bit confusing to get these statistics without any indication of whether they belong to the CPU or the GPU, especially when FIRESTARTER has been built for GPUs.

rschoene commented 3 months ago

I would check for the FLOPS values on the GPUs. However, we cannot really make an assumption about the bandwidth for GPUs, since the implementation of gemm allows for a lot of optimizations. Would that be sufficient, or would it lead to more confusion (FLOPS for both, but bandwidth only for the host)?
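To illustrate why a FLOPS estimate is feasible for gemm while a bandwidth estimate is not: the arithmetic work of a dense matrix multiply is fixed by the matrix dimensions, whereas the memory traffic depends on how the library tiles and caches the operands. The following is only a minimal sketch of that counting argument, not the code used in FIRESTARTER; the function names `gemmFlops` and `estimatedGflops` and all parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Floating-point operations for one dense GEMM C = alpha*A*B + beta*C,
// with A of size m x k and B of size k x n. Each of the m*n output
// elements needs k multiplications and k additions; the alpha/beta
// scaling is ignored here as a lower-order term (an assumption).
constexpr std::uint64_t gemmFlops(std::uint64_t m, std::uint64_t n,
                                  std::uint64_t k) {
    return 2 * m * n * k;
}

// Estimated rate in GFLOP/s for `calls` identical GEMMs that completed
// within `seconds` of wall-clock time (both measured by the caller).
inline double estimatedGflops(std::uint64_t m, std::uint64_t n,
                              std::uint64_t k, std::uint64_t calls,
                              double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(gemmFlops(m, n, k)) *
           static_cast<double>(calls) / seconds / 1e9;
}
```

For example, five 1000 x 1000 x 1000 multiplications finishing in one second would correspond to an estimate of 10 GFLOP/s. No comparable formula exists for the bytes moved, which is why a bandwidth figure would have to come from measurement rather than from the problem size.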

amir-raoofy commented 2 months ago

Thanks. My main point is that when someone builds FIRESTARTER with GPU support (e.g., with CUDA), there is a good chance they are after the numbers associated with the GPU. However, the benchmark only reports values without mentioning that the numbers are associated with the CPU and not the GPU, which could be misleading.

GPU estimates (FLOPS and/or bandwidth rates) would be good to have. But even if estimates cannot be provided for some reason, I think that is still fine as long as the confusion about what these numbers refer to is addressed.

rschoene commented 2 months ago

I'm working on it in the branch gpu_flops.

rschoene commented 2 months ago

(comments and remarks are welcome)

rschoene commented 2 months ago

@amir-raoofy It should work now; please provide feedback.