How do I match the results of profiling with the parameters of the cost model?

The output of profile bandwidth is as follows:

size: 0.25 MB, gpu-to-cpu bandwidth: 5.505 GB/s
size: 32.00 MB, gpu-to-cpu bandwidth: 13.220 GB/s
size: 128.00 MB, gpu-to-cpu bandwidth: 13.324 GB/s

size: 0.25 MB, cpu-to-gpu bandwidth: 4.556 GB/s
size: 32.00 MB, cpu-to-gpu bandwidth: 12.285 GB/s
size: 128.00 MB, cpu-to-gpu bandwidth: 12.251 GB/s

Which of these values correspond to ctog_bdw, gtoc_bdw_cache, and gtoc_bdw_hidden?

The output of profile matmul is as follows:

device: cuda, N: 1024, latency: 0.06 ms, TFLOPS: 68.186
device: cuda, N: 2048, latency: 0.20 ms, TFLOPS: 97.026

device: cpu, N: 1024, latency: 0.89 ms, TFLOPS: 3.488
device: cpu, N: 2048, latency: 8.44 ms, TFLOPS: 2.924

Which of these values correspond to mm_flops_p, mm_flops_g, bmm_flops_p, bmm_flops_g, and cpu_flops? Thanks.
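
To make the comparison concrete, here is a minimal sketch that parses profiler lines of the form quoted above into base units (bytes/s and FLOP/s). The function name `parse_profile` and the regular expressions are hypothetical and based only on the output shown in this issue. The conversion assumes 1 GB = 1e9 bytes, which may not match how the profiler defines it. The sketch deliberately does not assign any value to a cost-model parameter, since that mapping is the open question here.

```python
import re

# Patterns are assumptions derived from the output lines quoted in this issue.
BDW_RE = re.compile(
    r"size:\s*([\d.]+)\s*MB,\s*(gpu-to-cpu|cpu-to-gpu) bandwidth:\s*([\d.]+)\s*GB/s"
)
MM_RE = re.compile(
    r"device:\s*(\w+),\s*N:\s*(\d+),\s*latency:\s*([\d.]+)\s*ms,\s*TFLOPS:\s*([\d.]+)"
)


def parse_profile(text):
    """Return (bandwidth, flops) dictionaries in base units.

    bandwidth maps (direction, size_mb) -> bytes per second
    flops     maps (device, N)          -> floating-point operations per second
    """
    bandwidth = {}
    for size_mb, direction, gb_per_s in BDW_RE.findall(text):
        # Assumption: 1 GB = 1e9 bytes.
        bandwidth[(direction, float(size_mb))] = float(gb_per_s) * 1e9

    flops = {}
    for device, n, _latency_ms, tflops in MM_RE.findall(text):
        flops[(device, int(n))] = float(tflops) * 1e12

    return bandwidth, flops
```

Calling `parse_profile` on the pasted output gives one entry per measured line, keyed by transfer direction and size or by device and matrix size, so each candidate value can be placed next to the cost-model parameter it is meant to fill.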

FMInference / FlexLLMGen
