npadmana / DistributedFFT

6 stars 2 forks source link

Why are UPC MFlops lower although the code runs faster? #41

Closed npadmana closed 4 years ago

npadmana commented 5 years ago

33 shows that the UPC benchmark runs faster than our code, however, the MFlops reported is lower. We should make sure that all the codes compute MFlops correctly.

The UPC code is

    mflops = (double)(1.0e-6*((double)NX*NY*NZ) *
                      (14.8157 + 7.19641*log((double)NX*NY*NZ)
                       + (5.23518 + 7.21113*log((double)NX*NY*NZ))*MAX_ITER));
    mflop_rate = mflops/(((double)timer_val(T_TOTAL))/1e6);
    printf("Total running time is %f s\n", ((double)timer_val(T_TOTAL))/1e6);
    printf("NAS FT (BACKEND: %s) (UPC: %s) CLASS (%c) on %d processors TY: %d TZ: %d Mflops: %g MFlop/s MFlop/s/Thread: %g\n", 
            myfft_id_str(idbuffer), fft3d_id_str(idbuffer2), class_id_char, 
            THREADS, TY, TZ, mflop_rate, mflop_rate/THREADS);

Here is the Chapel version

const total_time = timeit.elapsed();
const ntotal_f = (Nx*Ny*Nz):real;
const niter = refChecksums.size;
const mflops = 1.0e-6*ntotal_f *
  (14.8157+7.19641*log(ntotal_f)
   +  (5.23518+7.21113*log(ntotal_f))*niter)
  /total_time;
writef("Elapsed time : %10.4dr\n",timeit.elapsed());
writef("MFLOPS : %10.4dr\n",mflops);
npadmana commented 5 years ago

I believe these are actually the same (the issue was my confusion on what the UPC benchmarks were called), but it would be worth a quick second confirmation. We can then close this....

npadmana commented 5 years ago

@ronawho -- do you mind checking that we're computing the same way?

ronawho commented 5 years ago

These look the same to me, but as an extra sanity check I'd take the 3 versions (Chapel, UPC, and MPI) and hardcode the total time to a few different values and make sure the MFlop/s is the same.

ronawho commented 4 years ago

Based on the runs in https://github.com/npadmana/DistributedFFT/pull/57 and some sanity checks I did with manually setting the total time, I'm pretty confident that we're reporting the same thing as UPC/MPI references.

I think this can be closed.

npadmana commented 4 years ago

Agreed.