cholla-hydro / cholla

A GPU-based hydro code
https://github.com/cholla-hydro/cholla/wiki
MIT License
60 stars 32 forks source link

Fixing `clang-analyzer-core.UndefinedBinaryOperatorResult` with templates yields surprising performance improvement #308

Open bcaddy opened 1 year ago

bcaddy commented 1 year ago

I was working on clang-tidy checks during the hack session today and ran into something interesting I think we should discuss. The clang-analyzer-core.UndefinedBinaryOperatorResult check checks for cases where a variable might be undefined when it's used and so structures like this tend to trigger it:

  int o1, o2, o3;
  switch (dir) {
    case 0:
      o1 = grid_enum::momentum_x;
      o2 = grid_enum::momentum_y;
      o3 = grid_enum::momentum_z;
      break;
    case 1:
      o1 = grid_enum::momentum_y;
      o2 = grid_enum::momentum_z;
      o3 = grid_enum::momentum_x;
      break;
    case 2:
      o1 = grid_enum::momentum_z;
      o2 = grid_enum::momentum_x;
      o3 = grid_enum::momentum_y;
      break;
  }

// the check flags the usage of o1 on this line
velocity_x = dev_conserved[o1 * n_cells + id];

It is technically correct to flag this since o1 could be uninitialized if dir is not 0-2 even though we know it always will be in that range. An easy way to fix this is if all the kernels with a structure like this take dir as a template argument instead of a regular argument, thus allowing the switch statement to be determined at compile time.

While I don't think it's worth doing that just for this check I tried it out on a single kernel (the new PPMC_VL kernel) since templates can improve the performance of control statements by evaluating them at compile time. While I didn't expect it to help it actually did help. It reduced the execution time by ~4ms per time step, a ~2% improvement. If these results hold when applied to other kernels that could be an easy 10% performance improvement for very little effort. What do you all think?

bcaddy commented 1 year ago

@evaneschneider, @alwinm, I'm especially interested in your takes on this