LLNL / axom

CS infrastructure components for HPC applications
BSD 3-Clause "New" or "Revised" License

Add helper method for efficiently combining logical operations #582

Open publixsubfan opened 3 years ago

publixsubfan commented 3 years ago

In #577, it was discovered that using bitwise AND to combine the results of single-dimension checks was considerably faster than using logical AND on the GPU:

bool status = true;
for (int idim = 0; idim < NDIMS; idim++)
{
    // Option 1: logical AND short-circuits, which generates predicated
    // branches (slower on GPU, faster on CPU)
    status = status && detail::intersect(...);
    // Option 2: bitwise AND always evaluates both operands (faster on GPU)
    status = status & detail::intersect(...);
}

Conversely, the logical AND seems to be faster than the bitwise AND on the CPU.

We should create a helper macro/method that can pick between bitwise and logical AND depending on whether the code is being compiled for the CPU or the GPU.
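One possible shape for such a helper is sketched below. It is only an illustration and not code from Axom: the names `SKETCH_DEVICE_CODE`, `SKETCH_HOST_DEVICE`, `combine_and`, and `all_in_range` are hypothetical, and the range check stands in for `detail::intersect(...)`. The sketch assumes that `__CUDA_ARCH__` (CUDA) or `__HIP_DEVICE_COMPILE__` (HIP) is defined only during the device compilation pass. The right-hand side is passed as a callable so that the host path can still short-circuit.

```cpp
#include <cassert>

// All names below (SKETCH_*, combine_and, all_in_range) are hypothetical
// and are not part of Axom's API.

// Assumption: these macros are defined only while compiling device code.
#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
  #define SKETCH_DEVICE_CODE 1
#else
  #define SKETCH_DEVICE_CODE 0
#endif

// Decorate functions for both host and device when a GPU compiler is in use.
#if defined(__CUDACC__) || defined(__HIPCC__)
  #define SKETCH_HOST_DEVICE __host__ __device__
#else
  #define SKETCH_HOST_DEVICE
#endif

// Combines lhs with the result of calling rhs().
// Device: always evaluates rhs() and uses bitwise AND (no short-circuit branch).
// Host: uses logical AND, so rhs() is skipped when lhs is already false.
template <typename Pred>
SKETCH_HOST_DEVICE inline bool combine_and(bool lhs, Pred&& rhs)
{
#if SKETCH_DEVICE_CODE
  const bool r = rhs();
  return lhs & r;
#else
  return lhs && rhs();
#endif
}

// Stand-in for the per-dimension loop from the issue: true when every
// coordinate of x lies in [lo, hi].
template <int NDIMS>
SKETCH_HOST_DEVICE inline bool all_in_range(const double (&x)[NDIMS],
                                            double lo,
                                            double hi)
{
  bool status = true;
  for (int idim = 0; idim < NDIMS; idim++)
  {
    const double v = x[idim];
    status = combine_and(status, [=] { return v >= lo && v <= hi; });
  }
  return status;
}

// Host-side usage example.
inline void example_usage()
{
  const double inside[3] = {0.5, 0.5, 0.5};
  const double outside[3] = {0.5, 2.0, 0.5};
  assert(all_in_range(inside, 0.0, 1.0));
  assert(!all_in_range(outside, 0.0, 1.0));
}
```

Because the device path always calls `rhs()`, a helper like this is only appropriate when the per-dimension check has no side effects that the caller relies on being skipped.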

kennyweiss commented 2 years ago

There are open questions about how to best deal with this.

kennyweiss commented 2 years ago

Perhaps add this to a "benchmarks" suite to track performance across platforms?
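As a rough idea of what a host-side entry in such a suite could look like, here is a minimal timing sketch using only the C++ standard library. It is not taken from Axom: `in_unit_interval`, `count_logical`, `count_bitwise`, and `time_ms` are hypothetical names, and the interval test is a stand-in for `detail::intersect(...)`. A GPU variant would need the project's own execution and timing facilities.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-dimension check.
inline bool in_unit_interval(double v) { return v >= 0.0 && v <= 1.0; }

// Counts 3D points (packed as x,y,z triples) whose coordinates all pass,
// combining the per-dimension results with logical AND.
inline std::size_t count_logical(const std::vector<double>& xyz)
{
  std::size_t hits = 0;
  for (std::size_t i = 0; i + 2 < xyz.size(); i += 3)
  {
    bool status = true;
    for (int d = 0; d < 3; d++)
    {
      status = status && in_unit_interval(xyz[i + d]);
    }
    hits += status ? 1 : 0;
  }
  return hits;
}

// Same count, combining the per-dimension results with bitwise AND.
inline std::size_t count_bitwise(const std::vector<double>& xyz)
{
  std::size_t hits = 0;
  for (std::size_t i = 0; i + 2 < xyz.size(); i += 3)
  {
    bool status = true;
    for (int d = 0; d < 3; d++)
    {
      status = status & in_unit_interval(xyz[i + d]);
    }
    hits += status ? 1 : 0;
  }
  return hits;
}

// Runs f once, stores its return value in result, and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f, std::size_t& result)
{
  const auto t0 = std::chrono::steady_clock::now();
  result = f();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

A real benchmark would repeat each measurement many times on a large input and compare the two variants per platform; the single-shot timer above only shows the structure.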

kennyweiss commented 3 months ago

See the changes from #1171, specifically https://github.com/LLNL/axom/pull/1171/files#diff-14a178f0280c77d0aaf9bb588a55616dcf7fd84651c3bb7584bd524dd70f3ae5