cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0
1.07k stars 4.28k forks source link

Follow up to the Alpaka Implementation of PFClusterProducer #43501

Open fwyzard opened 9 months ago

fwyzard commented 9 months ago

The goal of this issue is to collect feedback and action items on the Alpaka Implementation of PFClusterProducer, to be followed up after the integration of #43130.

elements_in_block_with_stride

We should implement a helper function elements_in_block_with_stride(acc, extent) to make all the threads in a group loop and cover extent elements, with a stride equal to the group size.

Then, update the loop in TopoClusterContraction to use elements_in_block_with_stride instead of

      for (int rhIdx = alpaka::getIdx<alpaka::Block, alpaka::Threads>(acc)[0u]; rhIdx < nRH;
           rhIdx += alpaka::getWorkDiv<alpaka::Block, alpaka::Threads>(acc)[0u]) {

single precision literals

The use of double precision literals (e.g. 0.5 in expf(-0.5 * value)) force the compiler to convert the operands from single to double precision, compute the result in double precision, and convert them to back single precision. Given the cost of double precision operations on the "small" GPUs like NVIDIA T4 or Intel Flex, these conversions and temporary operations in double precision should be avoided, by explicitly marking the floating point literals as single precision:

expf(-0.5f * value);

avoid square roots where possible

Given

float cut = ...;
float value2 = ...;
float value = sqrtf(value2);
if (value > cut) { ...}

we should avoid the square root and use the square of cut:

float cut = ...;
float value2 = ...;
float cut2 = cut * cut;
if (value2 > cut2) { ...}
cmsbuild commented 9 months ago

A new Issue was created by @fwyzard Andrea Bocci.

@Dr15Jones, @makortel, @sextonkennedy, @rappoccio, @antoniovilela, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

fwyzard commented 9 months ago

type pf

fwyzard commented 9 months ago

assign heterogeneous

cmsbuild commented 9 months ago

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

jsamudio commented 9 months ago

I opened #43574 to address reading the thresholds from GT. Is there any changes in particular you would want bundled in that particular PR?

fwyzard commented 9 months ago

I think you have already address the second and third point (using single precision literals and avoid square roots) - if not, you could do that.

For the first point we still need to provide the helper function.

jsamudio commented 9 months ago

I believe avoiding the square roots was implemented in the original PR, and I just updated expf functions in the latest commit for #43574. So probably at this point everything but the first point is addressed.