Open inducer opened 9 years ago
So for now we should be counting this as two flops, but the issue of expressing this in loopy and its flop counter is an important one.
On May 22, 2015, at 11:22 AM, Andreas Klöckner notifications@github.com wrote:
@jdsteve2 @rckirby
One issue with FLOP counting is whether we count `a*x + b` as one or two operations. Evaluated as a single operation, this is called a fused multiply-add (FMA). It's often faster, and it has better accuracy than the two operations carried out separately. But precisely because its accuracy is different, compilers generally won't compile the expression as an FMA unless you specify `-cl-fast-relaxed-math`. There is also an `fma` function in OpenCL that allows you to explicitly ask for an FMA on a per-operation basis.
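To make the accuracy difference concrete, here is a minimal sketch in plain Python. It does not call a hardware FMA: it assumes that an FMA behaves like computing `a*x + b` exactly and rounding once, and it emulates that with `fractions.Fraction`. The helper names `unfused` and `fused_emulated` are made up for this illustration.

```python
from fractions import Fraction


def unfused(a: float, x: float, b: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * x + b


def fused_emulated(a: float, x: float, b: float) -> float:
    """Emulate an FMA: form a*x + b exactly, then round to double once.

    Fraction holds each double exactly, so the only rounding happens
    in the final conversion back to float.
    """
    return float(Fraction(a) * Fraction(x) + Fraction(b))


# Inputs chosen so that the exact product is 1 - 2**-60, which rounds to 1.0
# in double precision. The separate multiply therefore loses the small term.
a = 1.0 + 2.0**-30
x = 1.0 - 2.0**-30
b = -1.0

print(unfused(a, x, b))         # 0.0
print(fused_emulated(a, x, b))  # -2**-60, about -8.67e-19
```

The two results differ, which is why a compiler cannot silently swap one form for the other without permission to relax floating-point semantics.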
Since there are enough moving parts, and since loopy doesn't yet do anything to manage FMAs itself, I think we should have a knob on whether FMAs should count as one or two flops in the flop counter.
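A rough sketch of what such a knob could look like, written against a toy expression tree rather than loopy's real data structures. Everything here (`Var`, `BinOp`, `count_flops`, and the `fma_counts_as_one` parameter) is hypothetical and is not loopy's API. It also assumes that every addition with a product as an operand can be fused with exactly one of those products.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """A leaf of the toy expression tree (costs no flops)."""
    name: str


@dataclass(frozen=True)
class BinOp:
    """A binary operation; op is "+" or "*" in this sketch."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Var, BinOp]


def count_flops(expr: Expr, fma_counts_as_one: bool = False) -> int:
    """Count floating-point operations in a toy expression tree.

    With fma_counts_as_one=True, an addition that has a product as one of
    its operands is merged with that product and counted as a single flop.
    """
    if isinstance(expr, Var):
        return 0

    total = (
        1
        + count_flops(expr.left, fma_counts_as_one)
        + count_flops(expr.right, fma_counts_as_one)
    )

    if fma_counts_as_one and expr.op == "+":
        has_product = any(
            isinstance(child, BinOp) and child.op == "*"
            for child in (expr.left, expr.right)
        )
        if has_product:
            # The add and one of its product operands form one FMA.
            total -= 1

    return total


# a*x + b
axpb = BinOp("+", BinOp("*", Var("a"), Var("x")), Var("b"))

print(count_flops(axpb))                          # 2
print(count_flops(axpb, fma_counts_as_one=True))  # 1
```

Defaulting the knob to two flops matches the conservative case where the compiler does not fuse.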
@jdsteve2 Can you remind me of what the status is here?
Not yet implemented; it's on my to-do list.
Thx!