Open nizhenliang opened 5 years ago
Most modern hardware architectures use FMA instructions for tensor operations. FMA computes ax+b as one operation. Roughly GMACs = 0.5 * GFLOPs
Thank you, sir! Is the output value directly FLOPs? Do we need to divide it by 2 to get FLOPs?
What does MAC stand for? Multi-Add Calculation?
MAC = Multiply–accumulate operation
I think GFLOPs = 2 * GMACs since, in general, each MAC contains one multiplication and one addition.
Roughly GMACs = 2 * GFLOPs
@sovrasov is there a typo here? I did a little reading and it seems that @snownus has it right. In general a multiply-accumulate is one multiplication and one addition, which can each be floating point operations. So 1 GMAC counts as 2 GFLOPs, meaning GMACs = .5 * GFLOPs (I'm not sure if this is what was already meant).
As for fused multiply-add (FMA) it seems that (if it is supported on a given chip/system) the two FLOPs are indeed computed "in a single step" (see here) or "at once" (see here). But this confuses our conversion. Perhaps in the case of FMA it is more accurate to say 1 GMACs = 1 GFLOPs? Hopefully someone with more expertise than me can clarify!
@chnadell yes, you're right! @snownus also figured it out. I'll edit the first post to avoid any future confusion.
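For anyone landing here later, the conversion agreed on above can be sketched in a few lines of Python. This is only an illustration, not part of ptflops: the helper names `macs_to_flops` and `flops_to_macs` and the 4 GMACs input are made up for the example, and it assumes the convention that one MAC is one multiplication plus one addition.

```python
GIGA = 1e9  # the "G" prefix means 10^9


def macs_to_flops(macs: float) -> float:
    """Convert a MAC count to FLOPs, assuming 1 MAC = 1 multiply + 1 add."""
    return 2.0 * macs


def flops_to_macs(flops: float) -> float:
    """Inverse conversion under the same assumption."""
    return flops / 2.0


# Hypothetical model cost, chosen only for illustration.
gmacs = 4.0
gflops = macs_to_flops(gmacs * GIGA) / GIGA
print(f"{gmacs} GMACs ~= {gflops} GFLOPs")  # 4.0 GMACs ~= 8.0 GFLOPs
```

On hardware with FMA the multiply and the add may execute as one instruction, but the count above is about arithmetic operations, not instructions.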
@sovrasov, in this case would you consider changing the variable flops to mac to avoid confusion?
https://github.com/sovrasov/flops-counter.pytorch/blob/1ad0ed1999620c0170e5854dde39805d30d9b6aa/sample.py#L36
@cassie101 makes sense, I'll change it
I am also confused. Shouldn't we multiply it by 2 to get FLOPs?
@code-by-jin yes, exactly, we should multiply GMACS by 2 to get FLOPS
Thanks for your response. I checked ResNet-50 using your tool. It has around 4 GMACS, which is close to the number of FLOPS claimed in the resnet paper. Now I am confused, do I really need to multiply your output GMACS by two?
In the original ResNet paper, the authors mixed up MACs and FLOPs. As far as I remember, they provided a definition of FLOPs that counts one multiply & add operation as a single FLOP. Please check the paper and correct me if I'm wrong.
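To make the two counting conventions concrete, here is a rough sketch for a single convolution layer. The function `conv2d_macs` is a hypothetical helper written for this example (it is not the ptflops API), it assumes a dense convolution with groups=1 and no bias, and the layer shape (7x7 kernel, 3 input channels, 64 output channels, 112x112 output) is only an illustrative choice.

```python
def conv2d_macs(c_in, c_out, k_h, k_w, h_out, w_out):
    """MACs for a dense 2D convolution (assumes groups=1, bias ignored).

    Each output element needs c_in * k_h * k_w multiply-accumulates,
    and there are c_out * h_out * w_out output elements.
    """
    return h_out * w_out * c_out * c_in * k_h * k_w


# Illustrative layer shape, not taken from any particular paper.
macs = conv2d_macs(c_in=3, c_out=64, k_h=7, k_w=7, h_out=112, w_out=112)

# Convention A: one multiply-add is reported as one "FLOP".
flops_mul_add_as_one = macs
# Convention B: the multiplication and the addition are counted separately.
flops_mul_add_as_two = 2 * macs

print(macs)                  # 118013952
print(flops_mul_add_as_one)  # 118013952
print(flops_mul_add_as_two)  # 236027904
```

So a number reported under convention A matches the MAC count directly, while convention B gives twice that value.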
Hi, I've never seen GMACs written like this before. Does it mean about 10^9 MACs? As far as I know, the capital-letter prefix is used with FLOPS, not with FLOPs and MACs, which I find easy to confuse. Looking forward to your reply, thanks.
GMACs = 2 * GFLOPs, because MACs include both addition and multiplication operations, while GFLOPs only count add operations.
It isn't always true that GMACs = 2 * GFLOPs. For example, two models with the same GMACs may have very different GFLOPs. It depends on how efficiently the model is implemented.
@cmj18 @jerryli1981
No, it should be GFLOPs = 2 * GMACs.
MACs stands for multiply–accumulate operations, each of which performs a <- a + (b x c) (they are counted as one operation).
FLOPs is an abbreviation of floating-point operations, which include mul / add / div, etc. (each is counted separately as a single floating-point operation).
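The two definitions above can be illustrated by counting operations in a plain dot product. This is just a sketch: `dot_with_counts` is a hypothetical helper for this example, and it assumes the accumulator starts at zero so that every element contributes exactly one multiplication and one addition.

```python
def dot_with_counts(xs, ys):
    """Compute a dot product while counting MACs and FLOPs."""
    acc = 0.0
    macs = 0
    flops = 0
    for b, c in zip(xs, ys):
        acc = acc + b * c  # one MAC: a <- a + (b x c)
        macs += 1          # counted as a single multiply-accumulate
        flops += 2         # one multiplication + one addition
    return acc, macs, flops


result, macs, flops = dot_with_counts([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result, macs, flops)  # 32.0 3 6
```

For a vector of length n this gives n MACs and 2n FLOPs, which is where FLOPs = 2 * MACs comes from.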
I want to know whether there is any relation between GOPS (giga operations per second) and GFLOPS. For example, if I know GFLOPS, can I determine GOPS, or are they independent?
GOPS is a characteristic of the hardware; it can only be determined by measurement. ptflops just shows an approximation of the theoretical number of operations required for one forward pass. Time is not considered by ptflops.
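As a rough sketch of how the two quantities relate: an operation count becomes a rate only once it is combined with a measured time. The helper `achieved_gflops` below is hypothetical (not part of ptflops), it assumes 1 MAC = 2 FLOPs, and the 4e9 MACs and 0.5 s figures are invented for illustration, not measurements.

```python
def achieved_gflops(macs_per_forward: float, seconds_per_forward: float) -> float:
    """Achieved throughput in GFLOPS, assuming 1 MAC = 2 FLOPs.

    macs_per_forward: theoretical MAC count for one forward pass.
    seconds_per_forward: wall-clock time you measured yourself.
    """
    flops = 2.0 * macs_per_forward
    return flops / seconds_per_forward / 1e9


# Invented numbers: 4e9 MACs per forward pass, 0.5 s measured per pass.
print(achieved_gflops(4e9, 0.5))  # 16.0
```

The same model on different hardware would give a different rate, since only the measured time changes.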
Why use GMACs? GMACs are different from GFLOPs.