Added TWeightedColor.Add and TWeightedColor.Subtract methods for w=1 case.
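To illustrate why a dedicated w=1 overload can help, here is a minimal sketch. The record `TWeightedSketch`, its field names and the `$AARRGGBB` layout are assumptions made for illustration, not the library's actual declarations. With a weight of 1 the weight multiplication drops out, so the alpha value can be used directly.

```pascal
program WeightedAddSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TColor32 = Cardinal; // assumed layout: $AARRGGBB

  // Hypothetical accumulator, not the library's real TWeightedColor
  TWeightedSketch = record
    AlphaTot, RTot, GTot, BTot: Int64;
    procedure Reset;
    procedure Add(c: TColor32; w: Integer); overload;
    procedure Add(c: TColor32); overload;  // w = 1 fast path
    procedure Subtract(c: TColor32);       // w = 1 fast path
  end;

procedure TWeightedSketch.Reset;
begin
  AlphaTot := 0; RTot := 0; GTot := 0; BTot := 0;
end;

procedure TWeightedSketch.Add(c: TColor32; w: Integer);
var
  a: Int64;
begin
  a := Int64(w) * Integer(c shr 24); // weight scaled by alpha
  if a = 0 then Exit;
  Inc(AlphaTot, a);
  Inc(RTot, a * Integer((c shr 16) and $FF));
  Inc(GTot, a * Integer((c shr 8) and $FF));
  Inc(BTot, a * Integer(c and $FF));
end;

procedure TWeightedSketch.Add(c: TColor32);
var
  a: Integer;
begin
  a := Integer(c shr 24); // w = 1: no weight multiplication needed
  if a = 0 then Exit;
  Inc(AlphaTot, a);
  Inc(RTot, a * Integer((c shr 16) and $FF));
  Inc(GTot, a * Integer((c shr 8) and $FF));
  Inc(BTot, a * Integer(c and $FF));
end;

procedure TWeightedSketch.Subtract(c: TColor32);
var
  a: Integer;
begin
  a := Integer(c shr 24); // w = 1: mirror of the Add fast path
  if a = 0 then Exit;
  Dec(AlphaTot, a);
  Dec(RTot, a * Integer((c shr 16) and $FF));
  Dec(GTot, a * Integer((c shr 8) and $FF));
  Dec(BTot, a * Integer(c and $FF));
end;

var
  A, B: TWeightedSketch;
begin
  A.Reset;
  B.Reset;
  A.Add($80FF8040, 1); // general path with w = 1
  B.Add($80FF8040);    // fast path
  WriteLn((A.AlphaTot = B.AlphaTot) and (A.RTot = B.RTot) and
          (A.GTot = B.GTot) and (A.BTot = B.BTot));
  B.Subtract($80FF8040); // undoes the fast-path Add
  WriteLn(B.AlphaTot = 0);
end.
```

Both paths are meant to produce the same totals; the fast path only saves the 64-bit multiply by the weight.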
TWeightedColor.GetColor optimizations:
32bit CPUs: Prevent the _lldiv 64bit call if possible
Use a lookup value for "1/fAlphaTot" if fAlphaTot is in the range 1 to 65535
Skip calculation if the color channel is zero
Replace ClampByte(double) with LimitByte(Integer): the value can't become less than zero, so we don't need to check the lower limit.
The color channel calculation is done in the parameter of Round(), allowing the compiler to generate better optimized code (no temporary stack variable has to be used).
Removed the "res: TARGB absolute Result" because it forces the "Result" to be stored on the stack instead of a CPU register, slowing it down. (Register shifts are faster than memory accesses)
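The GetColor items above can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `InvTable`, `InitInvTable` and `AverageChannel` are hypothetical names, and this `LimitByte` is a stand-in that only shows the upper-bound-only check. The caller is assumed to guarantee `AlphaTot > 0` and a non-negative channel total.

```pascal
program ReciprocalLookupSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  // Hypothetical table: InvTable[i] = 1 / i for i in 1..65535
  InvTable: array[1..65535] of Double;

procedure InitInvTable;
var
  i: Integer;
begin
  for i := Low(InvTable) to High(InvTable) do
    InvTable[i] := 1 / i;
end;

// Stand-in for LimitByte: only the upper bound is checked,
// because the caller never passes a negative value.
function LimitByte(Value: Integer): Byte;
begin
  if Value > 255 then
    Result := 255
  else
    Result := Byte(Value);
end;

// Hypothetical helper: average one channel from its weighted total.
// Assumes AlphaTot > 0 and ChannelTot >= 0.
function AverageChannel(ChannelTot, AlphaTot: Int64): Byte;
var
  Inv: Double;
begin
  if ChannelTot = 0 then
  begin
    Result := 0; // skip the calculation for a zero channel
    Exit;
  end;
  if (AlphaTot >= 1) and (AlphaTot <= 65535) then
    Inv := InvTable[Integer(AlphaTot)] // table lookup replaces the division
  else
    Inv := 1 / AlphaTot;               // fallback for large totals
  // The calculation sits directly inside Round(), with no temporary variable
  Result := LimitByte(Round(ChannelTot * Inv));
end;

begin
  InitInvTable;
  WriteLn(AverageChannel(51000, 200));
  WriteLn(AverageChannel(0, 200));
end.
```

A table of 65535 doubles costs about 512 KB, so this trades memory for one less floating-point division per channel when the total is in range.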
Fixed CPUX86 definition for Delphi older than XE2.
According to the official Delphi documentation, CPU386 is also defined for the DCCOSX64 platform, which doesn't match the CPUX86 platform.
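One way such a definition could look is sketched below. It assumes the check keys off `CompilerVersion` (23 corresponds to Delphi XE2) and that CPUX86 is not predefined by older compilers; the actual include-file contents may differ.

```pascal
// Sketch only: define CPUX86 for Delphi versions older than XE2.
{$IFNDEF FPC}
  {$IF CompilerVersion < 23}
    // Restricting this to pre-XE2 compilers avoids treating CPU386
    // as CPUX86 on newer platforms such as DCCOSX64.
    {$IFDEF CPU386}
      {$DEFINE CPUX86}
    {$ENDIF}
  {$IFEND}
{$ENDIF}
```

Guarding on the compiler version rather than on CPU386 alone keeps the define from leaking onto 64-bit targets.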
Thanks again Andreas!
There are a couple of minor tweaks that I'll make after merging (e.g. I'd rename the B variable in TWeightedColor.GetColor so I don't confuse it with the B color channel).