Open Quuxplusone opened 8 years ago
Bugzilla Link | PR27894 |
Status | NEW |
Importance | P normal |
Reported by | Sanjay Patel (spatel+llvm@rotateright.com) |
Reported on | 2016-05-26 11:56:20 -0700 |
Last modified on | 2017-12-17 07:39:43 -0800 |
Version | trunk |
Hardware | PC All |
CC | antoshkka@gmail.com, atrick@apple.com, elena.demikhovsky@intel.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, sanjoy@playingwithpointers.com |
Fixed by commit(s) | |
Attachments | |
Blocks | PR35611 |
Blocked by | |
See also | PR27881, PR27899 |
Do you suggest to convert this sequence to fmul? Is it the right thing to do under fast-math?
(In reply to comment #1)
> Do you suggest to convert this sequence to fmul? Is it the right thing to do
> under fast-math?
I think so (although I don't know which pass is responsible for the transform
yet).
For integers, we have to guard against a negative value for 'n':
define i32 @multiply_the_hard_way(i32 %n) {
%cmp = icmp sgt i32 %n, 0
%mul = mul i32 %n, 7
%sel = select i1 %cmp, i32 %mul, i32 0
ret i32 %sel
}
So for floats, it should be similar?
define float @multiply_the_hard_way(i32 %n) {
%cmp = icmp sgt i32 %n, 0
%n_as_float = sitofp i32 %n to float
%mul = fmul float %n_as_float, 7.0
%sel = select i1 %cmp, float %mul, float 0.0
ret i32 %sel
}
(In reply to comment #2)
> ret i32 %sel
Typo: i32 --> float
Note that if 'n' is over ~2^23, we wouldn't have an exact result. I haven't looked at how the add loop would diverge from an fmul in that case. The difference might be judged too large even for fast-math, although at first glance I would expect the multiply to be the more accurate answer!
Induction variable simplification will effectively remove this loop if SCEV
expressions are formed for the floating point operations.
The question is
- is this a desirable optimization in real code
- is this transformation sound
It's not clear to me that fast-math means that floating point operations can be
treated as scaled, infinite precision integer operations, which is effectively
what you're doing. This is way beyond reassociation.
(In reply to comment #5)
> Induction variable simplification will effectively remove this loop if SCEV
> expressions are formed for the floating point operations.
>
> The question is
> - is this a desirable optimization in real code
I think patterns like what we have in bug 27881 are common, so IMO, yes.
> - is this transformation sound
>
> It's not clear to me that fast-math means that floating point operations can
> be treated as scaled, infinite precision integer operations, which is
> effectively what you're doing. This is way beyond reassociation.
Agreed. This definitely pushes the (unspecified) bounds of fast-math. I wonder
if the fact that the SCEV likely produces a more accurate answer than what the
program would produce unoptimized should be taken into consideration.
(In reply to comment #6)
> (In reply to comment #5)
> > Induction variable simplification will effectively remove this loop if SCEV
> > expressions are formed for the floating point operations.
> >
> > The question is
> > - is this a desirable optimization in real code
>
> I think patterns like what we have in bug 27881 are common, so IMO, yes.
>
> > - is this transformation sound
> >
> > It's not clear to me that fast-math means that floating point operations can
> > be treated as scaled, infinite precision integer operations, which is
> > effectively what you're doing. This is way beyond reassociation.
>
> Agreed. This definitely pushes the (unspecified) bounds of fast-math. I
> wonder if the fact that the SCEV likely produces a more accurate answer than
> what the program would produce unoptimized should be taken into
> consideration.
So this is tricky: first, because fast-math already does this sometimes (just
because of reassociation); and second, because there's no sequence of local
transformations the user could do to make the unoptimized program produce the
same result (unlike reassociation).
In the end, I'm okay with performing this transformation under fast-math. I do
believe, however, it depends on how we define it. We probably should try to
define it. Maybe something like this, "-ffast-math enables the compiler to
perform transformations on floating-point calculations that are valid when
treating the floating-point values as mathematical real numbers, but not
semantics preserving when considering the floating-point values' machine
representations. It also enables the compiler to perform transformations
resulting in floating-point calculations computing fewer correct bits than they
would otherwise."