Closed CoTinker closed 1 year ago
I think the issue is that AArch64ISelLowering.cpp converts the maxnm intrinsic to ISD::FMAXNUM
    case Intrinsic::aarch64_neon_fmaxnm:
      return DAG.getNode(ISD::FMAXNUM, SDLoc(N), N->getValueType(0),
                         N->getOperand(1), N->getOperand(2));
    case Intrinsic::aarch64_neon_fminnm:
      return DAG.getNode(ISD::FMINNUM, SDLoc(N), N->getValueType(0),
                         N->getOperand(1), N->getOperand(2));
The ISD::FMAXNUM node is constant folded using this code from APFloat.h:
    inline APFloat maxnum(const APFloat &A, const APFloat &B) {
      if (A.isNaN())
        return B;
      if (B.isNaN())
        return A;
      return A < B ? B : A;
    }
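This does not order -0.0 to be less than 0.0: the comparison -0.0 < 0.0 is false, so when the operands are zeros of opposite sign the function returns A, whichever sign it has.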
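The signed-zero behavior can be reproduced outside LLVM. The following is only a sketch: it uses `double` as a stand-in for `APFloat`, and both function names are hypothetical, not LLVM APIs. The first function mirrors the comparison logic quoted above; the second shows one possible way to order -0.0 below +0.0.

```cpp
#include <cmath>

// Mirrors the quoted maxnum logic, with double standing in for APFloat
// (an assumption made for illustration only).
inline double maxnum_like_apfloat(double a, double b) {
  if (std::isnan(a))
    return b;
  if (std::isnan(b))
    return a;
  // -0.0 < 0.0 is false, so for two zeros this always returns a.
  return a < b ? b : a;
}

// Hypothetical variant that treats -0.0 as smaller than +0.0.
inline double maxnum_signed_zero(double a, double b) {
  if (std::isnan(a))
    return b;
  if (std::isnan(b))
    return a;
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? b : a; // prefer the operand without the sign bit
  return a < b ? b : a;
}
```

With these definitions, `maxnum_like_apfloat(-0.0, 0.0)` yields -0.0 while `maxnum_like_apfloat(0.0, -0.0)` yields +0.0, so the result depends on operand order. The variant returns +0.0 in both cases.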
The implementation of maxnum matches the documentation for llvm.maxnum here: https://llvm.org/docs/LangRef.html#llvm-maxnum-intrinsic
Based on that, I think it was incorrect for AArch64 to convert the intrinsic to ISD::FMAXNUM.
Thank you very much!
But why does maxnum return the right result with the option -O0?
My best guess is that -O0 worked because the operation didn't get constant folded, so the ISD::FMAXNUM node was turned into the fmaxnm instruction, which did the right thing. It's legal to convert ISD::FMAXNUM to the fmaxnm instruction, but not the other way around.
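The one-way legality can be sketched as a small model. Everything here is an assumption for illustration: `double` stands in for `APFloat`, NaN inputs are ignored, the instruction is assumed to order -0.0 below +0.0, and the generic node is assumed to accept either zero when the operands are zeros of opposite sign. None of the names are LLVM APIs.

```cpp
#include <cmath>

// True when x and y are the same value, including the sign of zero.
inline bool same_value(double x, double y) {
  return x == y && std::signbit(x) == std::signbit(y);
}

// Model of the fmaxnm instruction for non-NaN inputs (assumed: -0.0 < +0.0).
inline double fmaxnm_model(double a, double b) {
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? b : a;
  return a < b ? b : a;
}

// Model of what the generic maxnum node permits for non-NaN inputs
// (assumed: either zero is acceptable for zeros of opposite sign).
inline bool maxnum_permits(double a, double b, double r) {
  if (a == 0.0 && b == 0.0)
    return same_value(r, a) || same_value(r, b);
  return same_value(r, a < b ? b : a);
}

// The constant folder quoted earlier, reduced to non-NaN inputs.
inline double folded_maxnum(double a, double b) { return a < b ? b : a; }
```

Under this model the instruction's result is always one the generic node permits, so lowering the node to the instruction is safe. The folder's result for (-0.0, 0.0) is -0.0, which differs from the instruction's +0.0, so replacing the instruction's intrinsic with the foldable node can change the answer.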
demo.c
compile command
clang demo.c -static -fno-caret-diagnostics -march=armv8-a -mfloat-abi=softfp -o demo
right output
But with optimization options such as -O1, -O2, or -O3, the output is wrong:
I guess the noinline attribute, __attribute__((noinline)), does not work: with the compile option -mllvm -opt-bisect-limit=56 the output is right, but with the option -mllvm -opt-bisect-limit=57 the output is wrong.
My clang version: 15.0.4
opt pass