Bugzilla Link | PR27222 |
Status | NEW |
Importance | P normal |
Reported by | Pirama Arumuga Nainar (pirama@google.com) |
Reported on | 2016-04-05 13:25:29 -0700 |
Last modified on | 2020-03-21 08:56:36 -0700 |
Version | trunk |
Hardware | PC Linux |
CC | ahmed@bougacha.org, anton@korobeynikov.info, llvm-bugs@lists.llvm.org, srhines@google.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR23531 |
What would you expect to be generated on ARM32 then? fp16 is a storage-only type there.
Hi Anton, fp16 is indeed a storage-only type, and LLVM already performs operations on fp16 data by promoting it to fp32. For ARM32 with NEON and the 'half' feature, it would be more efficient to use the vector variant of VCVT (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/Bcfjicfj.html) instead of the scalar variant. Doing so would produce code similar to the AArch64 output in my initial comment.
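To make the promote/operate/truncate pattern concrete, here is a rough, portable model of it in Python. It is only a sketch and not ARM code or LLVM output: the function names (`add_scalar`, `add_vector4`, and the helpers) are hypothetical, the standard `struct` module's `e` format stands in for the fp16 storage type, and rounding through the `f` format approximates fp32 arithmetic. The scalar version converts one element at a time, while the four-lane version converts all lanes in a single step, which is loosely what the vector variant of VCVT would allow.

```python
import struct

def half_bits_to_f32(bits: int) -> float:
    """Promote: reinterpret a 16-bit pattern as binary16 and widen it."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def round_to_f32(value: float) -> float:
    """Round a Python float (binary64) to binary32 to mimic fp32 arithmetic."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def f32_to_half_bits(value: float) -> int:
    """Truncate: round to binary16 and return the 16-bit storage pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def add_scalar(a_bits, b_bits):
    """One conversion per element, loosely analogous to scalar VCVT."""
    out = []
    for a, b in zip(a_bits, b_bits):
        total = round_to_f32(half_bits_to_f32(a) + half_bits_to_f32(b))
        out.append(f32_to_half_bits(total))
    return out

def add_vector4(a_bits, b_bits):
    """Convert four lanes per step, loosely analogous to vector VCVT."""
    assert len(a_bits) == len(b_bits) == 4
    a = struct.unpack("<4e", struct.pack("<4H", *a_bits))  # widen 4 lanes
    b = struct.unpack("<4e", struct.pack("<4H", *b_bits))
    sums = [round_to_f32(x + y) for x, y in zip(a, b)]     # fp32 add
    return list(struct.unpack("<4H", struct.pack("<4e", *sums)))  # narrow 4 lanes
```

Both functions should return the same bit patterns for the same inputs; the difference this report is about is only the number of conversion instructions needed, not the numeric result.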
The link I pasted in my previous comment doesn't go directly to the instruction's reference page. See Section 5.44 in http://infocenter.arm.com/help/topic/com.arm.doc.dui0489i/DUI0489I_arm_assembler_reference.pdf.