Open Darshvino opened 1 year ago
@llvm/issue-subscribers-backend-aarch64
vmulq_s16 should be available through arm_neon.h: https://godbolt.org/z/h43cPMfTK.
If you mean an @llvm.aarch64 intrinsic for it, then you can just use a mul instruction directly.
Hi @davemgreen,
Thanks a lot for your reply.
I tried to use this intrinsic: "llvm.aarch64.neon.mul" but I was getting the below error:
AssertionError: llvm.aarch64.neon.mul is not an LLVM intrinsic
I am not sure whether this is related to the LLVM version or to something else.
It would be really helpful if you could help me resolve the above error.
Thanks
I just mean an LLVM mul instruction: %3 = mul <8 x i16> %1, %0. For simple operations where the LLVM instruction is equivalent to the NEON intrinsic, we can just use the instruction directly and get the benefit of LLVM being able to optimize it as it would any other mul.
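As a rough illustration of that equivalence from the C side, here is a minimal sketch using the GCC/Clang vector extension rather than arm_neon.h, so it also builds on non-AArch64 hosts. The names u16x8 and mul_u16x8 are made up for this example. With Clang, a `*` on a vector type like this is expected to lower to a plain `mul <8 x i16>` in the IR, with no target intrinsic involved.

```c
#include <stdint.h>

/* Eight unsigned 16-bit lanes in one 128-bit vector (GCC/Clang vector
   extension). u16x8 is a hypothetical stand-in for uint16x8_t. */
typedef uint16_t u16x8 __attribute__((vector_size(16)));

/* Lane-wise multiply: each lane keeps the low 16 bits of its product.
   mul_u16x8 is a hypothetical helper, not part of any library. */
static inline u16x8 mul_u16x8(u16x8 a, u16x8 b) {
    return a * b;
}
```

Compiling this with clang and -emit-llvm -S should show the mul in the function body.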
Yeah got it @davemgreen,
But can we get something like llvm.aarch64.neon..... for vmulq_u16, instead of using the instruction directly?
I am rather desperately looking for this.
Hi @davemgreen,
Looking forward to your reply.
Thanks
There is no llvm.aarch64.neon equivalent for mul. It shouldn't be needed.
Perhaps taking a step back - what are you trying to do? Are you writing LLVM IR directly (in text form), or generating the instructions through C/C++? Is this for some other frontend? If you are creating instructions with an IRBuilder, you should be able to use CreateMul, for example.
Hi @davemgreen,
Thank you again for your reply.
Actually, I am working with TVM. I am trying to add a custom operator in TVM, and it allows us to define an intrinsic (via the Tensorize schedule) to use instead of leaving LLVM to generate the assembly code directly. Here is one such example: https://github.com/apache/tvm/blob/f7dfef4cdea3a6ca96af7869e4457a4de0525eab/python/tvm/topi/arm_cpu/tensor_intrin.py#L101. I think it only allows intrinsics rather than the instruction directly, but I have created an issue in TVM asking whether we can use the instruction directly: https://github.com/apache/tvm/issues/13850
Hi LLVM team,
I am trying to find the corresponding AArch64 intrinsic for vmulq_u16, but unfortunately I have not been able to find a matching LLVM AArch64 intrinsic for it. It would be really great if anyone could assist me in finding it. Below is the description of vmulq_u16:
uint16x8_t(output type) = vmulq_u16(uint16x8_t a, uint16x8_t b)
which also can be found here: https://developer.arm.com/architectures/instruction-sets/intrinsics/#q=vmulq_u16
Thanks and look forward to your reply!
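For reference, the lane-wise behaviour described above can be written as a scalar loop. This is only a sketch of the semantics as I understand them (each of the eight lanes is multiplied independently and the result is truncated to 16 bits); ref_vmulq_u16 is a hypothetical name, not an existing API.

```c
#include <stdint.h>

/* Hypothetical scalar reference for vmulq_u16:
   out[i] = low 16 bits of a[i] * b[i], for each of the 8 lanes. */
static void ref_vmulq_u16(const uint16_t a[8], const uint16_t b[8],
                          uint16_t out[8]) {
    for (int i = 0; i < 8; ++i) {
        /* Widen to unsigned 32-bit first so the product cannot overflow
           a signed int after integer promotion, then truncate. */
        out[i] = (uint16_t)((uint32_t)a[i] * (uint32_t)b[i]);
    }
}
```

A loop like this can serve as a check when comparing against whatever a tensorized kernel produces.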