simd-everywhere / simde

Implementations of SIMD instruction sets for systems which don't natively support them.
https://simd-everywhere.github.io/blog/
MIT License

NEON: implement all intrinsics supported by architecture A64-remaining part #1093

Closed yyctw closed 10 months ago

yyctw commented 10 months ago

Hi all, this is Eric from Andes Technology Corporation. This PR is the remaining part of the previous PR and includes the following:

- Implement all poly-related types using uint.
- Implement all functions related to the poly type (with test cases).
- Implement all functions related to the bf16 type (without test cases).
- Add 1035 initial implementations and corresponding test cases in 137 families, which are listed below:

add, aes, bsl, ceq, ceqz, cmla, cmla_rot180, cmla_rot270, cmla_rot90, cnt, combine, copy_lane, crc32, create, cvt, div, dot, dot_lane, dup_lane, dup_n, eor, ext, fmlal, fmlsl, get_high, get_lane, get_low, ld1, ld1_dup, ld1_lane, ld1_x2, ld1_x3, ld1_x4, ld1q_x2, ld1q_x3, ld1q_x4, ld2, ld2_dup, ld2_lane, ld3, ld3_dup, ld3_lane, ld4, ld4_dup, ld4_lane, maxnm, maxnmv, maxv, minnm, minnmv, minv, mmlaq, mul, mull, mull_high, mull_high_lane, mull_high_n, mulx, mulx_lane, mulx_n, mvn, padd, pmax, pmaxnm, pmin, pminnm, qmovun_high, qrdmlah, qrdmlah_lane, qrdmlsh, qrdmlsh_lane, qrdmulh_lane, qshlu_n, qshrun_high_n, qshrun_n, qtbl, qtbx, rax, rbit, recps, recpx, reinterpret, rev16, rev32, rev64, rnd, rnd32x, rnd32z, rnd64x, rnd64z, rnda, rndi, rndm, rndp, rndx, set_lane, sha1, sha256, sha512, shll_high_n, shrn_high_n, shrn_n, sli_n, sm3, sm4, sri_n, st1, st1_lane, st1_x2, st1_x3, st1_x4, st1q_x2, st1q_x3, st1q_x4, st2, st2_lane, st3, st3_lane, st4, st4_lane, subhn_high, sudot_lane, tbl, tbx, trn, trn1, trn2, tst, types, usdot, usdot_lane, uzp, uzp1, uzp2, zip, zip1, zip2
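To illustrate what "poly types implemented using uint" can look like in a portable fallback, here is a minimal sketch of a polynomial (carry-less) multiply on 8-bit lanes, in the spirit of `vmul_p8`. All names (`example_poly8_t`, `example_poly8x8_t`, `example_pmul8`, `example_vmul_p8`) are hypothetical and chosen for this example; this is not the code from the PR, only one plausible way to express the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: a poly8 lane is stored as a plain uint8_t,
 * and a 64-bit vector is modelled as a struct of 8 lanes. */
typedef uint8_t example_poly8_t;
typedef struct { example_poly8_t values[8]; } example_poly8x8_t;

/* Carry-less multiply of two 8-bit polynomials over GF(2).
 * Partial products are combined with XOR instead of addition,
 * and only the low 8 bits of the product are kept. */
static example_poly8_t example_pmul8(example_poly8_t a, example_poly8_t b) {
  uint16_t acc = 0;
  for (int bit = 0; bit < 8; bit++) {
    if ((b >> bit) & 1) {
      acc ^= (uint16_t)((uint16_t)a << bit);
    }
  }
  return (example_poly8_t)(acc & 0xFF);
}

/* Lane-wise polynomial multiply over the whole 8-lane vector. */
static example_poly8x8_t example_vmul_p8(example_poly8x8_t a, example_poly8x8_t b) {
  example_poly8x8_t r;
  for (size_t i = 0; i < 8; i++) {
    r.values[i] = example_pmul8(a.values[i], b.values[i]);
  }
  return r;
}
```

For instance, `example_pmul8(3, 3)` yields 5, because (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2. Backing the poly type with an unsigned integer keeps the fallback plain C, at the cost of the compiler no longer distinguishing poly lanes from ordinary unsigned lanes.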

Thanks for reading; any recommendations are welcome! :tada::tada::tada:

mr-c commented 10 months ago

Some thoughts on the AES code

If need be, we can split out the AES code to a separate PR to keep going here without it.

yyctw commented 10 months ago

Ok, I have removed it.

mr-c commented 10 months ago

Thank you, @yyctw!

MSVC is still unhappy about something: https://ci.appveyor.com/project/nemequ/simde/builds/48387860/job/84vqj1g1r974k4sk#L2557

yyctw commented 10 months ago

It should already be fixed.

mr-c commented 10 months ago

https://ci.appveyor.com/project/nemequ/simde/builds/48389095/job/2ifnp7paf0l7m9t0#L2575 :-/

yyctw commented 10 months ago

https://ci.appveyor.com/project/nemequ/simde/builds/48394475 :-)

yyctw commented 10 months ago

> Review of the first 50 files (thanks!)

Fixed it. Thanks for your review!

yyctw commented 10 months ago

@mr-c All comments have been addressed. Thank you very much for your patience and review!

mr-c commented 10 months ago

Good news: once all the changes I indicated are made, the new tests pass on my mobile phone (-mcpu=cortex-a76.cortex-a55) using GCC 13.2.

yyctw commented 10 months ago

> I had added commit cde4b78; please restore that

Sorry for the overwrite. I don't have a local commit record, so I can't restore your commit. Could you please restore this commit yourself? Thank you.

mr-c commented 10 months ago

TL;DR: SIMDe currently implements 6443 out of 6670 (96.60%) NEON functions. If you don't count bf16 types, it's 6443 / 6466 (99.64%).

!!!

Thank you, @yyctw!

https://github.com/simd-everywhere/implementation-status/commit/80829f290d72b02eed8f912a9211a69cdc55393d
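The coverage figures quoted in the TL;DR can be rechecked with a few lines of C. This is only a sketch of the arithmetic; the helper name `coverage_percent` is hypothetical and is not part of the implementation-status tooling.

```c
/* Percentage of implemented functions out of a given total. */
static double coverage_percent(unsigned implemented, unsigned total) {
  return 100.0 * (double)implemented / (double)total;
}

/* Number of functions excluded when bf16 types are not counted:
 * the difference between the two totals. */
static unsigned excluded_count(unsigned total_all, unsigned total_without_bf16) {
  return total_all - total_without_bf16;
}
```

With the numbers from the comment, `coverage_percent(6443, 6670)` is roughly 96.60 and `coverage_percent(6443, 6466)` is roughly 99.64, while `excluded_count(6670, 6466)` gives 204 bf16-related functions left out of the second total.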

yyctw commented 10 months ago

No problem. Thank you @mr-c for your review!