Open AjaySingh40 opened 2 months ago
Checking for some info. Also shared on Arm Discord.
For a non-sequential load you would need to use gathers, which SVE supports. This will be less efficient than sequential loads, so it's often faster to transpose an input matrix to avoid the need for gathers.
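To make the gather suggestion concrete, here is a rough sketch that reads one column of a row-major `float` matrix with the ACLE gather intrinsic `svld1_gather_s32index_f32`. The function name `extract_column` and its parameters are made up for illustration, and it assumes the element offsets fit in 32 bits. When the compiler does not target SVE it falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Hypothetical helper: copy column `col` of a row-major rows x cols matrix
 * into out[0 .. rows-1]. Assumes lane offsets (lane * cols) fit in int32_t. */
static void extract_column(const float *a, int32_t rows, int32_t cols,
                           int32_t col, float *out)
{
    assert(col >= 0 && col < cols);
#if defined(__ARM_FEATURE_SVE)
    /* Per-lane element offsets 0, cols, 2*cols, ... (one row apart). */
    svint32_t offsets = svindex_s32(0, cols);
    for (int32_t i = 0; i < rows; i += (int32_t)svcntw()) {
        /* Predicate switches off lanes that would run past the last row. */
        svbool_t pg = svwhilelt_b32(i, rows);
        const float *base = a + (size_t)i * (size_t)cols + (size_t)col;
        /* Strided (non-sequential) load: one element from each row. */
        svfloat32_t v = svld1_gather_s32index_f32(pg, base, offsets);
        /* The gathered column is stored contiguously. */
        svst1_f32(pg, out + i, v);
    }
#else
    /* Scalar fallback with the same result. */
    for (int32_t i = 0; i < rows; ++i)
        out[i] = a[(size_t)i * (size_t)cols + (size_t)col];
#endif
}
```

If the elements you want are already contiguous in memory (a row of a row-major matrix, for instance), a plain `svld1_f32` with the same predicate is enough and no gather is needed.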
The best way of transposing depends on what you are doing - transposing blocks that fit into the cache would be better than transposing a whole matrix that spills out of the cache, for example.
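As an illustration of the blocked approach, this is a minimal out-of-place tiled transpose in plain C. The name `transpose_blocked` and the tile size of 32 are assumptions for the sketch, not tuned values; the right tile size depends on your cache sizes and element type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge; pick it so two tiles fit comfortably in cache. */
enum { TILE = 32 };

/* Transpose a row-major rows x cols matrix `src` into the row-major
 * cols x rows matrix `dst`, one TILE x TILE block at a time. */
static void transpose_blocked(const float *src, float *dst,
                              size_t rows, size_t cols)
{
    assert(src != dst); /* out-of-place only */
    for (size_t i0 = 0; i0 < rows; i0 += TILE) {
        size_t i1 = (i0 + TILE < rows) ? i0 + TILE : rows;
        for (size_t j0 = 0; j0 < cols; j0 += TILE) {
            size_t j1 = (j0 + TILE < cols) ? j0 + TILE : cols;
            /* Within a block, reads and writes both stay in a small
             * region of memory, so they are likely to hit in cache. */
            for (size_t i = i0; i < i1; ++i)
                for (size_t j = j0; j < j1; ++j)
                    dst[j * rows + i] = src[i * cols + j];
        }
    }
}
```

After the transpose, what used to be a strided column access becomes a sequential row access, so ordinary contiguous vector loads can be used on `dst`.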
Hello all, is it possible to access the elements of a matrix row-wise using SVE? If so, please share a link to the docs or help me with it. I couldn't find any documentation on this. Thank you.