ksco closed this 2 months ago
Related to this PR:
I think there are certain limitations in the current vector infrastructure, mainly for opcodes that are essentially element-width-agnostic.
For example, `MOVDQA` can be implemented with `vle8.v`, `vle16.v`, `vle32.v`, or `vle64.v`. It would be good to know the predecessor's element width at compile time, so we can choose the best one.
Previously I implemented `MOVDQA` with "load whole vector", but that unfortunately only works on vlen=128.
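To make the idea concrete, here is a minimal sketch of tracking the predecessor's element width at compile time. It is not the project's actual code: `vec_state_t`, `vle_for_sew`, and `pick_agnostic_load` are hypothetical names, and the sketch assumes that whoever last set the SEW also set `vl` to 128/SEW.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time state: the SEW (in bits) that the previously
   translated opcode left in vtype, or 0 when it is unknown. */
typedef struct {
    int cur_sew;
} vec_state_t;

/* Map a SEW to the matching unit-stride load mnemonic, NULL if invalid. */
static const char *vle_for_sew(int sew) {
    switch (sew) {
    case 8:  return "vle8.v";
    case 16: return "vle16.v";
    case 32: return "vle32.v";
    case 64: return "vle64.v";
    default: return NULL;
    }
}

/* Pick a load for an element-width-agnostic opcode such as MOVDQA.
   If the predecessor's SEW is known, reuse it so vtype stays untouched.
   Otherwise fall back to fallback_sew and report that vtype must be set. */
static const char *pick_agnostic_load(vec_state_t *st, int fallback_sew,
                                      int *needs_vsetvli) {
    assert(vle_for_sew(fallback_sew) != NULL);
    const char *insn = vle_for_sew(st->cur_sew);
    if (insn != NULL) {
        *needs_vsetvli = 0;
        return insn;
    }
    st->cur_sew = fallback_sew;
    *needs_vsetvli = 1;
    return vle_for_sew(fallback_sew);
}
```

With this shape, a `MOVDQA` that follows a 32-bit opcode would get `vle32.v` with no extra vtype change, while one with an unknown predecessor pays for a single vtype update.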
You can use a mechanism similar to the float/i64/double one of Float, but SSE handling is quite different, so another mechanism should be used.
For vlen>128, we cannot use the whole-vector load/store instructions! Sorry for this mistake. This PR replaces all the whole-vector loads/stores with the normal ones, which requires `vtype` to be set correctly using `SET_ELEMENT_WIDTH()`.
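The arithmetic behind this can be sketched as follows. A whole-register load ignores `vl` and `vtype` and always moves VLEN/8 bytes, while a normal unit-stride load moves `vl` elements of SEW bits each. The helper names below are hypothetical and only illustrate the byte counts; they do not model `SET_ELEMENT_WIDTH()` itself.

```c
#include <assert.h>

enum { XMM_BITS = 128 };  /* width of an SSE register */

/* Bytes moved by a whole-register load of one vector register:
   always VLEN/8, independent of vl and vtype. */
static int whole_reg_load_bytes(int vlen_bits) {
    return vlen_bits / 8;
}

/* vl needed so a unit-stride load at the given SEW covers exactly 128 bits. */
static int vl_for_xmm(int sew_bits) {
    assert(sew_bits == 8 || sew_bits == 16 || sew_bits == 32 || sew_bits == 64);
    return XMM_BITS / sew_bits;
}

/* Bytes moved by vle<sew>.v when vl is set as above. */
static int vle_bytes(int sew_bits) {
    return vl_for_xmm(sew_bits) * sew_bits / 8;
}
```

On vlen=128 both approaches move 16 bytes. On vlen=256 the whole-register form moves 32 bytes and reads past the 16-byte operand, whereas the normal load still moves 16 bytes for any SEW as long as `vl` and `vtype` match.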