Closed wangpc-pp closed 8 months ago
This isn't a good use of a GPR. Reads of static CSRs like vlenb are very fast on competent microarchitectures. Most functions need to read vlenb zero times, with a single read being the next most common case.
Even if it were, who's going to use it? Most Unix environments are already using GP for relaxation, and Android is using GP for the shadow stack.
And yeah, as a read-only constant CSR it's pretty easy to make reading it fast. Even less competent microarchitectures should be able to manage that one. If not, the rest of their vector implementation probably isn't usefully performant either.
Thanks for the comments! I think it's not an issue now.
Note that the majority of those vlenb reads are just the result of missing optimizations in current compilers.
There are a lot of csrr reg, vlenb instructions when RVV is enabled for auto-vectorization (no matter whether VLS or VLA). These csrr instructions are used to calculate scalable vectors' sizes or to spill/reload. One of the simplest ways to solve this problem that I can imagine is to use x3 as a global VLENB that is set during runtime initialization. The program can then just use x3 as VLENB without reading it via csrr instructions. And of course, we would need to add a new kind to Tag_RISCV_x3_reg_usage. Is this feasible? Disclaimer: this may be some kind of abuse of x3.
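For readers unfamiliar with why these reads show up, here is a minimal C sketch of how vlenb feeds into spill-slot sizing. It is only an illustration, not what any compiler actually emits: the helper names (read_vlenb, spill_bytes, frame_bytes) and the FALLBACK_VLENB constant are hypothetical, and the fallback assumes VLEN = 128 bits so that the code also builds on hosts without RVV.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fallback for hosts without RVV: assumes VLEN = 128 bits,
 * i.e. VLENB = 16 bytes. Not taken from the discussion above. */
#define FALLBACK_VLENB 16u

/* Read VLENB, the length of one vector register in bytes. */
static inline size_t read_vlenb(void) {
#if defined(__riscv_vector)
    size_t vlenb;
    /* Deliberately not volatile: vlenb is a read-only constant, so the
     * compiler is free to merge or hoist repeated reads. */
    __asm__("csrr %0, vlenb" : "=r"(vlenb));
    return vlenb;
#else
    return FALLBACK_VLENB;
#endif
}

/* Bytes needed to spill a register group made of `lmul` whole registers. */
static inline size_t spill_bytes(size_t vlenb, unsigned lmul) {
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);
    return vlenb * lmul;
}

/* Total stack space for several spill slots, reading vlenb only once. */
static inline size_t frame_bytes(const unsigned *lmuls, size_t n) {
    size_t vlenb = read_vlenb();
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += spill_bytes(vlenb, lmuls[i]);
    }
    return total;
}
```

The point of frame_bytes is that one read per function is enough for all the size calculations in it, which is the case the replies above describe as the common one.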