This draft PR is a WIP adding Armv7-M support to slothy, with Keccak as a first example optimization target.
The new uArch models for Cortex-M4 and Cortex-M7 are still very incomplete: the Cortex-M4 model does not yet capture how ldr instructions pipeline their address and data phases.
We further add the ability to resolve .if ... .else ... .endif constructs in the assembly source code (also nested).
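To illustrate what resolving nested conditionals involves, here is a minimal standalone sketch. It is not the code in this PR: the function name `resolve_conditionals`, the `symbols` dictionary, and the symbol names in the usage example are hypothetical. It assumes conditions are simple integer expressions that Python can evaluate, with no trailing comments, and it only handles plain .if / .else / .endif (directives such as .ifdef or .elseif pass through untouched).

```python
import re

# Matches only the plain directives; ".ifdef", ".elseif" etc. do not match.
_DIRECTIVE = re.compile(r"^\s*\.(if|else|endif)\b\s*(.*)$")


def resolve_conditionals(lines, symbols):
    """Keep only the lines whose enclosing .if/.else branches are active.

    `symbols` maps symbol names to integers (hypothetical interface).
    """
    out = []
    stack = []      # one frame per open .if: (parent_active, cond, seen_else)
    active = True   # are we currently inside an active branch?
    for lineno, line in enumerate(lines, 1):
        m = _DIRECTIVE.match(line)
        if m is None:
            if active:
                out.append(line)
            continue
        kind, expr = m.group(1), m.group(2).strip()
        if kind == "if":
            # Assumption: the expression is Python-compatible (e.g. "A == 1").
            # It is only evaluated when the enclosing branch is active.
            cond = active and bool(eval(expr, {"__builtins__": {}}, dict(symbols)))
            stack.append((active, cond, False))
            active = cond
        elif kind == "else":
            if not stack or stack[-1][2]:
                raise ValueError(f"line {lineno}: .else without matching .if")
            parent, cond, _ = stack.pop()
            stack.append((parent, cond, True))
            active = parent and not cond
        else:  # endif
            if not stack:
                raise ValueError(f"line {lineno}: .endif without matching .if")
            parent, _, _ = stack.pop()
            active = parent
    if stack:
        raise ValueError("unterminated .if")
    return out


example = [
    ".if KECCAK == 1",
    "  eor r0, r0, r1",
    "  .if UNROLL",
    "    ror r2, r2, #3",
    "  .else",
    "    ror r2, r2, #5",
    "  .endif",
    ".else",
    "  mov r0, #0",
    ".endif",
    "bx lr",
]
resolved = resolve_conditionals(example, {"KECCAK": 1, "UNROLL": 0})
```

The stack keeps one frame per open .if, so an inner block can only become active when its parent is active; this is what makes nesting work without recursion.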
Open ToDos:
[ ] Refine uArch models for M4 and M7
[ ] Clean up and extend Arch model
[ ] Add notion of back-to-back instructions instead of only instruction pairs by data dependency
[ ] Add Kyber and Dilithium NTTs as a further example
[ ] Feature idea: Automatically detect unnecessary spills to the stack ("between str and ldr, there are free registers available")
[ ] Optional: Automatic selection between the 16-bit and 32-bit variants of an instruction on Cortex-M7 (the 32-bit variant is subject to a constraint that adds latency as per the M85 SWOG, while the 16-bit variant may rule out certain register renaming opportunities).
Further ideas are welcome!