Open jnk0le opened 1 year ago
I found the a similar command line flag in GCC for ARC: -mirq-ctrl-saved=
.
As I don't see why we would have different ISRs with different prestacked register lists, I would prefer a command line switch over a function attribute.
Command line flag does prevent the auxiliary uses as well as e.g. extra shadow registers registers at highest nesting level.
And the custom irq controllers reusing standard return mechanism (return pattern or lsb in ra
as in TEIC) will require yet another attribute to work with this flag.
extra shadow registers registers at highest nesting level
Do you have a link to an MCU datasheet that describes such an ISR behavior (different register banking behavior on ISR entry based on nesting level)?
To avoid copy-paste mistakes and to improve portability you would probably not use the prestacked
attribute directly, but hide it in an macro. E.g.:
#if defined IRQ_FOO
#define MY_ISR_ATTRIBUTES `__attribute__((interrupt, prestacked("x1,x5-x7,x10-x15")))`
#elif defined IRQ_BAR
#define MY_ISR_ATTRIBUTES `__attribute__((interrupt, prestacked("x7,x10-x15")))`
#endif
...
void USBHS_IRQHandler (void) MY_ISR_ATTRIBUTES;
void USBHS_IRQHandler (void)
{
...
}
...and then we already have a global configuration that can be controlled via command line flags (here -DIRQ_FOO
or -DIRQ_BAR
). That would be equivalent to use __attribute((interrupt))
and -mirq-hw-stacked=x1,x5-x7,x10-x15
.
Also, I'm not convinced that full flexibility for all possible HW stacking schemes is really needed, because I believe that features should be derived from available HW.
So I would prefer a list of HW stacking schemes. E.g. -mirq-hw-stacked=[none,all,caller-saved,integer-caller-saved]
.
Do you have a link to an MCU datasheet that describes such an ISR behavior (different register banking behavior on ISR entry based on nesting level)?
ch32v307 has 8 nesting levels with shadow registers only on the 3 lowest levels (should be on highest IMO) https://github.com/openwch/ch32v307/blob/main/Datasheet/README.md
Also c2000 arch has shadow regfile of floating point registers designated for HPIs, but being a typical CISC, it switches it by separate instructions, emmited by compiler interrupt (HPI) attribute. Even though those could be switched automatically.
Also many other archs (like ARM32 FIQ/IRQ and clones) that are stabilised by compiler attributes and central governance.
To avoid copy-paste mistakes and to improve portability you would probably not use the prestacked attribute directly, but hide it in an macro. E.g.:
I think that the vendors could provide those definitions in device specific headers e.g.
#define MYVENDOR_IRQ_FAST __attribute__((interrupt, prestacked("x1,x5-x7,x10-x15")))
#define MYVENDOR_IRQ_REGULAR __attribute__((interrupt, prestacked("x10-x15")))
E.g.
-mirq-hw-stacked=[none,all,caller-saved,integer-caller-saved]
Those will bind any given core to the ABI it was designed for. (i.e. there could appear an IPRA optimized ABI, or just getting rid of tp
- not possible to use because HW stacker was designed for ilp32[e])
Also, note that the shadow registers (and decoding them) are quite expensive and typical DSP code doesn't need that many registers (3p3z irq is like ~6-7 regs total (gcc compiled)). So vendors are likely to seek for balance.
+1 for unified attribute for modeling that, I am tired to maintain __attribute__((interrupt("SiFive-CLIC-preemptible")))
on downsream :P also not interested to maintain many different attribute for those interrupt stuffs, this proposal seem a solution for that :)
Also I would prefer using attribute over command line since some low level run time stuffs may support different HW platform, typically that will support different platform in different files*
, and it would be easier to control that within source file rather than manage those stuffs on the build system/build script level.
*
: One of example is SiFive's FESDK: https://github.com/sifive/freedom-e-sdk, I also vaguely remember many different project using similar scheme.
Do you have a link to an MCU datasheet that describes such an ISR behavior (different register banking behavior on ISR entry based on nesting level)?
ch32v307 has 8 nesting levels with shadow registers only on the 3 lowest levels (should be on highest IMO) https://github.com/openwch/ch32v307/blob/main/Datasheet/README.md
So they have three sets of shadow registers for caller-saved registers. The fourth level of interrupt nesting will not save the registers anymore. However, this has to be handled in SW anyway, because it is the same ISR that is called. And the behavior is identical for all ISRs, so the attribute does not help to solve this aspect.
Also c2000 arch has shadow regfile of floating point registers designated for HPIs, but being a typical CISC, it switches it by separate instructions, emmited by compiler interrupt (HPI) attribute. Even though those could be switched automatically.
Also many other archs (like ARM32 FIQ/IRQ and clones) that are stabilised by compiler attributes and central governance.
Afaik ARM7 has shadow registers, but only one set of them (per exception type). So there is not change in the behavior in case of nesting (registers will simply be overwritten).
To avoid copy-paste mistakes and to improve portability you would probably not use the prestacked attribute directly, but hide it in an macro. E.g.:
I think that the vendors could provide those definitions in device specific headers e.g.
#define MYVENDOR_IRQ_FAST __attribute__((interrupt, prestacked("x1,x5-x7,x10-x15"))) #define MYVENDOR_IRQ_REGULAR __attribute__((interrupt, prestacked("x10-x15")))
E.g.
-mirq-hw-stacked=[none,all,caller-saved,integer-caller-saved]
Those will bind any given core to the ABI it was designed for. (i.e. there could appear an IPRA optimized ABI, or just getting rid of
tp
- not possible to use because HW stacker was designed for ilp32[e])
"there could appear" -> does not exist Hard to discuss a feature, which does not exist in real HW. But I agree that the list would have to be extended if such hardware comes to the market. And using registers instead of named register sets addresses this.
+1 for unified attribute for modeling that, I am tired to maintain
__attribute__((interrupt("SiFive-CLIC-preemptible")))
on downsream :P also not interested to maintain many different attribute for those interrupt stuffs, this proposal seem a solution for that :)Also I would prefer using attribute over command line since some low level run time stuffs may support different HW platform, typically that will support different platform in different files
*
, and it would be easier to control that within source file rather than manage those stuffs on the build system/build script level.
Is there a way to avoid this? Even the mentioned freedom-e-sdk needs this configuration (https://github.com/sifive/freedom-e-sdk/blob/master/bsp/sifive-hifive-unmatched/settings.mk).
In there we have:
I would claim that the ISR HW-ABI would fit there as well.
The reason, why I don't like function attributes is, that they have be set for each ISR, although there is no evidence that we need two ISRs with different attributes. And this could lead again to subtle issues (resulting in forum questions with responses like "check if all your ISRs have the correct function attributes set").
Main issue with CLIC is that in addition to __attribute__((interrupt("SiFive-CLIC-preemptible")))
, the highest nesting level of CLIC can be just a simple __attribute__((interrupt))
. And there should be also yet another attribute (actually two: preemtible and highest nesting) for horizontal nesting (additional use of xscratchcsw
).
Nothing can be done about it.
Do you have a link to an MCU datasheet that describes such an ISR behavior (different register banking behavior on ISR entry based on nesting level)?
ch32v307 has 8 nesting levels with shadow registers only on the 3 lowest levels (should be on highest IMO) https://github.com/openwch/ch32v307/blob/main/Datasheet/README.md
So they have three sets of shadow registers for caller-saved registers. The fourth level of interrupt nesting will not save the registers anymore. However, this has to be handled in SW anyway, because it is the same ISR that is called. And the behavior is identical for all ISRs, so the attribute does not help to solve this aspect.
I don't quite understand, WCH PFIC similarly to NVIC (which appears more frequently in their headers than "PFIC" BTW) allows to configure any numbers of (vectored) handlers to any available nesting level.
So if SPI and I2C handlers are configured to 3rd nesting level, TIM3 and TIM4 are configured at 4th nesting level, the attributes look like this:
__attribute__((interrupt, prestacked("x1, x5-x7,x10-x17, x28-x31"))) void SPI1_IRQHandler();
__attribute__((interrupt, prestacked("x1, x5-x7,x10-x17, x28-x31"))) void I2C1_EV_IRQHandler();
__attribute__((interrupt)) void TIM3_IRQHandler();
__attribute__((interrupt)) void TIM4_IRQHandler();
Afaik ARM7 has shadow registers, but only one set of them (per exception type). So there is not change in the behavior in case of nesting (registers will simply be overwritten).
PIC32 should have more sets (cover all GPRs, amount is implementation specific) https://microchipdeveloper.com/32bit:mz-arch-exceptions-usage#AssignShadowRegisterSets
"there could appear" -> does not exist But I agree that the list would have to be extended if such hardware comes to the market. And using registers instead of named register sets addresses this.
I think that the vendors don't want to find themselves in WCH situation with a new stacker or shadow regs.
Also adding to compiler alias lists means:
The preserving (like in ARM FIQ) shadow registers can be also shared by software e.g. assembly handler reserves some of the shadow regs, and the remaining can be used normally by C handlers (even by different nesting levels if shadow regs are common across them).
Main issue with CLIC is that in addition to
__attribute__((interrupt("SiFive-CLIC-preemptible")))
, the highest nesting level of CLIC can be just a simple__attribute__((interrupt))
. And there should be also yet another attribute (actually two: preemtible and highest nesting) for horizontal nesting (additional use ofxscratchcsw
).Nothing can be done about it.
Do you have a link to an MCU datasheet that describes such an ISR behavior (different register banking behavior on ISR entry based on nesting level)?
ch32v307 has 8 nesting levels with shadow registers only on the 3 lowest levels (should be on highest IMO) https://github.com/openwch/ch32v307/blob/main/Datasheet/README.md
So they have three sets of shadow registers for caller-saved registers. The fourth level of interrupt nesting will not save the registers anymore. However, this has to be handled in SW anyway, because it is the same ISR that is called. And the behavior is identical for all ISRs, so the attribute does not help to solve this aspect.
I don't quite understand, WCH PFIC similarly to NVIC (which appears more frequently in their headers than "PFIC" BTW) allows to configure any numbers of (vectored) handlers to any available nesting level.
So if SPI and I2C handlers are configured to 3rd nesting level, TIM3 and TIM4 are configured at 4th nesting level, the attributes look like this:
__attribute__((interrupt, prestacked("x1, x5-x7,x10-x17, x28-x31"))) void SPI1_IRQHandler(); __attribute__((interrupt, prestacked("x1, x5-x7,x10-x17, x28-x31"))) void I2C1_EV_IRQHandler(); __attribute__((interrupt)) void TIM3_IRQHandler(); __attribute__((interrupt)) void TIM4_IRQHandler();
I have a different understanding of that mechanism. We have:
The HPE mechanism supports to interrupt up to three interrupts until it runs out of shadow registers. The priority matters in this context just to define which interrupts can be interrupted.
So in your example TIM3 and TIM4 can use the same prestacked/HPE attribute. This works fine unless it happens that they are executed as a nested interrupt of depth 4 or higher (e.g. USB -> SPI -> I2C -> TIM3). But this is a dynamic aspect, that you cannot predict in general.
So in your example you would use preventive SW-stacking for IRQs with high priorities. This has the implication, that you will observe the worst IRQ latencies for high priority interrupts (in most cases HPE will save the registers and the ISR itself will save them again), which contradicts the purpose of HPE.
Better solutions would be:
Afaik ARM7 has shadow registers, but only one set of them (per exception type). So there is not change in the behavior in case of nesting (registers will simply be overwritten).
PIC32 should have more sets (cover all GPRs, amount is implementation specific) https://microchipdeveloper.com/32bit:mz-arch-exceptions-usage#AssignShadowRegisterSets
PIC32 defines IPLx[SRSy|SOFT|AUTO]
per interrupt (needs to be configured in the IRQ controller and the compiler needs to know what to do in the ISR proloue/epilogue).
So this is a valid precedent where each ISR can be configured to have a different ABI.
After seeing this, I'm fine with the function attribute approach. Thanks for providing the reference to that doc!
"there could appear" -> does not exist But I agree that the list would have to be extended if such hardware comes to the market. And using registers instead of named register sets addresses this.
I think that the vendors don't want to find themselves in WCH situation with a new stacker or shadow regs.
Also adding to compiler alias lists means:
- no support by "older" compiler builds
- requires blessing from compiler maintainers to get upstreamed, otherwise we are back to "WCH-Interrupt-Fast" situation.
I don't know the backstory of the WCH situation. But yes, as said before, using old toolchains with new devices is a clear benefit.
That said, my concerns are properly addressed and I'm fine with this PR. Thanks for providing the relevant context!
Also, the CLIC has already been extended with hardware stacking. [1] uses background lazy stacking, configurable to psABI and "EABI". Requires new attribute due to new instructions and changed way of handling prologues. [2] is paywalled but according to summary in [1] it just adds a classic HW stacker to CLIC. (no new attribute with prestacked annotation)
[1] https://arxiv.org/pdf/2311.08320 [2] DOI 10.1109/ICICM54364.2021.9660345
Corner case for "IPRA" alt usage: annotation doesn't cover stack passed arguments, which are caller saved (documented on arm and defacto on risc-v)
for further rationale see: https://discourse.llvm.org/t/rfc-prestacked-annotation-to-solve-risc-v-interrupt-stacking-mess/74120