rp-rs / rp-hal

A Rust Embedded-HAL for the rp series microcontrollers
https://crates.io/crates/rp2040-hal
Apache License 2.0
1.46k stars 237 forks source link

Interrupts on RISC-V #847

Open thejpster opened 2 months ago

thejpster commented 2 months ago

) Adds an Xh3irq controller driver ) Adds a default MachineExternal handler which uses the Xh3irq to run whichever interrupts are pending ) Adds a bunch of interrupt control stuff to hal::arch ) Fixes the powman_test example to use the new interrupt APIs - it now works on RISC-V and Arm.

Tested powman_test on a Pico 2 in both riscv32imac-unknown-none-elf and thumbv8m.main-none-eabihf.

thejpster commented 2 months ago

We might want to block this until we've written a new interrupt macro that works like the cortex-m-rt macro and does the static mut transform and type-checks the function signature.

jannic commented 2 months ago

We might want to block this until we've written a new interrupt macro that works like the cortex-m-rt macro and does the static mut transform and type-checks the function signature.

As the rp235x is a multi-core processor, and given that https://github.com/rust-embedded/cortex-m/issues/411 indicates that the transformation made by cortex-m-rt is unsound on multi-core systems, I don't think we should just copy it. Is there a good way to make the transform safe on rp235x? Taking a full critical section on each interrupt entry is probably too expensive?

thejpster commented 2 months ago

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen. So perhaps the fix is to design the interrupt API to make that impossible (like a bitmask in a static atomic or the watchdog scratch that records which cpu has which interrupt enabled). Or we could ensure that if you enter an interrupt on Core A, it is masked for the duration on Core B - which is nicer than turning all interrupts off or holding a spin lock.

But given 2040 has the same problem the question is whether we want to fix it here or make a plan and fix it later.

9names commented 2 months ago

The user already unmasks the interrupt, and that operation is unsafe. It is almost certainly a bug if they do so on both cores, and we have had no reports of people doing this accidentally.

Until someone comes has a use-case where having both cores handle an interrupt is beneficial, I think trying to solve this is not a good use of our time.

My opinion is we should document "don't do this" and move on with our lives

jannic commented 2 months ago

But given 2040 has the same problem the question is whether we want to fix it here or make a plan and fix it later.

That's an option, I just wouldn't want to spend time on porting the macro only to remove it again later.

Until someone comes has a use-case where having both cores handle an interrupt is beneficial,

Are there situations where it's not avoidable? Any non maskable interrupts or exceptions?

jannic commented 2 months ago

But to be honest, there's a second thing I dislike about the interrupt macro, so I'm biased.

For me, it's totally non-obvious that the macro annotation changes the code inside the function. So if I see this:

#[interrupt]
fn SOME_INTERRUPT() {
  static mut SOMETHING: Option<Whatever> = None;
  [...]
  *SOMETHING = foo(); 
}

I always feel like the usage of the variable doesn't match the declaration. This was very confusing when I was new to embedded rust, and it still sometimes surprises me.

Therefore I'd prefer something more obvious. Perhaps a macro directly on the static mut?

#[interrupt]
fn SOME_INTERRUPT() {
  let something: &mut Option<Whatever> = interrupt_singleton!( didn't think about what comes here yet :Option<Whatever> );
  [...]
  *something = foo(); 
}

Probably needs an unsafe declaration as well, as using this macro outside an interrupt (or in a re-entrant interrupt) would be unsound. So it's far from a finished proposal, of course. This is only meant as an explanation why I don't like the current interrupt macro.

Back to topic: Personally I'd merge the PR as it is, and not wait for an #[interrupt] macro implementation for RISC-V. The current situation is not perfect, but it's good enough, and it's not clear yet how an improvement would look like.

(That said, if someone ported #[interrupt] I wouldn't oppose that either. While I still don't like it, I do value consistency.)

jannic commented 2 months ago

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen.

BTW, do I understand xh3irq correctly that it already solves this issue? xh3irq::get_next_interrupt() should only return a given interrupt to a single CPU core, even if the interrupt is enabled on both, right?

jannic commented 2 months ago

For the rp2040, we could use separate vector tables for core0 and core1, and let the macro create two distinct functions with their own statics, so even if an interrupt was actually running on both cores at the same time, it would not access the same data. While this would be sound, I don't know if it would be useful: At least the common pattern of initializing the static on the first interrupt would break:

  #[interrupt]
  fn IO_IRQ_BANK0() {
      // The `#[interrupt]` attribute covertly converts this to `&'static mut Option<LedAndButton>`
      static mut LED_AND_BUTTON: Option<LedAndButton> = None;

      // This is one-time lazy initialisation. We steal the variables given to us
      // via `GLOBAL_PINS`.
      if LED_AND_BUTTON.is_none() {
          critical_section::with(|cs| {
              *LED_AND_BUTTON = GLOBAL_PINS.borrow(cs).take();
          });
      }
      [...]

The second core to run this interrupt would always get a None value, and therefore wouldn't be able to do any useful work.

thejpster commented 2 months ago

So I guess the risk is that both CPUs enter the same interrupt handler at the same time. If one peripheral interrupt signal is unmasked on both CPUs, that could happen.

BTW, do I understand xh3irq correctly that it already solves this issue? xh3irq::get_next_interrupt() should only return a given interrupt to a single CPU core, even if the interrupt is enabled on both, right?

I think the UART interrupt is cleared by reading the FIFO which can happen before the ISR ends. So it could re-fire before the ISR ends. Not sure either Interrupt Controller can avoid that.

thejpster commented 2 weeks ago

What do we want to do with this? There's been some changes over in riscv-rt since.

winksaville commented 2 weeks ago

What do we want to do with this? There's been some changes over in riscv-rt since.

I'd like to see interrupts working on risc-v and willing to work on this if you'd like. That said, I'd need some initial guidance as I'm a noob so it will definitely take me some time.

jonathanpallant commented 1 week ago

I rebased and took out any use of the unsound static-mut conversion. Now the interrupts just run in a critical section. Not yet tested on hardware.