arduino / ArduinoCore-renesas

Lazy stacking of float context not enabled, slowing down all exception processing. #103

Closed WestfW closed 1 year ago

WestfW commented 1 year ago

Lazy stacking of floating point context is enabled by the LSPACT bit in the FPU->FPCCR register; it causes the floating point registers to be stacked during an exception ONLY if a floating point instruction is used during the exception. Since most exception code does not use floats, saving the floating point context slows things down pretty significantly, for no reason.

Here's a sketch showing the issue:

char spbuffer[100];

void setup() {
  Serial.begin(9600);
  while (!Serial)
    ;

  sprintf(spbuffer, "FPU (Float Processing Unit info @%08lx)\n", (uint32_t)FPU);
  Serial.print(spbuffer);

  sprintf(spbuffer, "  %6s@%08lx=0x%08lx%s\n", "FPCCR", (uint32_t)&FPU->FPCCR,
          FPU->FPCCR, " FP Context Control Register");
  Serial.print(spbuffer);
  if (FPU->FPCCR & FPU_FPCCR_LSPACT_Msk) {
    Serial.print("    (Lazy State Preservation is ON)\n");
  } else {
    Serial.print("    (Lazy State Preservation is OFF)\n");
  }
}

void loop() {}

facchinm commented 1 year ago

Hi Bill, I think you are misunderstanding how the LSPACT bit works. It is a status bit, not an enable bit: basically it gets set when, inside the IRQ, there has not been any FPU usage yet, so the floating point state save is still deferred. The right way to enable lazy stacking is FPU->FPCCR |= FPU_FPCCR_ASPEN_Msk | FPU_FPCCR_LSPEN_Msk; which we already set.

The demonstrator sketch (for the Uno R4 WiFi, since we are printing inside the IRQ) looks like this:

void print_fpu_irq() {
  printf("context: irq \t");
  print_fpu();
}

void print_fpu() {

  printf("FPU (Float Processing Unit info @%08lx)\n", (uint32_t)FPU);
  printf("  %6s@%08lx=0x%08lx%s\n", "FPCCR", (uint32_t)&FPU->FPCCR,
         FPU->FPCCR, " FP Context Control Register");
  if (FPU->FPCCR & FPU_FPCCR_LSPACT_Msk) {
    printf("    (Lazy State Preservation is ON)\n");
  } else {
    printf("    (Lazy State Preservation is OFF)\n");
  }
}

void setup() {
  Serial.begin(9600);
  while (!Serial)
    ;

  attachInterrupt(2, print_fpu_irq, FALLING);
}

void loop() {
  printf("context: user \t");
  print_fpu();
  delay(1000);
}
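
The difference between the enable bits and the LSPACT status bit can be sketched on the host, without any hardware, by treating FPCCR as a plain 32-bit value. This is only an illustration, not code from the core: the bit positions follow the ARMv7-M FPCCR layout (ASPEN bit 31, LSPEN bit 30, LSPACT bit 0), and the constants and function names below are hypothetical stand-ins for the CMSIS FPU_FPCCR_*_Msk macros.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical local stand-ins for the CMSIS FPU_FPCCR_*_Msk macros.
constexpr uint32_t kAspen  = 1u << 31;  // automatic FP state preservation enable
constexpr uint32_t kLspen  = 1u << 30;  // lazy state preservation enable (configuration)
constexpr uint32_t kLspact = 1u << 0;   // lazy state preservation active (status)

// Configuration check: is lazy stacking enabled?
constexpr bool lazy_stacking_enabled(uint32_t fpccr) {
  return (fpccr & (kAspen | kLspen)) == (kAspen | kLspen);
}

// Status check: is a deferred FP state save currently pending?
constexpr bool lazy_save_pending(uint32_t fpccr) {
  return (fpccr & kLspact) != 0;
}

// Example with the FPCCR value reported in this thread, 0xc0000018.
inline void example() {
  assert(lazy_stacking_enabled(0xC0000018u));  // ASPEN and LSPEN are set
  assert(!lazy_save_pending(0xC0000018u));     // LSPACT is clear
}
```

Testing LSPACT answers "is a lazy save pending right now?", while testing ASPEN and LSPEN answers "is lazy stacking configured?", which is the question the original report was asking.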

WestfW commented 1 year ago

You are correct, and lazy stacking does seem to be turned on after all. Here's the corrected sketch, just for completeness.

char spbuffer[100];

void setup() {
  Serial.begin(9600);
  while (!Serial)
    ;

  sprintf(spbuffer, "FPU (Float Processing Unit info @%08lx)\n", (uint32_t)FPU);
  Serial.print(spbuffer);

  sprintf(spbuffer, "  %6s@%08lx=0x%08lx%s\n", "FPCCR", (uint32_t)&FPU->FPCCR,
          FPU->FPCCR, " FP Context Control Register");
  Serial.print(spbuffer);
  if (FPU->FPCCR & FPU_FPCCR_LSPEN_Msk) {
    Serial.print("    (Lazy State Preservation is ON)\n");
  } else {
    Serial.print("    (Lazy State Preservation is OFF)\n");
  }
}

void loop() {}

And its output (the leading 0xC is the important part: it means bits 31 and 30, ASPEN and LSPEN, are both set):

FPU (Float Processing Unit info @e000ef30)
   FPCCR@e000ef34=0xc0000018 FP Context Control Register
    (Lazy State Preservation is ON)
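
For readers who want to see which bits make up 0xc0000018, here is a small host-side decoder. It is a sketch under stated assumptions: the bit names and positions are taken from the ARMv7-M FPCCR description rather than read from the CMSIS headers, and decode_fpccr and kFpccrBits are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FPCCR bit names and positions per the ARMv7-M architecture
// (assumed here, not read from the CMSIS headers).
struct FpccrBit {
  const char* name;
  unsigned pos;
};

constexpr FpccrBit kFpccrBits[] = {
  {"ASPEN", 31}, {"LSPEN", 30}, {"MONRDY", 8}, {"BFRDY", 6}, {"MMRDY", 5},
  {"HFRDY", 4},  {"THREAD", 3}, {"USER", 1},   {"LSPACT", 0},
};

// Returns the names of all set bits, space separated, highest bit first.
std::string decode_fpccr(uint32_t value) {
  std::string out;
  for (const FpccrBit& bit : kFpccrBits) {
    if (value & (1u << bit.pos)) {
      if (!out.empty()) out += ' ';
      out += bit.name;
    }
  }
  return out;
}

// Sanity check against the value printed in this thread.
inline void check_thread_value() {
  assert(decode_fpccr(0xc0000018u) == "ASPEN LSPEN HFRDY THREAD");
}
```

Under these assumptions the reported value decodes to ASPEN, LSPEN, HFRDY and THREAD, with LSPACT clear, which matches the conclusion that lazy stacking is enabled but no deferred save is pending in thread mode.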