esp-rs / esp-idf-svc

Type-Safe Rust Wrappers for various ESP-IDF services (WiFi, Network, Httpd, Logging, etc.)
https://docs.esp-rs.org/esp-idf-svc/
Apache License 2.0
287 stars 161 forks source link

Wifi interrupts delaying RMT code #373

Open DavidVentura opened 4 months ago

DavidVentura commented 4 months ago

I've been having issues running a task that uses the RMT peripheral -- everything was fine until I enabled Wifi, now I'm seeing interrupts.

I've prepared a minimal example which just measures elapsed time between two operations - when an interrupt fires in this interval the measurement shows >3us.

Minimal reproducer:

fn main() -> ! {
    esp_idf_sys::link_patches();

    let peripherals = Peripherals::take().unwrap();
    let modem = peripherals.modem;
    let _w = wifi::configure(SSID, PASS, modem).expect("Could not configure wifi");
    ThreadSpawnConfiguration {
        pin_to_core: Some(Core::Core0),
        ..Default::default()
    }
    .set()
    .expect("Cannot set thread spawn config");
    std::thread::spawn(move || loop {
        let a = Instant::now();
        let delta = a.elapsed();
        if delta > Duration::from_micros(2) {
            println!("Delayed by {:?}", delta);
        }
        sleep(Duration::from_millis(10));
    });

    loop {
        sleep(Duration::from_millis(100));
    }
}

on Core 0, I'm getting ~93 interruptions per second (this is expected).

Feb 24 18:18:06 Delayed by 21µs
Feb 24 18:18:06 I (570) wifi:wifi driver task: 3ffc0aac, prio:23, stack:6656, core=0
Feb 24 18:18:06 I (580) wifi:wifi firmware version: 91b9630
Feb 24 18:18:06 I (580) wifi:wifi certification version: v7.0
Feb 24 18:18:06 I (580) wifi:config NVS flash: enabled 
Feb 24 18:18:06 I (580) wifi:config nano formating: disabled
Feb 24 18:18:06 I (590) wifi:Init data frame dynamic rx buffer num: 32
Feb 24 18:18:06 I (590) wifi:Init static rx mgmt buffer num: 10
Feb 24 18:18:06 I (600) wifi:Init management short buffer num: 32
Feb 24 18:18:06 I (600) wifi:Init dynamic tx buffer num: 32,
Feb 24 18:18:06 I (600) wifi:Init static rx buffer size: 1600
Feb 24 18:18:06 I (610) wifi:Init static rx buffer num: 10
Feb 24 18:18:06 I (610) wifi:Init dynamic rx buffer num: 32
Feb 24 18:18:06 Delayed by 23µs
Feb 24 18:18:06 I (620) wifi_init: rx ba win: 6
Feb 24 18:18:06 I (620) wifi_init: tcpip mbox: 32
Feb 24 18:18:06 I (620) wifi_init: udp mbox: 6
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 I (630) wifi_init: tcp mbox: 6
Feb 24 18:18:06 I (630) wifi_init: tcp tx win: 5744
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 I (640) wifi_init: tcp rx win: 5744
Feb 24 18:18:06 I (640) wifi_init: tcp mss: 1440
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 I (650) wifi_init: WiFi IRAM OP enabled
Feb 24 18:18:06 I (650) wifi_init: WiFi RX IRAM OP enabled
Feb 24 18:18:06 I (660) phy_init: phy_version 4780,16b31a7,Sep 22 2023,20:42:16
Feb 24 18:18:06 I (740) wifi:mode : sta (24:6f:28:79:3a:0c)
Feb 24 18:18:06 I (740) wifi:enable tsf
Feb 24 18:18:06 Delayed by 37µs
Feb 24 18:18:06 Delayed by 12µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 7µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 3µs
Feb 24 18:18:06 Delayed by 5µs

Changing that code to use Core1, I'd expect 0, but instead I'm getting 1/2 per second: (note the timestamps)

Feb 24 18:21:42 I (3150) wifi:new:<6,0>, old:<1,0>, ap:<255,255>, sta:<6,0>, prof:1
Feb 24 18:21:42 I (3150) wifi:state: init -> auth (b0) 
Feb 24 18:21:42 I (3160) wifi:state: auth -> assoc (0) 
Feb 24 18:21:42 I (3170) wifi:state: assoc -> run (10) 
Feb 24 18:21:42 I (3210) wifi:connected with cuevita, aid = 3, channel 6, BW20, bssid = f4:92:bf:c4:fa:ce
Feb 24 18:21:42 I (3210) wifi:security: WPA2-PSK, phy: bgn, rssi: -64
Feb 24 18:21:42 I (3210) wifi:pm start, type: 1
Feb 24 18:21:42 
Feb 24 18:21:42 I (3220) wifi:<ba-add>idx:0 (ifx:0, f4:92:bf:c4:fa:ce), tid:0, ssn:0, winSize:64
Feb 24 18:21:42 Delayed by 12µs
Feb 24 18:21:42 Delayed by 3µs
Feb 24 18:21:42 I (3250) wifi:AP's beacon interval = 102400 us, DTIM period = 1
Feb 24 18:21:42 Delayed by 3µs
Feb 24 18:21:43 Delayed by 3µs
Feb 24 18:21:43 Delayed by 6µs
Feb 24 18:21:43 Delayed by 3µs
Feb 24 18:21:43 Delayed by 7µs
Feb 24 18:21:44 Delayed by 9µs
Feb 24 18:21:44 I (5210) esp_netif_handlers: sta ip: 192.168.2.158, mask: 255.255.255.0, gw: 192.168.2.1
Feb 24 18:21:44 IP info: IpInfo { ip: 192.168.2.158, subnet: Subnet { gateway: 192.168.2.1, mask: Mask(24) },
 dns: Some(192.168.2.1), secondary_dns: Some(0.0.0.0) }
Feb 24 18:21:44 Wifi configured
Feb 24 18:21:44 Delayed by 3µs
Feb 24 18:21:44 HTTP server up
Feb 24 18:21:44 Delayed by 4µs
Feb 24 18:21:46 Delayed by 3µs
Feb 24 18:21:47 Delayed by 6µsru
Feb 24 18:21:50 Delayed by 3µs
Feb 24 18:21:50 Delayed by 3µs
Feb 24 18:21:50 Delayed by 3µs
Feb 24 18:21:52 Delayed by 3µs
Feb 24 18:21:52 Delayed by 3µs
Feb 24 18:21:53 Delayed by 3µs
Feb 24 18:21:53 Delayed by 3µs
Feb 24 18:21:54 Delayed by 3µs
Feb 24 18:21:54 Delayed by 3µs
Feb 24 18:21:55 Delayed by 3µs
Feb 24 18:21:55 Delayed by 6µs

I've tried to use xTaskCreatePinnedToCore directly, and there's no real difference (same interrupts)

          xTaskCreatePinnedToCore(                                                                           
              Some(f),                                                                                       
              n.as_ptr(),                                                                                    
              10000,                                                                                         
              std::ptr::null_mut(),                                                                          
              24,                                                                                            
              std::ptr::null_mut(),                                                                          
              1, // core id                                                                                  
          );   

I've also tried guarding the measurements with a critical section, but it also did not help:

      loop {                                                                                                 
          let a = Instant::now(); : Instant                                                                  
          {                                                                                                  
              let _c = CS.enter(); : CriticalSectionGuard<'_>                                                
              let delta = a.elapsed(); : Duration                                                            
              if delta > Duration::from_micros(2) {                                                          
                  println!("Delayed by {:?}", delta);                                                        
              }                                                                                              
          }                                                                                                  
          sleep(Duration::from_millis(10)); dur:                                                             
      } 

what can I do to ensure the code running on code 0 is not interrupted by the Wifi interrupts?

Vollbrecht commented 4 months ago

Are you aware of SEGGER sysview? If you add CONFIG_APPTRACE_SV_ENABLE=y in the sdkconfig.default, and connect openocd via jtag to the esp, or using uart, you can pipe that output into the sysview application and see what is happening on the esp. To select the cpu to trace you can set either CONFIG_APPTRACE_SV_DEST_CPU_0=y or CPU_1. For more look here and here.

Also probably best to not use Duration since its a "relative" expansive operation for perf checking. You also may want to look at low level C based calls like -> this though we currently not exporting the esp_hw_support component in esp-idf-sys, though its simple to add.

DavidVentura commented 4 months ago

I was not aware of sysview, this would help debugging :grin: sadly I'm unable to get my sdkconfig.defaults file picked up when using cargo build. I also am not aware of how to use the equivalent of idf.py menuconfig in a 'cargo-first' project

On the Duration - I know it's not the cheapest, but it's clear when seeing a 3us pause that that's too much for a clock measurement.

Vollbrecht commented 4 months ago

make sure you read this as the sdkconfig is definitely working, you may just need to adjust your setup according to the linked docs

DavidVentura commented 4 months ago

that worked for sdkconfig. I'll try and debug what's causing the interrupts with apptrace. To be clear, the issue happens on the mininal example where there is no user code on Core1

DavidVentura commented 4 months ago

I couldn't get apptrace to give me anything at all, I tried a few things blindly and didn't get an improvement;

  1. enabled
    CONFIG_ESP_WIFI_EXTRA_IRAM_OPT=y
    CONFIG_ESP_WIFI_RX_IRAM_OPT=y  
    CONFIG_ESP_WIFI_IRAM_OPT=y      
    CONFIG_RMT_ISR_IRAM_SAFE=y
  2. moved my function to the iram1 llink section
  3. disabled the task watchdog, expecting to get rid of the tick interrupt per this comment but I'm not sure that's what it was referring to
DavidVentura commented 4 months ago

Apparently, the RMT driver stores its current affinity when created, and will later spawn a thread (?) based on that affinity; the current ThreadSpawnConfiguration is not taken into account for that configuration.

Creating the RMT driver on Core1 solves the issue; though it would be nice if this was documented (at least I couldn't find it)