espressif / esp-matter

Espressif's SDK for Matter
Apache License 2.0

using esp_matter_controller for commissioning causes stack overflows (CON-1395) #1133

Open pavel808 opened 1 month ago

pavel808 commented 1 month ago

I am using the esp_matter_controller in my application, hoping to commission and control devices.

My application is running on an ESP32-S3 on a Thread Border Router device.

Whenever I call esp_matter::controller::pairing_code_thread from outside the CHIP task context, I get a stack overflow. I started by calling it within an HTTP POST handler. I then moved the call into a separate thread, as follows:

bool to_commission = false;
// pairing_code defined elsewhere as a std::string

static void *matter_ctrl_func(void * arg)
{
    ESP_LOGI(TAGD, "matter_ctrl_func running");

    while (true)
    {
        if (to_commission == true)
        {
            ESP_LOGI(TAGD, "---------------ABOUT TO COMMISSION DEVICE---------------------------------");

            esp_matter::lock::chip_stack_lock(portMAX_DELAY);

            esp_err_t err = esp_matter::controller::pairing_code_thread(0x6F5, pairing_code.c_str(), thread_dataset, sizeof(thread_dataset));

            if (err != ESP_OK)
            {
                ESP_LOGE(TAGD, "ERROR! Unable to commission device : %d", err);
            }

            esp_matter::lock::chip_stack_unlock();

            ESP_LOGI(TAGD, "------------------------ DEVICE COMMISSIONED ----------------------------------------------------------------");
            to_commission = false;
        }

        std::this_thread::sleep_for(std::chrono::seconds(7));
    } // end while
}

Here is how the thread is created from the main function. If I try to increase the stack size beyond the default, say to 8000, the thread can't be created and pthread_create fails with ENOMEM.

   // Create separate thread for polling values
   pthread_t matter_crl_thread;
   esp_pthread_cfg_t esp_pthread_cfg;
   int thread_res;

   // Use the ESP-IDF API to change the default thread attributes
   //esp_pthread_cfg = esp_pthread_get_default_config();
   //esp_pthread_cfg.stack_size = 8000;
   //esp_pthread_cfg.prio += 2;

   esp_err_t err = esp_pthread_set_cfg(&esp_pthread_cfg);
   if (err != ESP_OK)
   {
        ESP_LOGE(TAGD, "Failed to set thread config, err: %d", err);
   }

   thread_res = pthread_create(&matter_crl_thread, NULL, matter_ctrl_func, NULL);
   if (thread_res != 0)
   {
        ESP_LOGI(TAGD, "ERROR! UNABLE TO CREATE MATTER CONTROL THREAD  : %d", thread_res);
        return;
   }

   ESP_LOGI(TAGD, "Created matter_crl_thread 0x%lu with new default config\n", matter_crl_thread);
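As a point of comparison for the thread setup above, here is a minimal sketch of requesting an explicit stack size per thread through the standard POSIX attribute API instead of a global default config. It is written for a desktop POSIX system; `worker` and `start_with_stack` are hypothetical names, and the size in the usage note is a desktop value, not a recommendation for the ESP32-S3. An ENOMEM result from pthread_create most likely means the allocator could not supply a stack of the requested size.

```cpp
#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical thread body standing in for matter_ctrl_func.
static void *worker(void *arg)
{
    *static_cast<bool *>(arg) = true;  // record that the thread actually ran
    return nullptr;
}

// Create a joinable thread with an explicit stack size and wait for it.
// Returns 0 on success or an errno-style code (e.g. ENOMEM, EINVAL).
int start_with_stack(std::size_t stack_bytes, bool *ran)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, worker, ran);
        if (rc == 0) {
            rc = pthread_join(tid, nullptr);
        }
    }
    pthread_attr_destroy(&attr);

    if (rc != 0) {
        std::fprintf(stderr, "thread setup failed: %s\n", std::strerror(rc));
    }
    return rc;
}
```

Calling `start_with_stack(256 * 1024, &ran)` on a desktop system should return 0 and set `ran`. Checking the return code at each step shows whether the size was rejected outright or the allocation failed at creation time.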

Any recommendations on how to solve this?

Also I saw it mentioned elsewhere to use DeviceLayer::PlatformMgr().ScheduleWork() instead.

Are there any examples of how to use that for commissioning? Thanks in advance.

jonsmirl commented 4 weeks ago

Do you have PSRAM? I doubt if a real world controller app is going to fit on S3 without extra PSRAM. The minimum 2MB is plenty.

You also need to be careful about calling recursively into CHIP and eating up huge amounts of stack. I am using messages and queues in FreeRTOS to allow the stack to unroll.

wqx6 commented 4 weeks ago

Did you also run Thread BR on ESP32-S3? If you enable Thread BR on ESP32-S3 with the matter controller feature, we suggest you add PSRAM for S3. Note that we are calling the pairing_code_thread in CHIP task in our example, you can also post it to CHIP task.
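To illustrate what "post it to the CHIP task" means in general terms, here is a minimal sketch in standard C++ of a task that owns a work queue and runs every posted function on its own thread and stack. `OwnerTask`, `schedule_work`, `PairRequest`, and `do_pairing` are hypothetical stand-ins, not esp-matter APIs. On the device the analogous call would be the DeviceLayer::PlatformMgr().ScheduleWork() mentioned earlier in the thread, which, as far as I know, likewise takes a plain function and an intptr_t argument.

```cpp
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a task that owns a non-thread-safe stack.
class OwnerTask {
public:
    using WorkFn = void (*)(std::intptr_t);

    OwnerTask() : thread_([this] { run(); }) {}

    ~OwnerTask() {
        { std::lock_guard<std::mutex> lk(m_); stop_ = true; }
        cv_.notify_one();
        thread_.join();  // remaining queued work is drained before exit
    }

    // Safe to call from any thread: only enqueues, never runs fn inline.
    void schedule_work(WorkFn fn, std::intptr_t arg) {
        { std::lock_guard<std::mutex> lk(m_); q_.push_back(Item{fn, arg}); }
        cv_.notify_one();
    }

    std::thread::id id() const { return thread_.get_id(); }

private:
    struct Item { WorkFn fn; std::intptr_t arg; };

    void run() {
        for (;;) {
            Item item;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                if (q_.empty()) return;  // stop requested and nothing left
                item = q_.front();
                q_.pop_front();
            }
            item.fn(item.arg);  // executes on the owner's thread and stack
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Item> q_;
    bool stop_ = false;
    std::thread thread_;  // declared last so the members above exist first
};

// Hypothetical request object; the caller keeps it alive until the work ran.
struct PairRequest {
    std::uint64_t node_id;
    std::thread::id ran_on;
    bool done;
};

static void do_pairing(std::intptr_t arg) {
    auto *req = reinterpret_cast<PairRequest *>(arg);
    req->ran_on = std::this_thread::get_id();  // where the call really executed
    req->done = true;
}

// Returns true if the posted work ran on the owner thread, not the caller's.
bool demo_runs_on_owner() {
    PairRequest req{0x6F5, std::thread::id(), false};
    std::thread::id owner_id;
    {
        OwnerTask owner;
        owner_id = owner.id();
        owner.schedule_work(do_pairing, reinterpret_cast<std::intptr_t>(&req));
    }  // destructor drains the queue and joins the owner thread
    return req.done && req.ran_on == owner_id &&
           req.ran_on != std::this_thread::get_id();
}
```

The point of the pattern is that the caller (an HTTP handler, for example) spends almost no stack of its own: the heavy call runs on the stack of the task that already owns the CHIP state, so no extra locking around the call should be needed there.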

pavel808 commented 4 weeks ago

> Do you have PSRAM? I doubt if a real world controller app is going to fit on S3 without extra PSRAM. The minimum 2MB is plenty.
>
> You also need to be careful about calling recursively into CHIP and eating up huge amounts of stack. I am using messages and queues in FreeRTOS to allow the stack to unroll.

Hi @jonsmirl It appears that S3 has PSRAM. I am developing on the ESP Thread BR device. Quad PSRAM had already been enabled according to the menuconfig settings. Any specific settings I should apply for PSRAM that may help?

Do you have some examples of using messages and queues in FreeRTOS for this, instead of recursively calling into CHIP to commission the device? Thanks.
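The "messages and queues" idea can be sketched without any RTOS. The example below is a hypothetical, single-threaded illustration in standard C++: each step either calls the next step directly (nested) or posts a message for it and returns (queued). The live-frame counter is only a proxy for stack usage, and all names are made up for the illustration. On the device, FreeRTOS xQueueSend and xQueueReceive would take the place of std::queue.

```cpp
#include <queue>

// Tracks how many handler frames are live at once, as a proxy for stack use.
struct Depth {
    int current;
    int peak;
};

// RAII marker: one instance per active handler invocation.
struct Frame {
    explicit Frame(Depth &d) : d_(d) {
        if (++d_.current > d_.peak) d_.peak = d_.current;
    }
    ~Frame() { --d_.current; }
    Depth &d_;
};

// Nested style: every step calls the next one before returning.
void step_nested(int step, int last, Depth &d) {
    Frame f(d);
    if (step < last) step_nested(step + 1, last, d);
}

// Queued style: every step only posts a message for the next one.
struct Msg { int step; };

void step_queued(const Msg &m, int last, std::queue<Msg> &q, Depth &d) {
    Frame f(d);
    if (m.step < last) q.push(Msg{m.step + 1});
}   // the handler returns here, so its stack frame is released

int peak_nested(int steps) {
    Depth d{0, 0};
    step_nested(1, steps, d);
    return d.peak;
}

int peak_queued(int steps) {
    Depth d{0, 0};
    std::queue<Msg> q;
    q.push(Msg{1});
    while (!q.empty()) {          // dispatcher loop: one message at a time
        Msg m = q.front();
        q.pop();
        step_queued(m, steps, q, d);
    }
    return d.peak;
}
```

For 50 chained steps, `peak_nested(50)` reports 50 live frames while `peak_queued(50)` reports 1, which is the sense in which queuing lets the stack unroll between steps.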

pavel808 commented 4 weeks ago

> Did you also run Thread BR on ESP32-S3? If you enable Thread BR on ESP32-S3 with the matter controller feature, we suggest you add PSRAM for S3. Note that we are calling the pairing_code_thread in CHIP task in our example, you can also post it to CHIP task.

Hi @wqx6 Yes I am also running Thread BR on S3. I clearly need PSRAM. How do I enable this?

What's the best way to post this to CHIP task as you mentioned? I couldn't find examples. Thanks.

jonsmirl commented 4 weeks ago

Make sure the PSRAM is enabled in menuconfig. If it is enabled correctly, the boot log will print its size when the chip first boots.

Then enable these:

CONFIG_BT_NIMBLE_MEM_ALLOC_MODE_EXTERNAL=y
CONFIG_ESP_MATTER_MEM_ALLOC_MODE_EXTERNAL=y
CONFIG_NIMBLE_MEM_ALLOC_MODE_EXTERNAL=y

pavel808 commented 4 weeks ago

> Make sure the PSRAM is enabled in menuconfig. If it is enabled correctly, the boot log will print its size when the chip first boots.
>
> Then enable these:
>
> CONFIG_BT_NIMBLE_MEM_ALLOC_MODE_EXTERNAL=y
> CONFIG_ESP_MATTER_MEM_ALLOC_MODE_EXTERNAL=y
> CONFIG_NIMBLE_MEM_ALLOC_MODE_EXTERNAL=y

@jonsmirl Ah yes. PSRAM shows 2MB in the boot log. I enabled those options as you suggested, and I no longer get a stack overflow crash when calling pairing_code_thread. Thanks.

However, commissioning now always fails with Error on commissioning step 'ThreadNetworkSetup': 'Error CHIP:0x000000AC', but that's a different story, of course.