espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

Writing to SD card fails in diskio_sdmmc: sdmmc_erase_sectors failed (262) (IDFGH-7094) #8704

Closed zinke-ct-video closed 2 years ago

zinke-ct-video commented 2 years ago

Environment

Problem Description

TL;DR

File write operations fail on SD cards that support DISCARD or FULE. This bug was introduced by commits 964592 and ffdbee.

In-depth explanation

My ESP32 is connected to an SD card. The ESP32 performs lots of read and write operations on files on the SD card. Before the commits mentioned above came in, everything was fine. After these commits came in, I noticed that on some types of SD cards errors occur during file write operations. The following error message appears on the serial log output for every failed operation: E (13146) diskio_sdmmc: sdmmc_erase_sectors failed (262)

I further investigated the bug and found that this error occurs only on some types of SD cards. To find the difference between the cards I wrote the following code to dump the SD card registers:

    sdmmc_host_t host = SDMMC_HOST_DEFAULT();
    sdmmc_slot_config_t slot_config = SDMMC_SLOT_CONFIG_DEFAULT();
    esp_vfs_fat_sdmmc_mount_config_t mount_config = {
        .format_if_mount_failed = false,
        .max_files = 10,
        .allocation_unit_size = 0 // auto / equals sector size of 512 bytes
    };
    sdmmc_card_t* card;
    ESP_ERROR_CHECK(esp_vfs_fat_sdmmc_mount("/sd", &host, &slot_config, &mount_config, &card));

    ESP_LOGI(TAG, "SD card info:");
    ESP_LOGI(TAG, "\tBus width (log2): %d", card->log_bus_width);
    ESP_LOGI(TAG, "\tFreq (kHz): %d", card->max_freq_khz);
    ESP_LOGI(TAG, "\tDDR: %d", card->is_ddr);
    ESP_LOGI(TAG, "\tCID: Date %d, MFG_ID %d, Name %s, OEM ID %d, Rev %d, Serial %d", card->cid.date, card->cid.mfg_id, card->cid.name, card->cid.oem_id, card->cid.revision, card->cid.serial);
    ESP_LOGI(TAG, "\tCSD: Capacity %d, Card Common Class %d, CSD version %d, MMC version %d, read block len %d, sector size %d, tr speed %d", card->csd.capacity, card->csd.card_command_class, card->csd.csd_ver, card->csd.mmc_ver, card->csd.read_block_len, card->csd.sector_size, card->csd.tr_speed);
    ESP_LOGI(TAG, "\tCSD: Ease mem state %d, Power class %d, Revision %d, Sec feature %d", card->ext_csd.erase_mem_state, card->ext_csd.power_class, card->ext_csd.rev, card->ext_csd.sec_feature);
    ESP_LOGI(TAG, "\tSCR: bus width %d, erase mem state %d, reserved %d, rsvd_mnf %d, sd_spec %d", card->scr.bus_width, card->scr.erase_mem_state, card->scr.reserved, card->scr.rsvd_mnf, card->scr.sd_spec);
    ESP_LOGI(TAG, "\tSSR: cur_bus_width %d, discard_support %d, fule_support %d, reserved %d", card->ssr.cur_bus_width, card->ssr.discard_support, card->ssr.fule_support, card->ssr.reserved);

I tested two types of SD cards: the SanDisk Ultra doesn't work, while the Samsung EVO works fine. Here is the log output for both cards:

SanDisk Ultra 32GB Class 10:

I (2247) SD Card: SD card info:
I (2247) SD Card:       Bus width (log2): 2
I (2247) SD Card:       Freq (kHz): 40000
I (2267) SD Card:       DDR: 0
I (2277) SD Card:       CID: Date 347, MFG_ID 3, Name SD32G, OEM ID 21316, Rev 133, Serial 1488661747
I (2297) SD Card:       CSD: Capacity 62333952, Card Common Class 1461, CSD version 1, MMC version 0, read block len 9, sector size 512, tr speed 50000000
I (2347) SD Card:       CSD: Ease mem state 0, Power class 0, Revision 0, Sec feature 0
I (2367) SD Card:       SCR: bus width 5, erase mem state 0, reserved 0, rsvd_mnf 0, sd_spec 2
I (2397) SD Card:       SSR: cur_bus_width 2, discard_support 1, fule_support 1, reserved 0

Samsung EVO 32GB:

I (2057) SD Card: SD card info:
I (2057) SD Card:       Bus width (log2): 2
I (2067) SD Card:       Freq (kHz): 40000
I (2077) SD Card:       DDR: 0
I (2087) SD Card:       CID: Date 290, MFG_ID 27, Name EB1QT, OEM ID 21325, Rev 48, Serial 1390302638
I (2117) SD Card:       CSD: Capacity 62521344, Card Common Class 1461, CSD version 1, MMC version 0, read block len 9, sector size 512, tr speed 50000000
I (2157) SD Card:       CSD: Ease mem state 0, Power class 0, Revision 0, Sec feature 0
I (2187) SD Card:       SCR: bus width 5, erase mem state 1, reserved 0, rsvd_mnf 0, sd_spec 2
I (2207) SD Card:       SSR: cur_bus_width 2, discard_support 0, fule_support 0, reserved 0

As you can see, the only significant differences are card->ssr.discard_support and card->ssr.fule_support.

Next, I checked whether the error disappears if I manually clear these flags on the SanDisk Ultra card. To do so, I added the following two lines:

    card->ssr.discard_support = 0;
    card->ssr.fule_support = 0;

With this change, the SanDisk Ultra card (which reports discard and FULE support) works fine, too.

The last test I did was disabling TRIM support in the fatfs component. To do this, first remove the above workaround that cleared discard_support and fule_support. Then open the file esp-idf/components/fatfs/src/ffconf.h and change line 238 from #define FF_USE_TRIM 1 to #define FF_USE_TRIM 0. This step also makes the error disappear, and both cards work fine.

So I conclude that there must be an issue with the newly introduced function sdmmc_erase_sectors in the file esp-idf/components/fatfs/diskio/diskio_sdmmc.c

Expected Behavior

All file I/O should succeed even with cards that support DISCARD and FULE.

vamshi51 commented 2 years ago

Hi @zinke-ct-video, thanks for reporting this. I am able to reproduce this behavior. Error code 262 (0x106) is ESP_ERR_NOT_SUPPORTED. This error is returned from line 520 of the file components/sdmmc/sdmmc_cmd.c.
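As a quick illustration of how the decimal value in the log maps to a named error, here is a minimal host-side sketch. The numeric values mirror the generic codes in ESP-IDF's esp_err.h; the helper err_name is a hypothetical name used only for this example (on the target, esp_err_to_name() serves this purpose).

```c
#include <stdio.h>

/* Generic ESP-IDF error codes (values as in esp_err.h). */
#define ESP_OK                 0
#define ESP_ERR_NO_MEM         0x101
#define ESP_ERR_INVALID_ARG    0x102
#define ESP_ERR_INVALID_STATE  0x103
#define ESP_ERR_INVALID_SIZE   0x104
#define ESP_ERR_NOT_FOUND      0x105
#define ESP_ERR_NOT_SUPPORTED  0x106   /* 262 decimal, the value in the log */
#define ESP_ERR_TIMEOUT        0x107

/* Hypothetical helper for this sketch: map a code to its symbolic name. */
const char *err_name(int code)
{
    switch (code) {
    case ESP_OK:                return "ESP_OK";
    case ESP_ERR_NO_MEM:        return "ESP_ERR_NO_MEM";
    case ESP_ERR_INVALID_ARG:   return "ESP_ERR_INVALID_ARG";
    case ESP_ERR_INVALID_STATE: return "ESP_ERR_INVALID_STATE";
    case ESP_ERR_INVALID_SIZE:  return "ESP_ERR_INVALID_SIZE";
    case ESP_ERR_NOT_FOUND:     return "ESP_ERR_NOT_FOUND";
    case ESP_ERR_NOT_SUPPORTED: return "ESP_ERR_NOT_SUPPORTED";
    case ESP_ERR_TIMEOUT:       return "ESP_ERR_TIMEOUT";
    default:                    return "UNKNOWN";
    }
}

/* Print a code the way it would be cross-checked against a log line. */
void print_err(int code)
{
    printf("%d (0x%x) = %s\n", code, (unsigned)code, err_name(code));
}
```

Calling print_err(262) on a host machine shows the decimal value, its hexadecimal form, and the matching name side by side.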

I see there is a bug in the validation of the CMD38 argument. SDMMC_MMC_TRIM_ARG and SDMMC_SD_DISCARD_ARG have the same value (1), so the wrong condition is checked for SD cards, which results in ESP_ERR_NOT_SUPPORTED.

As you correctly figured out, with FF_USE_TRIM disabled, sdmmc_erase_sectors is never invoked.

I can assure you that the error message on the serial log output, E (13146) diskio_sdmmc: sdmmc_erase_sectors failed (262), has no adverse effect on the filesystem or on data integrity.

Please replace lines 519-525 of components/sdmmc/sdmmc_cmd.c (sdmmc_erase_sectors) with the following snippet and share your observations with the SanDisk Ultra 32GB Class 10:

    if (card->is_mmc) {
        if ((arg == SDMMC_MMC_TRIM_ARG) && (sdmmc_can_trim(card) != ESP_OK)) {
            return ESP_ERR_NOT_SUPPORTED;
        }
        if ((arg == SDMMC_MMC_DISCARD_ARG) && (sdmmc_can_discard(card) != ESP_OK)) {
            return ESP_ERR_NOT_SUPPORTED;
        }
    } else {
        if ((arg == SDMMC_SD_DISCARD_ARG) && (sdmmc_can_discard(card) != ESP_OK)) {
            return ESP_ERR_NOT_SUPPORTED;
        }
    }
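To see why two constants sharing the value 1 can trip the check, here is a minimal host-side sketch. It is not the original ESP-IDF code: mock_card_t, mock_can_trim, mock_can_discard, check_arg_buggy and check_arg_fixed are hypothetical stand-ins, the shape of the buggy check is an assumption, and the value 3 for SDMMC_MMC_DISCARD_ARG is assumed only so that it differs from the other two.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESP_OK                 0
#define ESP_ERR_NOT_SUPPORTED  0x106   /* 262 */

/* Per the thread, these two constants share the value 1. */
#define SDMMC_MMC_TRIM_ARG     1
#define SDMMC_SD_DISCARD_ARG   1
/* Assumed value for this sketch; it only needs to differ from 1. */
#define SDMMC_MMC_DISCARD_ARG  3

/* Hypothetical stand-in for sdmmc_card_t with just the relevant flags. */
typedef struct {
    bool is_mmc;
    bool trim_supported;      /* TRIM is treated as an MMC-only feature here */
    bool discard_supported;
} mock_card_t;

int mock_can_trim(const mock_card_t *c)
{
    return (c->is_mmc && c->trim_supported) ? ESP_OK : ESP_ERR_NOT_SUPPORTED;
}

int mock_can_discard(const mock_card_t *c)
{
    return c->discard_supported ? ESP_OK : ESP_ERR_NOT_SUPPORTED;
}

/* Assumed shape of the faulty validation: the card type is ignored, so an
 * SD DISCARD request (arg == 1) is also tested against the MMC TRIM rule. */
int check_arg_buggy(const mock_card_t *c, uint32_t arg)
{
    if (arg == SDMMC_MMC_TRIM_ARG && mock_can_trim(c) != ESP_OK) {
        return ESP_ERR_NOT_SUPPORTED;
    }
    if (arg == SDMMC_MMC_DISCARD_ARG && mock_can_discard(c) != ESP_OK) {
        return ESP_ERR_NOT_SUPPORTED;
    }
    if (arg == SDMMC_SD_DISCARD_ARG && mock_can_discard(c) != ESP_OK) {
        return ESP_ERR_NOT_SUPPORTED;
    }
    return ESP_OK;
}

/* Mirrors the structure of the snippet above: branch on the card type first. */
int check_arg_fixed(const mock_card_t *c, uint32_t arg)
{
    if (c->is_mmc) {
        if (arg == SDMMC_MMC_TRIM_ARG && mock_can_trim(c) != ESP_OK) {
            return ESP_ERR_NOT_SUPPORTED;
        }
        if (arg == SDMMC_MMC_DISCARD_ARG && mock_can_discard(c) != ESP_OK) {
            return ESP_ERR_NOT_SUPPORTED;
        }
    } else {
        if (arg == SDMMC_SD_DISCARD_ARG && mock_can_discard(c) != ESP_OK) {
            return ESP_ERR_NOT_SUPPORTED;
        }
    }
    return ESP_OK;
}
```

For an SD card that reports discard support, check_arg_buggy rejects SDMMC_SD_DISCARD_ARG because the value also matches the TRIM rule, while check_arg_fixed accepts it.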

@igrr, I think we can change line 80 of components/fatfs/diskio/diskio_sdmmc.c from ESP_LOGE(TAG, "sdmmc_erase_sectors failed (%d)", err); to ESP_LOGI(TAG, "sdmmc_erase_sectors returned (%d)", err); so that it is logged as information instead of an error, to avoid panic.

zinke-ct-video commented 2 years ago

Good to hear that it doesn't affect file system integrity. Just out of curiosity: what's the purpose of these TRIM and DISCARD commands if the file system is fully functional without them? Performance improvement?

As requested I replaced lines 519-525 of components/sdmmc/sdmmc_cmd.c (sdmmc_erase_sectors) with the code from your response and tested the SanDisk Ultra 32GB Class 10 again. The error doesn't occur anymore.

Thanks a lot for the quick reply and fix!

vamshi51 commented 2 years ago

@zinke-ct-video thanks for the confirmation. TRIM/DISCARD lets the FTL (Flash Translation Layer) of the card mark sectors for erasing so that GC (garbage collection) can erase them in the background, which improves WRITE performance (the implementation is specific to the card vendor). With this, a WRITE operation need not wait for an ERASE to complete in certain cases.
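The idea can be sketched with a toy model. This is purely illustrative and does not describe any vendor's FTL: every name here (toy_ftl_t, toy_write, toy_discard, toy_background_gc) is hypothetical, and real controllers remap sectors in far more elaborate ways. The model only counts how often a write would have to wait for an erase.

```c
#include <stddef.h>

#define TOY_SECTORS 8

typedef enum { SECTOR_ERASED, SECTOR_IN_USE, SECTOR_DISCARDED } toy_state_t;

/* Hypothetical toy FTL: tracks sector state and counts blocking erases. */
typedef struct {
    toy_state_t state[TOY_SECTORS];
    unsigned blocking_erases;   /* erases that a write had to wait for */
} toy_ftl_t;

void toy_init(toy_ftl_t *f)
{
    for (size_t i = 0; i < TOY_SECTORS; i++) {
        f->state[i] = SECTOR_ERASED;
    }
    f->blocking_erases = 0;
}

/* Writing to a sector that is not already erased forces an erase first. */
void toy_write(toy_ftl_t *f, size_t s)
{
    if (s >= TOY_SECTORS) {
        return;
    }
    if (f->state[s] != SECTOR_ERASED) {
        f->blocking_erases++;
    }
    f->state[s] = SECTOR_IN_USE;
}

/* The host tells the card that the sector's contents are no longer needed. */
void toy_discard(toy_ftl_t *f, size_t s)
{
    if (s < TOY_SECTORS && f->state[s] == SECTOR_IN_USE) {
        f->state[s] = SECTOR_DISCARDED;
    }
}

/* Background garbage collection erases discarded sectors while idle. */
void toy_background_gc(toy_ftl_t *f)
{
    for (size_t i = 0; i < TOY_SECTORS; i++) {
        if (f->state[i] == SECTOR_DISCARDED) {
            f->state[i] = SECTOR_ERASED;
        }
    }
}
```

In this model, rewriting a sector without a prior discard increments blocking_erases, whereas a discard followed by background GC leaves the sector already erased, so the next write proceeds without waiting.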

I understand Samsung EVO 32GB is working as expected, please confirm.

zinke-ct-video commented 2 years ago

Thanks for the explanation.

Yes, I can confirm this. Both card types I am using are working as expected with your patch.

Vincenzo1992 commented 9 months ago

Can you say where you added the lines card->ssr.discard_support = 0; and card->ssr.fule_support = 0;?

zinke-ct-video commented 9 months ago

I think I added these lines after the call to the esp_vfs_fat_sdmmc_mount(...) function. But I want to remind you that commit 0e366db6 fixes the issue. This means there is no need to use the workaround anymore, unless you have a serious reason not to update your ESP-IDF.

Vincenzo1992 commented 9 months ago

> I think I added this line after the call to the esp_vfs_fat_sdmmc_mount(...) function. But I want to remind you, that commit 0e366db fixes the issue. This means that there is no need to use the workaround anymore unless you have any serious reason not to update your ESP-IDF.

I know it has been fixed. However, I am using a 128 GB SD card, which apparently works fine with the file system but cannot be formatted when using SDMMC (currently I am formatting over SPI and then switching to MMC for speed) :)

zinke-ct-video commented 9 months ago

Okay, this sounds like another bug. Maybe you should open another issue to get help formatting the 128 GB SD card?