Open atlascoder opened 2 years ago
Hi @atlascoder, thanks for reporting the issue!
> Using the SD is problematic, for example, SD card hangs after writing 512 files into directory and requires to repower it and format in order to use it again.
Do you happen to be creating files in the root directory of the card? 512 sounds like the limit of the number of entries in the FAT root directory. https://github.com/espressif/esp-idf/blob/187e6ff66f653ac21fb6ab67e3cc8cc3a571b032/components/fatfs/src/ff.c#L5540 Although it's a bit odd that you would see this limitation — on FAT32 the root directory content is stored in the data region, so the limit should only apply to FAT16. And I assume that any SD card these days is too large for FAT16. (Edit: I stand corrected, a 2GB card could be formatted in FAT16) Perhaps you could make a dump of the filesystem from your card when the issue happens, and check the FAT structure with something like https://github.com/maxpat78/FATtools?
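For reference, the FAT variant is decided by the number of data clusters on the volume rather than by the card size as such, and only FAT12/FAT16 have a fixed-size root directory. Below is a minimal sketch of that rule, using the cluster-count thresholds from Microsoft's FAT specification. The function and enum names are made up for illustration and are not part of FatFs or ESP-IDF.

```c
#include <stdint.h>

/* Illustrative names only; not FatFs or ESP-IDF API. */
enum fat_type { FAT_TYPE_12 = 12, FAT_TYPE_16 = 16, FAT_TYPE_32 = 32 };

/* Classify a volume by its count of data clusters. */
enum fat_type fat_type_from_clusters(uint32_t cluster_count)
{
    if (cluster_count < 4085u) {
        return FAT_TYPE_12;
    }
    if (cluster_count < 65525u) {
        return FAT_TYPE_16;
    }
    return FAT_TYPE_32;
}

/* Size in bytes of a fixed root directory region: 32 bytes per entry.
 * Only FAT12/FAT16 have this region; on FAT32 the root directory is an
 * ordinary cluster chain in the data region and can grow. */
uint32_t fixed_root_dir_bytes(uint32_t root_entry_count)
{
    return root_entry_count * 32u;
}
```

With 512 root entries the region is 16384 bytes, i.e. 32 sectors of 512 bytes. A volume of about 61,000 clusters of 32 KiB each (roughly 2 GB) classifies as FAT16 under this rule. Files with long names take more than one directory entry, so the number of files that fit can be lower than the entry count.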
Regarding the crash in f_rename/unlock_fs functions, the assertion in FreeRTOS happens because the mutex pointer being passed to `xQueueGenericSend` is NULL. If you have a GDB Stub session, you can check whether this is the case: please look for `s_fat_ctxs[0].fs->sobj`. I see two possibilities how `sobj` could become NULL:
1. The filesystem was unregistered while a FatFS operation (`rotate_file` in your case) was in progress. The `esp_vfs_fat_unregister_path` function is not thread-safe, so if for any reason it is called while another FatFS operation is in progress, something like what you observe may happen.
2. Something overwrote the `sobj` member of `struct FATFS`. This is possible since `sobj` is right after `lfnbuf`, which is used by some string functions, so a buffer overflow can't be ruled out. You can try the following to check this: add some dummy entry in `struct FATFS` (in ff.h) right before `sobj`. For example, `uint32_t unused[32];`. If the issue stops being reproducible, then it means that an out-of-bounds write was indeed the cause.
I've also noticed that the backtrace refers to the `fsu_fatfs` component in your project directory - `DevProjects/fleetsu/gx5s-poc-v2/components/fsu_fatfs`.
Is the issue reproducible with the `fatfs` component in ESP-IDF?
If not, could you describe the changes between your version of `fatfs` and the one in ESP-IDF? A diff file would be the best option.
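The padding check described above can be taken one step further by filling the padding with a known pattern and testing it, which reports an overflow instead of merely hiding it. This is a sketch of that variation, not something FatFs does itself. `struct demo_fs` is a hypothetical stand-in and does not reflect the real layout of `struct FATFS`; all names are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_WORDS   4
#define GUARD_PATTERN 0xDEADBEEFu

/* Hypothetical structure in which a buffer precedes a handle.
 * This is NOT the real struct FATFS. */
struct demo_fs {
    char     name_buf[16];        /* buffer that string code writes into */
    uint32_t guard[GUARD_WORDS];  /* padding filled with a known pattern */
    void    *sobj;                /* handle that must not be overwritten */
};

/* Clear the buffer, arm the guard words and store the handle. */
void demo_fs_init(struct demo_fs *fs, void *sobj)
{
    memset(fs->name_buf, 0, sizeof fs->name_buf);
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        fs->guard[i] = GUARD_PATTERN;
    }
    fs->sobj = sobj;
}

/* Returns false if any guard word no longer holds the pattern,
 * which means something wrote past the end of name_buf. */
bool demo_fs_guard_intact(const struct demo_fs *fs)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        if (fs->guard[i] != GUARD_PATTERN) {
            return false;
        }
    }
    return true;
}
```

Calling the check before and after each suspicious operation narrows down which call performs the out-of-bounds write. A small overrun damages the guard words while the handle behind them stays valid.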
Hi, @igrr, thank you very much for such a quick and comprehensive response!
Regarding the 512 limit - you are right, this must be a limit of the root folder only. I had checked this before and hit the 512 limit as well, but I have now run another check and I can write more than 512 files to a non-root folder.
I have tested on a card that was previously formatted as exFAT, but the 512 limit was also active. When I format the SD card with the internal FatFS routine, it becomes FAT16.
Thank you for the clarification about this limit!
Regarding the crash - your recommendations are very close to my thoughts! I suspected that there is some memory corruption caused by a non-thread-safe routine. I have another task running in parallel that works with files intensively. I am not sure at the moment that the culprit is `esp_vfs_fat_unregister_path`, but I will check. I will also try your brilliant advice on how to check for an `lfnbuf` overflow.
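One way to rule out an unregister racing with file operations in another task is to make every task announce its file operations and to allow unmounting only when none are in flight. The sketch below does this with a single C11 atomic counter, assuming `<stdatomic.h>` is available in the toolchain. A FreeRTOS mutex would serve the same purpose. All function names are hypothetical and are not ESP-IDF API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* >= 0: mounted, value is the number of operations in flight.
 *   -1: unmounted (or unmount in progress), new operations are refused. */
static atomic_int s_fs_state = -1;

/* Call once after the filesystem has been mounted successfully. */
void fs_guard_mark_mounted(void)
{
    atomic_store(&s_fs_state, 0);
}

/* Call before any file operation; returns false if the FS is not mounted. */
bool fs_guard_op_begin(void)
{
    int cur = atomic_load(&s_fs_state);
    while (cur >= 0) {
        /* On failure the CAS reloads cur, so the loop re-checks the state. */
        if (atomic_compare_exchange_weak(&s_fs_state, &cur, cur + 1)) {
            return true;
        }
    }
    return false;
}

/* Call after the file operation has finished. */
void fs_guard_op_end(void)
{
    atomic_fetch_sub(&s_fs_state, 1);
}

/* Returns true only if no operation is in flight; the caller may then
 * unmount or unregister safely, because new operations are refused. */
bool fs_guard_try_unmount(void)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&s_fs_state, &expected, -1);
}
```

A task would wrap each `fopen`/`rename`/`remove` sequence in `fs_guard_op_begin()` and `fs_guard_op_end()`, and the recovery path would retry `fs_guard_try_unmount()` until it succeeds before unmounting.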
I'll post results later.
Thank you very much!!
Ah, @igrr, one detail I left aside that may look important from your perspective. Before the crashes I usually observe erroneous responses from the SDSPI driver. I understand this info is not enough for analysis, but what do you think: can the SPI driver cause such memory corruption? For example, when the SD card stops replying to SD requests.
Environment
Problem Description
I have a board with an SD chip connected to it via the SPI bus using the following pins:
#define PIN_NUM_MISO GPIO_NUM_16
#define PIN_NUM_MOSI GPIO_NUM_4
#define PIN_NUM_CLK GPIO_NUM_5
#define PIN_NUM_CS GPIO_NUM_18
#define PIN_NUM_POW GPIO_NUM_2
Using the SD card is problematic: for example, the card hangs after writing 512 files into a directory and has to be power-cycled and formatted before it can be used again. I tested with other SD cards and the problem is the same. I suspect an issue in the FatFS component. I could accept this if there were a workaround that allowed remounting and formatting the SD card without a system crash or restart; this is critical. But I have had no luck finding one.
The problem is that after any failed FatFS operation, the system crashes on the next one. It panics on an assert in `freertos/queue.c`:
The system is not an isolated example: the problem with asserts appears in a system with several tasks performing fairly intensive operations. But the issue with the FatFS limit of 512 files was confirmed on an isolated FW example (I can share it if needed).
Expected Behavior
The system should not crash after a FatFS failure, so that the application can handle the error.
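At application level, one way to keep going after a failure is to latch a "faulted" state on the first unexpected error and refuse further file operations until the card has been remounted. The sketch below shows this for `remove()` only and assumes a POSIX-style `remove()` that sets `errno`. The wrapper and helper names are hypothetical, and treating every error other than "file not found" as a fault is a deliberately conservative assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Set on the first unexpected error; cleared after a successful remount. */
static bool s_fs_faulted = false;

bool fs_is_faulted(void)
{
    return s_fs_faulted;
}

/* Call after the card has been remounted (and reformatted if needed). */
void fs_clear_fault(void)
{
    s_fs_faulted = false;
}

/* remove() wrapper: refuses to touch the filesystem while it is faulted.
 * Returns 0 on success, -1 on failure with errno set. */
int guarded_remove(const char *path)
{
    if (s_fs_faulted) {
        errno = EIO;  /* report an I/O error without calling into the FS */
        return -1;
    }
    if (remove(path) == 0) {
        return 0;
    }
    if (errno != ENOENT) {
        s_fs_faulted = true;  /* anything but "no such file" is a fault */
    }
    return -1;
}
```

The same pattern would apply to `rename()` and `fopen()`. It does not fix the underlying corruption; it only keeps the next call from running on a filesystem that is already known to be in a bad state.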
Actual Behavior
The system crashes after a failing FatFS invocation, upon the next execution:
crashes on execution of `remove(name1)` - it doesn't appear on an isolated example!
Steps to reproduce
Can't share publicly.
Code to reproduce this issue
Debug Logs