pycom / pycom-micropython-sigfox

A fork of MicroPython with the ESP32 port customized to run on Pycom's IoT multi-network modules.
MIT License

v1.20.0.rc7: BLE causes crash and file corruption #265

Closed chriskoz closed 5 years ago

chriskoz commented 5 years ago

Wipy crashes and may corrupt the file system when trying to write a file after enabling BLE.

Pycom module: WiPy 3.0
Firmware: v1.20.0.rc7
File system: Both FatFS and LittleFS

I have reduced the issue down to the following simple example main.py:

import utime
from network import Bluetooth

print("create file...")
with open('/flash/test.txt', 'w+') as f:
    f.write('some data')

print("setup BLE...")
bluetooth = Bluetooth()
bluetooth.set_advertisement(name = "BleTest")
bluetooth.advertise(True)

print("write data...")
writeCount = 0
while True:
    utime.sleep(1)
    writeCount = writeCount + 1
    print("count: "+str(writeCount))
    with open('/flash/test.txt', 'w') as f:
        f.write('some data')

The above main.py code works using firmware versions 1.18.x. But with version 1.20.x (or 1.19.x) it will cause a crash dump on the first or second write attempt after enabling the BLE. Simply remove the bluetooth.advertise(True) line and the file writing works fine on 1.20.x.

After the crash, FatFS file systems are often left corrupted. Some files may be missing and there may be a bunch of "junk" folders. I often have to erase the flash and re-flash the firmware to fix this, as further attempts to write to the file system cause more crashes. The crash occurs with both FatFS and LittleFS. However, LittleFS appears to be less susceptible to the file corruption.

I'm really looking forward to some of the improvements in firmware version 1.20.x: support for light sleep, improved power consumption, improved framebuf support, etc. But this firmware is unusable if I can't write to files while using BLE.

iwahdan88 commented 5 years ago

Hello @chriskoz, this crash happens when BLE tries to save data in the cache when going to sleep, at some point after initialisation and advertisement, while the cache area is locked due to flash access by filesystem function calls. We are still working on fixing that issue. However, there is a workaround: disable Bluetooth modem sleep with bluetooth = Bluetooth(modem_sleep=False). This option was just recently added to the release candidate branch. On another note: light sleep support is already available in the 1.20.0 release candidate. You enter this mode via machine.sleep(<time in ms>, <True/False to enable Wifi/BT connection resume>)
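The workaround described in this comment can be sketched as follows. The Bluetooth calls mirror the test script from the original report, with only `modem_sleep=False` added. The `start_ble` helper name and the desktop stand-in class are assumptions made for illustration and are not part of the Pycom firmware.

```python
try:
    from network import Bluetooth  # available on Pycom firmware only
except ImportError:
    # Hypothetical stand-in so the sketch can be dry-run on a desktop Python.
    # It only records the arguments it receives; it does no radio work.
    class Bluetooth:
        def __init__(self, modem_sleep=True):
            self.modem_sleep = modem_sleep
            self.name = None
            self.advertising = False

        def set_advertisement(self, name=None):
            self.name = name

        def advertise(self, enable):
            self.advertising = enable


def start_ble(name="BleTest"):
    """Start BLE advertising with modem sleep disabled (the workaround)."""
    # modem_sleep=False keeps the Bluetooth modem awake, so it should not
    # try to save data to the cache while the filesystem holds the flash.
    bluetooth = Bluetooth(modem_sleep=False)
    bluetooth.set_advertisement(name=name)
    bluetooth.advertise(True)
    return bluetooth


bluetooth = start_ble()
```

After this, the file-writing loop from the original script can run unchanged. The option is only accepted by builds that include it, so older firmware would presumably reject the keyword argument.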

chriskoz commented 5 years ago

Hi @iwahdan88,

I finally got a chance to build and flash the RC firmware and try the "modem_sleep=False" BLE option. This does indeed fix the file corruption issues for me. Thank you. I can now begin working with the newer 1.20.x builds.

Am I correct to assume using this option will cause greater power consumption from the BLE radio/system? Are there any other impacts I should be aware of using this option? For instance... does it impact the CPU sleep modes in general?

iwahdan88 commented 5 years ago

@chriskoz, yes, this option will prevent the Bluetooth modem from going to sleep, thus consuming more power. That should be the only impact of that option. No, it does not impact CPU sleep modes: the Bluetooth/WiFi modem is turned off before going to either light or deep sleep.
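For reference, a minimal sketch of entering light sleep with the `machine.sleep(<time in ms>, <resume flag>)` call mentioned earlier in this thread. The `light_sleep` wrapper and the desktop stand-in for `machine` are assumptions for illustration. On a real device the call suspends the CPU, so code after it only runs after wake-up.

```python
try:
    import machine  # available on Pycom firmware only
except ImportError:
    # Hypothetical stand-in for desktop dry-runs: records calls, never sleeps.
    class machine:
        calls = []

        @staticmethod
        def sleep(time_ms, resume_wifi_ble=False):
            machine.calls.append((time_ms, resume_wifi_ble))


def light_sleep(time_ms, resume_radio=True):
    """Enter light sleep for time_ms milliseconds.

    resume_radio asks the firmware to resume the WiFi/BT connection
    after wake-up, as described in the comment above.
    """
    machine.sleep(time_ms, resume_radio)
```

A call such as `light_sleep(5000)` would then sleep for five seconds and request that the radio connection be resumed afterwards.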

alexodus commented 4 years ago

The script from @chriskoz (with modem_sleep=False) also fails with 1.20.1.r1.

Pycom module: WiPy 3.0
Firmware: v1.20.1.r1
File system: FatFS

amotl commented 4 years ago

Dear @chriskoz and @alexodus,

LittleFS is definitely the right way to go in order to prevent file system corruption coming from brownouts and core panics.

If you feel lucky, you might want to try the custom build we released yesterday [1]. More details about this are available through [2].

If we are lucky together, this will improve the stability significantly. If you still receive core dumps, I would be happy if you shared their contents with us.

Please be aware that you will have to erase your device completely before flashing in order to keep things straight. You will find respective references to this on the forum. Hint: Use pycom-fwtool-cli --port /dev/ttyUSB0 erase_all, see also [3].

As this build has already improved robustness for others [4,5], we will be happy to learn whether it does the same for you.

With kind regards, Andreas.

cc @emmanuel-florent

[1] https://packages.hiveeyes.org/hiveeyes/foss/pycom/vanilla/WiPy-1.20.1.r1-0.6.0-vanilla-dragonfly.tar.gz
[2] https://community.hiveeyes.org/t/investigating-random-core-panics-on-pycom-esp32-devices/2480
[3] https://community.hiveeyes.org/t/installing-the-recent-pycom-firmware-1-20-1-r1-requires-erasing-the-flash-memory-completely/2688
[4] https://community.hiveeyes.org/t/testing-the-custom-dragonfly-builds-on-pycom-devices/2746
[5] https://github.com/pycom/pycom-micropython-sigfox/issues/361#issuecomment-553399627

amotl commented 4 years ago

Dear @chriskoz, @alexodus and @emmanuel-florent,

Exact steps to cause this issue

Wipy crashes and may corrupt the file system when trying to write a file after enabling BLE.

Several people have been able to mitigate similar issues (not even related to BLE) with one of our custom/unofficial Dragonfly builds based on Pycom’s 1.20.1.r1 [1] as mentioned above. We've refreshed this the other day and published corresponding Squirrel builds based on Pycom’s most recent 1.20.2.rc3 [2].

If you get a chance to try this build, we will be happy to hear about the outcome for you.

With kind regards, Andreas.

[1] https://community.hiveeyes.org/t/dragonfly-firmware-for-pycom-esp32/2746
[2] https://community.hiveeyes.org/t/squirrel-firmware-for-pycom-esp32/2960

chriskoz commented 3 years ago

I was finally able to get back to looking at this issue today. I tried the test code on firmware version 1.20.2.r2 and I am NOT seeing the crash anymore.
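Since the reports in this thread differ by firmware version, a small helper can make the comparison explicit. This is only a sketch: the function names are hypothetical, and the version range encodes nothing more than what was reported here (1.18.x worked, 1.19.x through 1.20.1.r1 crashed, 1.20.2.r2 did not). It is not an official statement of affected releases.

```python
def parse_release(release):
    """Turn a string like 'v1.20.2.r2' into (1, 20, 2).

    Parsing stops at the first non-numeric part, so pre-release
    suffixes such as 'rc7' or 'r2' are ignored.
    """
    parts = []
    for piece in release.lstrip("v").split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts[:3])


def crash_reported(release):
    """True if this thread reported the BLE file-write crash for the release.

    Assumption: the bounds come solely from the reports in this thread.
    """
    version = parse_release(release)
    return (1, 19, 0) <= version < (1, 20, 2)
```

On a device, the running version could be passed in as `crash_reported(os.uname().release)`, assuming the firmware exposes its version string in that field.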