meshtastic / firmware

Meshtastic device firmware
https://meshtastic.org
GNU General Public License v3.0
3.49k stars 863 forks source link

[Bug]: Factory reset on 1.3 causes problems #1597

Closed ajorg closed 2 years ago

ajorg commented 2 years ago

Category

Other

Hardware

T-Beam, T-Echo

Firmware Version

1.3.30.9fe2ddb

Description

I've observed two different problems after asking 1.3.30 to factory reset (meshtastic --set device.factory_reset 1). On T-Echo, a factory reset will cause it to bootloop. Downgrading to 1.2.25, setting the region, and then upgrading to 1.3.30 again works around the issue. On T-Beam, a factory reset causes the device to be unable to communicate via serial or bluetooth. In both cases the error received is that the firmware is the wrong version and needs to be upgraded. Re-installing the firmware with device-install.sh works around the issue. These are the only devices I have, so I can't verify what happens elsewhere.

I'm unable to get any serial logging from either device when in the corrupted state.

Relevant log output

No response

caveman99 commented 2 years ago

Problem is, without a repo or a serial log this is near to impossible to debug. I have tried your steps and can not reproduce it on first try.

ajorg commented 2 years ago

I'll see what I can do.

Waffle2 commented 2 years ago

I've seen this on the TLORA board as well. Reflashing it solved the issue, as you indicate.

garthvh commented 2 years ago

@caveman99 I get weird behavior on every factory reset with every device, on NRF it does not seem to clear any settings with 1.3.30, on a tbeam it makes the device really unstable and tlora devices are unstable but don't boot loop.

garthvh commented 2 years ago

RAK 4631

??:??:?? 6 Installing default DeviceState ??:??:?? 6 Initial packet id 868438243, numPacketId 4294967295 ??:??:?? 6 Saving /prefs/db.proto ??:??:?? 7 Saving /prefs/config.proto ??:??:?? 9 Saving /prefs/module.proto ??:??:?? 10 Saving /prefs/channels.proto ??:??:?? 12 Clearing bluetooth bonds!

??:??:?? 14 Expanding short PSK #1 ??:??:?? 14 Wanted region 1, using US ??:??:?? 14 region=1, NODENUM=0x47d2c6c4, dbsize=0 ??:??:?? 14 Saving /prefs/db.proto ??:??:?? 16 Saving /prefs/config.proto ??:??:?? 17 Saving /prefs/module.proto ??:??:?? 19 Saving /prefs/channels.proto ??:??:?? 20 Using analog input 5 for battery level ??:??:?? 20 Using MSL altitude model ??:??:?? 20 WANT GPS=1 ??:??:?? 21 GxGSA NOT available ??:??:?? 21 GPIO initialized (10, 4, 3) ??:??:?? 21 No /prefs/cannedConf.proto preferences found ??:??:?? 22 CannedMessageModule: No messages are configured. Module is disabled ??:??:?? 22 No /prefs/oem.proto preferences found ??:??:?? 22 Turning on screen ??:??:?? 22 Module wants a UI Frame ??:??:?? 22 GPS FactoryReset requested ??:??:?? 24 Saving /prefs/db.proto ??:??:?? 25 Saving /prefs/config.proto ??:??:?? 27 Saving /prefs/module.proto ??:??:?? 29 Saving /prefs/channels.proto ??:??:?? 30 Starting meshradio init... ??:??:?? 30 Set radio: region=US, name=LongF, config=0, ch=91, power=30 ??:??:?? 30 Radio myRegion->freqStart / myRegion->freqEnd: 902.000000 -> 928.000000 (26.000000 mhz) ??:??:?? 30 Radio myRegion->numChannels: 104 ??:??:?? 30 Radio channel_num: 91 ??:??:?? 30 Radio frequency: 924.875000 ??:??:?? 30 Slot time: 42 msec ??:??:?? 30 Set radio: final power level=22 ??:??:?? 30 SX126x init result 0 ??:??:?? 30 Frequency set to 924.875000 ??:??:?? 30 Bandwidth set to 250.000000 ??:??:?? 30 Power output set to 22 ??:??:?? 30 Current limit set to 140.000000 ??:??:?? 30 Current limit set result 0 ??:??:?? 31 SX1262 Radio init succeeded, using SX1262 radio ??:??:?? 31 (bw=250, sf=11, cr=4/8) packet symLen=8 ms, payloadSize=237, time 3188 ms ??:??:?? 31 myNodeInfo.bitrate = 74.341286 bytes / sec ??:??:?? 31 PowerFSM init, USB power=1 ??:??:?? 31 Enter state: BOOT ??:??:?? 31 [Power] Battery: usbPower=0, isCharging=0, batMv=3883, batPct=65 ??:??:?? 31 [Screen] Done with boot screen... ??:??:?? 31 [Screen] Screen: Started... ??:??:?? 31 [Screen] showing standard frames ??:??:?? 31 [Screen] Module wants a UI Frame ??:??:?? 31 [Screen] Showing 1 module frames ??:??:?? 31 [Screen] Total frame count: 84 ??:??:?? 31 [Screen] Added modules. numframes: 1 ??:??:?? 31 [Screen] Finished building frames. numframes: 3 ??:??:?? 31 [Screen] Setting fast framerate ??:??:?? 31 [Screen] Setting idle framerate ??:??:?? 31 [DeviceTelemetryModule] Device Telemetry: Initializing ??:??:?? 31 [DeviceTelemetryModule] ----------------------------------------- ??:??:?? 31 [DeviceTelemetryModule] Device Telemetry: Read data ??:??:?? 31 [DeviceTelemetryModule] Telemetry->time: 31 ??:??:?? 31 [DeviceTelemetryModule] Telemetry->air_util_tx: 0.000000 ??:??:?? 31 [DeviceTelemetryModule] Telemetry->battery_level: 65 ??:??:?? 31 [DeviceTelemetryModule] Telemetry->channel_utilization: 0.000000 ??:??:?? 32 [DeviceTelemetryModule] Telemetry->voltage: 3.883000 ??:??:?? 32 [DeviceTelemetryModule] Device Telemetry: Sending packet to mesh ??:??:?? 32 [DeviceTelemetryModule] Update DB node 0x47d2c6c4, rx_time=0 ??:??:?? 32 [DeviceTelemetryModule] handleReceived(LOCAL) (id=0x33c350e6 Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67) ??:??:?? 32 [DeviceTelemetryModule] No modules interested in portnum=67, src=LOCAL ??:??:?? 32 [DeviceTelemetryModule] Add packet record (id=0x33c350e6 Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67) ??:??:?? 32 [DeviceTelemetryModule] Expanding short PSK #1 ??:??:?? 32 [DeviceTelemetryModule] Using AES128 key! ??:??:?? 32 [DeviceTelemetryModule] nRF52 encrypt fr=47d2c6c4, num=33c350e6, numBytes=18! ??:??:?? 32 [DeviceTelemetryModule] enqueuing for send (id=0x33c350e6 Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x6e encrypted) ??:??:?? 32 [DeviceTelemetryModule] txGood=0,rxGood=0,rxBad=0 ??:??:?? 32 [DeviceTelemetryModule] Using channel 0 (hash 0x6e) ??:??:?? 32 [DeviceTelemetryModule] Expanding short PSK #1 ??:??:?? 32 [DeviceTelemetryModule] Using AES128 key! ??:??:?? 32 [DeviceTelemetryModule] nRF52 encrypt fr=47d2c6c4, num=33c350e6, numBytes=18! ??:??:?? 32 [DeviceTelemetryModule] decoded message (id=0x33c350e6 Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67 priority=64) ??:??:?? 33 [DeviceTelemetryModule] updateTelemetry LOCAL ??:??:?? 33 [DeviceTelemetryModule] Node status update: 1 online, 1 total ??:??:?? 33 [DeviceTelemetryModule] showing standard frames ??:??:?? 33 [DeviceTelemetryModule] Module wants a UI Frame ??:??:?? 33 [DeviceTelemetryModule] Showing 1 module frames ??:??:?? 33 [DeviceTelemetryModule] Total frame count: 84 ??:??:?? 33 [DeviceTelemetryModule] Added modules. numframes: 1 ??:??:?? 33 [DeviceTelemetryModule] Finished building frames. numframes: 3 ??:??:?? 33 [DeviceTelemetryModule] Setting fast framerate ??:??:?? 33 [Screen] Setting idle framerate ??:??:?? 33 [RadioIf] Starting low level send (id=0x33c350e6 Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x6e encrypted priority=64) ??:??:?? 33 [RadioIf] (bw=250, sf=11, cr=4/8) packet symLen=8 ms, payloadSize=34, time 772 ms ??:??:?? 33 [RadioIf] AirTime - Packet transmitted : 772ms ??:??:?? 33 [GPS] SOME data is TOO OLD: LOC 4294967295, TIME 4294967295, DATE 4294967295 ??:??:?? 33 [GPS] SOME data is TOO OLD: LOC 4294967295, TIME 4294967295, DATE 4294967295 ??:??:?? 33 [GPS] publishing pos@0:2, hasVal=0, GPSlock=1 ??:??:?? 33 [GPS] No GPS lock ??:??:?? 33 [GPS] onGPSChanged() pos@0, time=0, lat=0, lon=0, alt=0 ??:??:?? 34 [GPS] updatePosition LOCAL pos@0, time=0, latI=0, lonI=0, alt=0 ??:??:?? 34 [GPS] Node status update: 1 online, 1 total ??:??:?? 34 [GPS] SOME data is TOO OLD: LOC 4294967295, TIME 4294967295, DATE 4294967295 ??:??:?? 34 [PowerFSM] Enter state: POWER ??:??:?? 34 [PowerFSM] Loss of power in Powered ??:??:?? 34 [PowerFSM] Loss of power in Powered

The device crashes when it gets to Loss of power in Powered, device has a charged battery and is connected via usb

caveman99 commented 2 years ago

On the road for the weekend. Does that also happen with canned message module disabled?

garthvh commented 2 years ago

Here it is with no canned messages 4631 Rak with buzzer, GPS and BME680

WARNING file:mesh_interface.py _sendToRadio line:467 Not sending packet because protocol use is disabled by noProto Connected to radio WARNING file:mesh_interface.py _sendPacket line:357 Not sending packet because protocol use is disabled by noProto ??:??:?? 7 Saving /prefs/module.proto ??:??:?? 8 Saving /prefs/channels.proto ??:??:?? 10 Clearing bluetooth bonds!

??:??:?? 13 Expanding short PSK #1 ??:??:?? 13 Wanted region 0, using Unset ??:??:?? 13 region=0, NODENUM=0x47d2c6c4, dbsize=0 ??:??:?? 13 Saving /prefs/db.proto ??:??:?? 15 Saving /prefs/config.proto ??:??:?? 17 Saving /prefs/module.proto ??:??:?? 19 Saving /prefs/channels.proto ??:??:?? 21 Using analog input 5 for battery level ??:??:?? 21 Using MSL altitude model ??:??:?? 21 WANT GPS=1 ??:??:?? 22 WARNING: Unable to enable NMEA Mode. ??:??:?? 22 GxGSA NOT available ??:??:?? 22 No /prefs/oem.proto preferences found ??:??:?? 23 Turning on screen ??:??:?? 23 Module wants a UI Frame ??:??:?? 23 GPS FactoryReset requested ??:??:?? 25 Saving /prefs/db.proto ??:??:?? 27 Saving /prefs/config.proto ??:??:?? 29 Saving /prefs/module.proto ??:??:?? 31 Saving /prefs/channels.proto ??:??:?? 33 Starting meshradio init... ??:??:?? 33 Set radio: region=Unset, name=LongF, config=0, ch=91, power=30 ??:??:?? 33 Radio myRegion->freqStart / myRegion->freqEnd: 902.000000 -> 928.000000 (26.000000 mhz) ??:??:?? 33 Radio myRegion->numChannels: 104 ??:??:?? 33 Radio channel_num: 91 ??:??:?? 33 Radio frequency: 924.875000 ??:??:?? 33 Slot time: 42 msec ??:??:?? 33 Set radio: final power level=22 ??:??:?? 33 SX126x init result 0 ??:??:?? 33 Frequency set to 924.875000 ??:??:?? 33 Bandwidth set to 250.000000 ??:??:?? 33 Power output set to 22 ??:??:?? 33 Current limit set to 140.000000 ??:??:?? 33 Current limit set result 0 ??:??:?? 33 SX1262 Radio init succeeded, using SX1262 radio ??:??:?? 33 (bw=250, sf=11, cr=4/8) packet symLen=8 ms, payloadSize=237, time 3188 ms ??:??:?? 33 myNodeInfo.bitrate = 74.341286 bytes / sec ??:??:?? 33 PowerFSM init, USB power=1 ??:??:?? 34 Enter state: BOOT ??:??:?? 34 [Power] Battery: usbPower=1, isCharging=1, batMv=4348, batPct=100 ??:??:?? 34 [Screen] Done with boot screen... ??:??:?? 34 [Screen] Screen: Started... ??:??:?? 34 [Screen] showing standard frames ??:??:?? 34 [Screen] Showing 0 module frames ??:??:?? 34 [Screen] Total frame count: 83 ??:??:?? 34 [Screen] Added modules. numframes: 0 ??:??:?? 34 [Screen] Finished building frames. numframes: 2 ??:??:?? 34 [Screen] Setting fast framerate ??:??:?? 34 [Screen] Setting idle framerate ??:??:?? 34 [DeviceTelemetryModule] Device Telemetry: Initializing ??:??:?? 34 [DeviceTelemetryModule] ----------------------------------------- ??:??:?? 34 [DeviceTelemetryModule] Device Telemetry: Read data ??:??:?? 35 [DeviceTelemetryModule] Telemetry->time: 34 ??:??:?? 35 [DeviceTelemetryModule] Telemetry->air_util_tx: 0.000000 ??:??:?? 35 [DeviceTelemetryModule] Telemetry->battery_level: 100 ??:??:?? 35 [DeviceTelemetryModule] Telemetry->channel_utilization: 0.000000 ??:??:?? 35 [DeviceTelemetryModule] Telemetry->voltage: 4.348000 ??:??:?? 35 [DeviceTelemetryModule] Device Telemetry: Sending packet to mesh ??:??:?? 35 [DeviceTelemetryModule] Update DB node 0x47d2c6c4, rx_time=0 ??:??:?? 35 [DeviceTelemetryModule] handleReceived(LOCAL) (id=0x6b09018c Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67) ??:??:?? 35 [DeviceTelemetryModule] No modules interested in portnum=67, src=LOCAL ??:??:?? 35 [DeviceTelemetryModule] Add packet record (id=0x6b09018c Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67) ??:??:?? 35 [DeviceTelemetryModule] Expanding short PSK #1 ??:??:?? 35 [DeviceTelemetryModule] Using AES128 key! ??:??:?? 35 [DeviceTelemetryModule] nRF52 encrypt fr=47d2c6c4, num=6b09018c, numBytes=18! ??:??:?? 35 [DeviceTelemetryModule] enqueuing for send (id=0x6b09018c Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x6e encrypted) ??:??:?? 35 [DeviceTelemetryModule] txGood=0,rxGood=0,rxBad=0 ??:??:?? 35 [DeviceTelemetryModule] Using channel 0 (hash 0x6e) ??:??:?? 35 [DeviceTelemetryModule] Expanding short PSK #1 ??:??:?? 35 [DeviceTelemetryModule] Using AES128 key! ??:??:?? 35 [DeviceTelemetryModule] nRF52 encrypt fr=47d2c6c4, num=6b09018c, numBytes=18! ??:??:?? 36 [DeviceTelemetryModule] decoded message (id=0x6b09018c Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x0 Portnum=67 priority=64) ??:??:?? 36 [DeviceTelemetryModule] updateTelemetry LOCAL ??:??:?? 36 [DeviceTelemetryModule] Node status update: 1 online, 1 total ??:??:?? 36 [DeviceTelemetryModule] showing standard frames ??:??:?? 36 [DeviceTelemetryModule] Showing 0 module frames ??:??:?? 36 [DeviceTelemetryModule] Total frame count: 83 ??:??:?? 36 [DeviceTelemetryModule] Added modules. numframes: 0 ??:??:?? 36 [DeviceTelemetryModule] Finished building frames. numframes: 2 ??:??:?? 36 [DeviceTelemetryModule] Setting fast framerate ??:??:?? 36 [Screen] Setting idle framerate ??:??:?? 37 [RadioIf] Starting low level send (id=0x6b09018c Fr0xc4 To0xff, WantAck0, HopLim3 Ch0x6e encrypted priority=64) ??:??:?? 37 [RadioIf] (bw=250, sf=11, cr=4/8) packet symLen=8 ms, payloadSize=34, time 772 ms ??:??:?? 37 [RadioIf] AirTime - Packet transmitted : 772ms ??:??:?? 37 [PowerFSM] Enter state: POWER Crashes here

porkcharsui commented 2 years ago

I am also seeing a bootloop (stuck on meshstatic logo) after calling the factory reset command using python CLI. Before the reset was issued, the device had just been reflashed and minimal settings configured: owner, Lora region, channel config.

meshtastic-python: https://github.com/meshtastic/Meshtastic-python/commit/03aab10786c293fc638f69f04769f059a333e739 firmware_version: firmware-tbeam-1.3.34.401b5d9.bin

I'm assuming that without a GPIO serial debug setup enabled, I will be unable to produce the necessary logs. Is this correct?

Douile commented 2 years ago

Also happens on my heltec 2.0 board

Douile commented 2 years ago

I believe the bricking issue (at least with CLI) is that the node doesn't report max_channels after factory reset:

Factory reset device

DEBUG file:mesh_interface.py _handleFromRadio line:492 Received from radio: my_info {
  my_node_num: 4067497268
  bitrate: 74.3412857
  message_timeout_msec: 300000
  air_util_tx: 0.0506111123
}

Freshly flashed device

DEBUG file:mesh_interface.py _handleFromRadio line:492 Received from radio: my_info {
  my_node_num: 4067497268
  max_channels: 8
  firmware_version: "1.3.38.1253abd-d"
  reboot_count: 1
  bitrate: 74.3412857
  message_timeout_msec: 300000
  min_app_version: 20300
  has_wifi: true
  channel_utilization: 1.15
  air_util_tx: 0.019166667
}

The CLI uses max_channels reported by this protobuf to determine whether the device has an appropriate version.

This could however just be a symptom as nothing on the device itself appears to have reset:

garthvh commented 2 years ago

I believe the bricking issue (at least with CLI) is that the node doesn't report max_channels after factory reset:

I have seen zero cases of bricked devices, all have been recoverable by refreshing the device. It is really hard to brick an ESP32 dev board.

Douile commented 2 years ago

By bricked I mean it can't communicate with any of the apps, flashing still works.

garthvh commented 2 years ago

By bricked I mean it can't communicate with any of the apps, flashing still works.

People get nervous when they hear bricked, generally means the hardware is done for.

garthvh commented 2 years ago

When trying to use factory reset to disable WiFi I get boot loops, it still thinks WiFi is enabled and has a region set as well

caveman99 commented 2 years ago

We've had issues with run time of garbage collection on LittleFS before, maybe it's resetting before the device completes the write cycle so it still sees the old files on bootup?

caveman99 commented 2 years ago

another approach to factory reset would be to simply loop the flash FS area and null it. Not use the FS functuions at all. LittleFS will init the flash area if it doesn't find a valid partition marker.

Douile commented 2 years ago

Waiting longer before reboot might also work. When writing #1653 I initially had at 15 second reboot to get stuff to fully clear, but then decreased to 5 and it still seemed to be working.

caveman99 commented 2 years ago

Is this still a problem with the current master build? ref. https://github.com/meshtastic/Meshtastic-device/suites/8445496242/artifacts/374421783

thebentern commented 2 years ago

meshtastic --factory-reset has been working reliably for me on the latest builds. Closing for now

ajorg commented 2 years ago

I believe I'm still having related problems, but it's hard to tell because in my T-Echo's current state, it bootloops on every version of Meshtastic I try, after calling factory reset. Since I can't get it to not bootloop I can't reset it. It still runs other firmware life SoftRF fine, but even 1.2 can't cope with the state the config is in now.

garthvh commented 2 years ago

There is a reset .uf2 file that is pinned to the #nrf52 channel on discord, try dropping that on to the drive and then connecting to it via serial (meshtastic --noproto) That will wipe all saved data from the device.

ajorg commented 2 years ago

Thanks! That did allow me to recover from this bad state.

sachaw commented 2 years ago

I think we need to either build a fix into the firmware or make the step of clearing the flash more obvious

emumanx commented 9 months ago

since using meshtastic --noproto I am stuck in a loop where the serial port sends out the "press any key to contiune factory reset" it says "Formating ..... Done!" but then on reset or rest via button it loops back to "Press any key to continue factory reset" DFU no long works via the reset button twice and its just a paper weight now. any ideas or ?