mrrwa / NmraDcc

NMRA Digital Command Control (DCC) Library
GNU Lesser General Public License v2.1
ESP8266, exception and wdt reset #65

Closed sierramike closed 1 year ago

sierramike commented 1 year ago

Hello,

I'm trying to run this library on ESP8266 which is stated to be compatible, but I get errors.

I've hooked up the DCC signal to pin D3 and uploaded the example "NmraDccMultiFunctionDecoder_1", after changing DCC_PIN to D3.

The sketch works: I see the speed and function changes displayed in the serial monitor. But as soon as I try to send a CV write from the MultiMaus, I get an exception: "ets Jan 8 2013,rst cause:2, boot mode:(3,6)".

Then I tried "NmraDccMultiFunctionMotorDecoder" example, uncommented the DEBUG defines and defined the DCC_PIN to D3 (and LED and MOTOR pins even if nothing is hooked up on the physical pins).

I uploaded this sketch and I constantly get wdt reset errors in the serial monitor, followed by the line "NMRA Dcc Multifunction Motor Decoder Demo":

    ets Jan 8 2013,rst cause:4, boot mode:(3,6)

    wdt reset
    load 0x4010f000, len 3584, room 16
    tail 0
    chksum 0xb0
    csum 0xb0
    v2843a5ac
    ~ld
    NMRA Dcc Multifunction Motor Decoder Demo

This error shows even when DCC signal is turned off.

Can someone point out why this happens?

kiwi64ajs commented 1 year ago

Hmmm… I’ve not played with this library on the ESP8266, as that support was contributed by someone else, but I know the EEPROM on this chipset is emulated using FLASH, so there must be something going wrong with that part of the library.

I’m not going to be able to look at it for some time but that is where someone needs to look.

Alex

devel-bobm commented 1 year ago

Sounds like the "Watchdog timer" feature is enabled but there isn't any code which is providing the periodic "petting" that the Watchdog timer requires in order to hold-off its reset of the device.

While I am not familiar with the ESP processors, many other microcontrollers have a "configuration-time setting" which can enable/disable the microcontroller's Watchdog timer. For example, with Microchip PIC processors, there's an additional word of configuration data which determines things like oscillator settings, "write-protect" of program memory chunks, read-protection of program memory, and "Watchdog timer" enable/disable. Perhaps all that is necessary is to figure out how to disable the ESP processor's Watchdog timer, and then configure your build/download environment to properly configure that bit...
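A minimal sketch of the "petting" idea, assuming the ESP8266 Arduino core, where (to my understanding) the software watchdog is serviced whenever loop() returns or the sketch calls delay() or yield(). The function name checksumWithYield and the chunk size are made up for illustration; the host-side yield() stand-in only counts calls so the code can be compiled off-target.

```cpp
#include <stddef.h>
#include <stdint.h>

#if defined(ARDUINO)
  #include <Arduino.h>
#else
  // Host-side stand-in (assumption, not the real core): just counts calls.
  static unsigned long g_yieldCalls = 0;
  inline void yield() { ++g_yieldCalls; }
#endif

// Hypothetical long-running job: sums a buffer, handing control back to the
// core every `chunk` bytes so the watchdog is not starved.
uint32_t checksumWithYield(const uint8_t* data, size_t len, size_t chunk = 256) {
  if (chunk == 0) chunk = 1;  // avoid division by zero
  uint32_t sum = 0;
  for (size_t i = 0; i < len; ++i) {
    sum += data[i];
    if ((i + 1) % chunk == 0) {
      yield();  // lets background tasks run between chunks
    }
  }
  return sum;
}
```

The point is only that long loops should give control back regularly; a wdt reset can also have unrelated causes, such as a wiring or pin configuration mistake.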

sierramike commented 1 year ago

Thanks for pointing me to the EEPROM. I dug into the library's code, did some debugging and found out that the crash occurs on the EEPROM commit call. Googling this, I found that an interrupt occurring while writing to flash may crash the device.

So I modified the library this way:

    void writeEEPROM (unsigned int CV, uint8_t Value) {
    #if defined(ESP8266) || defined(ESP32) || defined(ARDUINO_ARCH_RP2040)
      noInterrupts();
    #endif
      EEPROM.write (CV, Value);
    #if defined(ESP8266) || defined(ESP32) || defined(ARDUINO_ARCH_RP2040)
      EEPROM.commit();
      interrupts();
    #endif
    }

I'm a bit too lazy to become a contributor and post this fix, and I don't have an ESP32 or RP2040 to test on, but my Google searches pointed to this same interrupt/flash issue on ESP32, so I'm sure the fix applies to both ESP8266 and ESP32. Not sure regarding RP2040 ...
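The same idea can be sketched with a small scope guard, so that interrupts are re-enabled on every exit path. This is only an illustration: InterruptLock, writeCvGuarded and FakeStore are hypothetical names, not part of NmraDcc, and the approach is untested on ESP32 and RP2040, as noted above. Off-target, noInterrupts() and interrupts() are replaced by counting stand-ins.

```cpp
#include <stdint.h>

#if defined(ARDUINO)
  #include <Arduino.h>
#else
  // Host-side stand-ins (assumption): track how deep interrupts are "off".
  static int g_irqOffDepth = 0;
  inline void noInterrupts() { ++g_irqOffDepth; }
  inline void interrupts()   { --g_irqOffDepth; }

  // Fake EEPROM-like store that records the state seen during commit().
  struct FakeStore {
    int commits = 0;
    int depthAtCommit = -1;
    uint8_t last = 0;
    void write(int, uint8_t v) { last = v; }
    bool commit() { depthAtCommit = g_irqOffDepth; ++commits; return true; }
  };
#endif

// Disables interrupts on construction and re-enables them on destruction.
// Note: it re-enables unconditionally, so it must not be nested.
class InterruptLock {
public:
  InterruptLock()  { noInterrupts(); }
  ~InterruptLock() { interrupts(); }
  InterruptLock(const InterruptLock&) = delete;
  InterruptLock& operator=(const InterruptLock&) = delete;
};

// Writes one CV and commits it with interrupts held off for the duration.
// Store is anything with write(address, value) and commit().
template <typename Store>
void writeCvGuarded(Store& store, unsigned int cv, uint8_t value) {
  InterruptLock lock;
  store.write(static_cast<int>(cv), value);
  store.commit();
}  // lock goes out of scope here and interrupts are re-enabled
```

On an ESP8266 build this would presumably be called as writeCvGuarded(EEPROM, CV, Value) after including EEPROM.h.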

sierramike commented 1 year ago

As a side note, the wdt reset issue was caused by my mistakenly setting an LED pin to the same pin as the DCC signal input ... Sorry for that. But the second crash still occurred on both sketches and was linked to the flash write.
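This kind of pin clash can be caught at compile time. A minimal sketch, assuming C++14 or later; apart from DCC_PIN, the pin names and all the numbers below are placeholders, not the values used by the example sketches.

```cpp
#include <cstddef>

// Returns true when no value appears twice in the array.
template <std::size_t N>
constexpr bool allPinsDistinct(const int (&pins)[N]) {
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = i + 1; j < N; ++j) {
      if (pins[i] == pins[j]) return false;
    }
  }
  return true;
}

// Placeholder pin assignments for illustration only.
constexpr int DCC_PIN       = 0;
constexpr int LED_PIN       = 2;
constexpr int MOTOR_PWM_PIN = 5;
constexpr int MOTOR_DIR_PIN = 4;

constexpr int kUsedPins[] = {DCC_PIN, LED_PIN, MOTOR_PWM_PIN, MOTOR_DIR_PIN};

// The build fails here if two functions were given the same pin.
static_assert(allPinsDistinct(kUsedPins), "Two functions share the same pin");
```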

kiwi64ajs commented 1 year ago

We really need to refactor the whole CV storage mechanism as things have moved on a lot since the initial AVR implementation.

I’d like to shift it out to a separate CV-storage class that has a default EEPROM implementation but also allows for a FLASH-based implementation with local buffering of CV values in RAM, so you could buffer changes and then do a final commit to FLASH in bulk. It has a few downsides but should be more manageable and efficient than writing to a FLASH segment for every CV byte value, as that has a bit of an impact on the device.
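One way such a refactor could look, purely as a sketch: CvStorage, FlashBackend, BufferedCvStorage and FakeFlash are hypothetical names and do not exist in NmraDcc. Writes only touch a RAM cache, and the backend is hit once per commit().

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical storage interface the decoder core would talk to.
class CvStorage {
public:
  virtual ~CvStorage() = default;
  virtual uint8_t read(uint16_t cv) const = 0;
  virtual void write(uint16_t cv, uint8_t value) = 0;
  virtual bool commit() = 0;  // flush pending changes; true on success
};

// Hypothetical non-volatile backend (FLASH, FRAM, ...).
class FlashBackend {
public:
  virtual ~FlashBackend() = default;
  virtual void load(uint8_t* dst, std::size_t len) = 0;
  virtual bool store(const uint8_t* src, std::size_t len) = 0;
};

// Keeps all N CVs in RAM and writes them out in bulk on commit().
template <std::size_t N>
class BufferedCvStorage : public CvStorage {
public:
  explicit BufferedCvStorage(FlashBackend& backend) : backend_(backend) {
    backend_.load(cache_.data(), cache_.size());
  }
  uint8_t read(uint16_t cv) const override { return cv < N ? cache_[cv] : 0; }
  void write(uint16_t cv, uint8_t value) override {
    if (cv >= N || cache_[cv] == value) return;  // nothing to do
    cache_[cv] = value;
    dirty_ = true;
  }
  bool commit() override {
    if (!dirty_) return true;  // skip the write when nothing changed
    if (!backend_.store(cache_.data(), cache_.size())) return false;
    dirty_ = false;
    return true;
  }
  bool dirty() const { return dirty_; }

private:
  FlashBackend& backend_;
  std::array<uint8_t, N> cache_{};
  bool dirty_ = false;
};

// In-memory fake backend for trying the class off-target.
class FakeFlash : public FlashBackend {
public:
  FakeFlash() { image_.fill(0xFF); }  // erased flash reads as 0xFF
  void load(uint8_t* dst, std::size_t len) override {
    std::copy_n(image_.data(), std::min(len, image_.size()), dst);
  }
  bool store(const uint8_t* src, std::size_t len) override {
    std::copy_n(src, std::min(len, image_.size()), image_.data());
    ++storeCount;
    return true;
  }
  int storeCount = 0;

private:
  std::array<uint8_t, 64> image_{};
};
```

One of the downsides of this design is that changes made since the last commit() are lost on power failure, so the caller has to decide when to flush.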

With more and more devices NOT having EEPROM, it is probably safer to assume that on-chip EEPROM isn’t available and to anticipate using some other external storage like FRAM etc.

Alex

kiwi64ajs commented 1 year ago

As a side note, the wdt reset issue was caused by myself wrongly setting a LED pin to the same pin as the DCC signal input ... Sorry for that. But the second crash still occurred on both sketches and was linked to the flash write.

@sierramike Did your modification to the library, adding noInterrupts(); before the EEPROM.write + EEPROM.commit(); and interrupts(); at the end, resolve the crash?

kiwi64ajs commented 1 year ago

Awaiting a response from @sierramike, as the changes are working in my trials.