malariaspot / microspot-fw

GRBL adaption for the MicroSpot automated microscope

Firmware refuses to boot after a random amount of time #6

Open gvJaime opened 2 years ago

gvJaime commented 2 years ago

Randomly, the firmware will refuse to boot, rendering the device unusable until a re-flash is done.

Flash dump indicates that the firmware is still in flash, unmodified and uncorrupted.

The T_RST pin is HIGH, indicating that the ATMega is not being held in reset.

gvJaime commented 2 years ago

M_TX is low. When Reset is pressed, M_TX stays low.

gvJaime commented 2 years ago

This suggests that the TMC_init routine is what is failing.

Why the routine goes from consistently succeeding to consistently failing, and why it recovers after a reflash, is a complete mystery.

gvJaime commented 2 years ago

Image: motor configuration after power-on

This is an image of M_TX during the initialization routine for the 4 TMCs. Whether the routine succeeded or failed is not yet confirmed.

gvJaime commented 2 years ago

Another theory is that the USART used for the TMCs is never enabled.

gvJaime commented 2 years ago

Screenshot 2022-04-06 at 14 23 47

This line sets TX to HIGH at the start of TMC_init(). So if M_TX is low, the ATMega4809 presumably never even reaches that point.

So it's back to the no-boot hypothesis.

gvJaime commented 2 years ago

https://www.avrfreaks.net/forum/atmega4809-bootloader-starting-application

gvJaime commented 2 years ago

http://ww1.microchip.com/downloads/en/AppNotes/AN2634-Bootloader-for-tinyAVR-and-megaAVR-00002634C.pdf

gvJaime commented 2 years ago

/*
 * Boot access request function
 */
static inline bool is_bootloader_requested(void)
{
  /* Check for boot request from firmware */
  if (USERROW.USERROW31 == 0xEB) {
    /* Clear boot request*/
    USERROW.USERROW31 = 0xff;
    _PROTECTED_WRITE_SPM(NVMCTRL.CTRLA, NVMCTRL_CMD_PAGEERASEWRITE_gc);
    while(NVMCTRL.STATUS & NVMCTRL_EEBUSY_bm);

    return true;
  }
  return false;
}

This function in the bootloader diverts execution into upload mode. It depends on USERROW31: if the value is 0xEB, the microcontroller enters upload mode and execution never proceeds to the application. USERROW31 does get cleared immediately afterwards, but if for some reason EEPROM writes became locked, the clear would not take effect and the processor would end up stuck in upload mode on every reset.
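To make the reasoning concrete, here is a small host-side simulation of that decision logic. It is only a sketch: `struct sim_chip`, the `nvm_locked` flag, and all `sim_*` names are hypothetical stand-ins invented for illustration, and the idea that a locked NVM makes the write fail silently is exactly the unconfirmed assumption of this hypothesis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_BOOT_REQUEST 0xEB  /* magic value checked by the bootloader */
#define SIM_CLEARED      0xFF  /* value written to clear the request */

/* Hypothetical stand-in for the chip state that matters here. */
struct sim_chip {
    uint8_t userrow31;   /* models USERROW.USERROW31 */
    bool    nvm_locked;  /* assumption: when true, NVM writes silently fail */
};

/* Models the page erase/write: it only takes effect if NVM is writable. */
static void sim_nvm_write(struct sim_chip *chip, uint8_t value)
{
    if (!chip->nvm_locked) {
        chip->userrow31 = value;
    }
}

/* Same decision as is_bootloader_requested(), run on the simulated state. */
static bool sim_is_bootloader_requested(struct sim_chip *chip)
{
    assert(chip != NULL);
    if (chip->userrow31 == SIM_BOOT_REQUEST) {
        sim_nvm_write(chip, SIM_CLEARED);  /* try to clear the request */
        return true;
    }
    return false;
}

/* Counts how many of `resets` consecutive resets end up in upload mode. */
static int sim_count_bootloader_entries(struct sim_chip *chip, int resets)
{
    int entries = 0;
    for (int i = 0; i < resets; i++) {
        if (sim_is_bootloader_requested(chip)) {
            entries++;
        }
    }
    return entries;
}
```

With a writable NVM, a pending request sends only the first reset into upload mode. With `nvm_locked` set, every reset does, which would match a device that stays dead no matter how often Reset is pressed.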

Hypothesis test

To test this hypothesis, the following command needs to be run on a capable machine:

avrdude -c atmelice_updi -P usb -p atmega4809 -U eeprom:r:filename.hex:i

Then convert that .hex into .bin, locate the USERROW31 byte, and check whether it is 0xEB. That would confirm the theory.
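The .bin conversion can be skipped by reading the byte straight out of the Intel HEX text. Below is a minimal sketch of that check; `hex_byte`, `ihex_byte_at`, and `boot_request_pending` are helper names made up for this example. It assumes the dump starts at offset 0 of the user row, so USERROW31 sits at offset 31, and it only handles data and end-of-file records without verifying checksums. One caveat: as far as I know, on the ATmega4809 the user row is a separate region from EEPROM, and avrdude exposes it as `usersig` (`userrow` in newer releases), so the dump may need to target that memory instead of `eeprom`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define USERROW31_OFFSET   31u   /* assumption: dump starts at user row offset 0 */
#define BOOT_REQUEST_MAGIC 0xEB  /* value checked by is_bootloader_requested() */

/* Parses two hex digits at s; returns 0..255, or -1 on a non-hex character. */
static int hex_byte(const char *s)
{
    int value = 0;
    for (int i = 0; i < 2; i++) {
        char c = s[i];
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;  /* also stops safely at the terminating NUL */
        value = value * 16 + digit;
    }
    return value;
}

/*
 * Scans Intel HEX text for the data byte stored at address `target`.
 * Handles record types 00 (data) and 01 (end of file) only; checksums
 * are not verified. Returns the byte (0..255), or -1 if it is not found.
 */
static int ihex_byte_at(const char *hex_text, unsigned target)
{
    assert(hex_text != NULL);
    const char *p = hex_text;
    while ((p = strchr(p, ':')) != NULL) {
        p++;
        /* Record header: length, address high, address low, type. */
        int len = hex_byte(p);
        if (len < 0) return -1;
        int hi = hex_byte(p + 2);
        if (hi < 0) return -1;
        int lo = hex_byte(p + 4);
        if (lo < 0) return -1;
        int type = hex_byte(p + 6);
        if (type < 0) return -1;
        if (type == 0x01) break;  /* end-of-file record */

        unsigned addr = ((unsigned)hi << 8) | (unsigned)lo;
        if (type == 0x00 && target >= addr && target < addr + (unsigned)len) {
            /* Walk the data bytes one by one so a truncated record is caught. */
            int value = -1;
            for (unsigned i = 0; i <= target - addr; i++) {
                value = hex_byte(p + 8 + 2 * i);
                if (value < 0) return -1;
            }
            return value;
        }
        p += 8;  /* skip the header; strchr finds the next record */
    }
    return -1;
}

/* Returns 1 if USERROW31 holds the boot request, 0 if not, -1 if not in the dump. */
static int boot_request_pending(const char *hex_text)
{
    int value = ihex_byte_at(hex_text, USERROW31_OFFSET);
    if (value < 0) return -1;
    return value == BOOT_REQUEST_MAGIC;
}
```

To use it, read filename.hex into a NUL-terminated buffer and pass it to `boot_request_pending()`. A result of 1 would support the theory, and 0 would rule it out.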