Open matthiasbock opened 6 years ago
Might be a CANtact problem: Frame reception can become non-functional during the flash write.
A Vector CAN adapter on the same bus received frames with ID 0x081 (bootloader response datagrams) that the bootloader client using the CANtact did not receive. This means that datagram transmission continues to work while frame reception on the CANtact stops.
Commit b918b949b43862ffe3222df910c634ba85ccea2b introduces a couple of enhancements, which hopefully solve this issue or at least make it occur less often.
Since commit 1c45329142ba459a81f4169ad257587b3eb184e2, "target flash area not erased properly" errors are fatal; see also issue #10. Even with commit 0238a718704229933317e30673bccd5dfe052687 they still occur sometimes, but they now cause the flash procedure to abort immediately, since a procedure that hits this error is known to always fail checksum verification.
TODO: After erasing flash memory, the bootloader should verify that no data is left in the erased area.
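The abort-on-fatal-error behaviour described above could look roughly like the following sketch. It is not the client's actual code: `FatalFlashError`, `write_all_pages`, the `write_page` callback and the convention that 0 means success are all hypothetical. Only error code 24 and its message are taken from the client output in this issue.

```python
# Hypothetical sketch of "abort immediately on a fatal write error".
# Only code 24 and its text come from the log in this issue; every name
# and the "0 means success" convention are assumptions for illustration.
FATAL_ERROR_CODES = {24: "target flash area not erased properly"}


class FatalFlashError(RuntimeError):
    """Raised when a board reports an error that makes the flash run pointless."""


def write_all_pages(pages, write_page):
    """Write pages in order and stop at the first fatal error.

    write_page(index, page) is assumed to return the board's error code.
    Returns the number of bytes written if no fatal error occurred.
    """
    written = 0
    for index, page in enumerate(pages):
        error_code = write_page(index, page)
        if error_code in FATAL_ERROR_CODES:
            reason = FATAL_ERROR_CODES[error_code]
            raise FatalFlashError(f"page {index}: error {error_code} ({reason})")
        written += len(page)
    return written


def fake_write_page(index, page):
    """Stand-in for the real transport: the third page fails with error 24."""
    return 24 if index == 2 else 0


pages = [bytes(2048)] * 4
try:
    write_all_pages(pages, fake_write_page)
    aborted_message = None
except FatalFlashError as exc:
    aborted_message = str(exc)
```

Raising an exception, rather than logging and continuing, keeps the client from running the checksum verification on a run that is already known to be broken.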
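A minimal sketch of such a blank check, written in Python only to show the logic (the real check would run on the target, in the bootloader's own language). It assumes that erased flash reads back as 0xFF, which holds for typical NOR flash but should be confirmed for the actual part. The function names and the 2048-byte page size are made up for the example.

```python
# Assumption: an erased flash cell reads back as 0xFF (typical for NOR flash).
ERASED_BYTE = 0xFF


def first_unerased_offset(region, erased_byte=ERASED_BYTE):
    """Return the offset of the first byte that is not erased, or None."""
    for offset, value in enumerate(region):
        if value != erased_byte:
            return offset
    return None


def is_erased(region, erased_byte=ERASED_BYTE):
    """True if every byte in the region has the erased value."""
    return first_unerased_offset(region, erased_byte) is None


# Hypothetical 2048-byte page: one clean, one with a leftover byte.
clean_page = bytes([ERASED_BYTE]) * 2048
dirty_page = bytearray(clean_page)
dirty_page[100] = 0x00

clean_ok = is_erased(clean_page)
dirty_offset = first_unerased_offset(dirty_page)
```

Reporting the first offending offset instead of a plain yes/no makes it easier to see whether leftovers cluster at page boundaries or appear at random positions.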
The problem persists with flash erase verification.
Client exit upon fatal error was disabled. Output of datagram error messages was disabled.
bootloader_flash -i can0 1 -c "Test" -f test.elf
Flashing firmware, size: 78380 bytes
Erasing pages...
100% (78380 of 78380) |############################################################################################################################| Elapsed Time: 0:00:03 Time: 0:00:03
Writing pages...
65% (51200 of 78380) |################################################################################# | Elapsed Time: 0:00:25 ETA: 0:00:13
ERROR:root:Board 1 reports error 24 (target flash area not erased properly)
CRITICAL:root:The following board failed to write flash pages: 1
100% (78380 of 78380) |############################################################################################################################| Elapsed Time: 0:00:35 Time: 0:00:35
WARNING:root:Errors occured, the flash procedure might have failed on some destinations.
Updating bootloader configuration page...
Updated.
Verifying firmware...
Expecting checksum: 0xc2f301b0
Node 1 reports checksum: 0xec8da51
Verification failed for nodes 1
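The verification step at the end of this log compares an expected checksum against the value each node reports. A rough sketch of that comparison follows, assuming the checksum is a standard CRC-32 (the log does not say which algorithm is used) and using hypothetical helper names. The two checksum values are the ones from the log above.

```python
import zlib


def firmware_checksum(image):
    """CRC-32 of the firmware image (assumed algorithm, not confirmed here)."""
    return zlib.crc32(image) & 0xFFFFFFFF


def failed_nodes(expected, reported):
    """Return {node_id: checksum} for every node whose checksum differs."""
    return {node: crc for node, crc in reported.items() if crc != expected}


# Values taken from the log: node 1 reports a different checksum.
expected = 0xC2F301B0
reported = {1: 0x0EC8DA51}
failed = failed_nodes(expected, reported)

# Zero-padded formatting makes the two values easier to compare by eye.
summary = ", ".join(f"node {n}: 0x{crc:08x}" for n, crc in failed.items())
```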
The Python client packaged from commit 75df0c67a0e76f2980b65950be89b451310db09e shows erroneous behaviour when attempting to flash a node on a CAN bus where other nodes are communicating a lot (about one third-party frame every 50ms).
The occurrence of datagram errors is probably understandable when there is a lot of chatter on the bus. But the occurrence of the "target flash area not erased properly" error is surprising, since the flash erase phase doesn't seem to be subject to transmission errors. Such errors are in general unlikely there because the corresponding erase command datagrams are rather small. Here, they apparently have also all been acknowledged by a 'success' response.
The most likely explanations for this problem are:
The datagram errors might have to be tackled as well though.