loathingKernel / ariadne-bootloader

A little less unfinished TFTP bootloader for Arduino Ethernet or Arduino with Ethernet Shield

Bootloader timeout duration different from what is stated in the documentation #8

Closed: per1234 closed this issue 8 years ago

per1234 commented 8 years ago

README.md says:

Now to actually upload the binary file all you have to do is reset the board and in the next 5 seconds run the following command.

but the bootloader timeout is actually 20 seconds. Was it intentional to make the bootloader timeout so long? 20 seconds seems a bit longer than necessary.

loathingKernel commented 8 years ago

When I wrote the README.md, the timeout was indeed about 5 seconds, but as more timer resets were introduced to the serial and tftp subsystems as well as the wdt, this became a mess of race conditions between the timers. So the timer repeats were increased, without documenting it, to stop the timers from interfering with each other and to give the user enough time to upload a new sketch, because there was no support for the EEPROM signatures. Hence the confusion. I believe there is a better way to do this whole thing, but it involves other things too.

I will document them here for safekeeping. They are not directly related, but they interact in some ways or in corner cases.

First of all, I would like to remove many of the values stored in EEPROM, effectively leaving only the network settings (these can be removed as well, maybe, but programs in the PROGMEM cannot write the bootloader sections, and having the bootloader rewrite parts of itself can be tricky and not straightforward, or even impossible). The version of the bootloader should be in the .text section of the binary, much like how optiboot does it. The correct upload signatures should be removed and better image validation should be implemented instead. This consists of two parts: first, to support validation for all supported boards, and second, if the upload fails, to zero the first page of the PROGMEM to prevent any application from running (I think this can be done by a sketch in the PROGMEM too, so resetting for reprogramming will continue to work). I am not sure all of this is accurate and should investigate.
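One possible shape for the image validation idea, as a rough sketch and not something Ariadne is confirmed to do: check that the first flash word of the uploaded image is a jump instruction, which is what avr-gcc normally places at the reset vector. The function names are made up for illustration, and the check assumes a raw little-endian binary image. A zeroed (0x0000) or erased (0xFFFF) first word fails the check, so a zeroed first page would be treated as "no valid application".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the opcode is an AVR RJMP: 1100 kkkk kkkk kkkk. */
bool is_rjmp(uint16_t op)
{
    return (op & 0xF000u) == 0xC000u;
}

/* True if the opcode is the first word of an AVR JMP: 1001 010k kkkk 110k. */
bool is_jmp(uint16_t op)
{
    return (op & 0xFE0Eu) == 0x940Cu;
}

/* Hypothetical check: does the image start with a jump at the reset vector?
 * `image` is the raw binary as it would be written to flash address 0. */
bool image_looks_bootable(const uint8_t *image, size_t len)
{
    if (image == NULL || len < 2) {
        return false;
    }
    /* AVR opcodes are stored little-endian in the binary. */
    uint16_t first = (uint16_t)(image[0] | ((uint16_t)image[1] << 8));
    return is_jmp(first) || is_rjmp(first);
}

/* Host-side self-check of the sketch. */
void image_check_self_test(void)
{
    const uint8_t zeroed[] = {0x00, 0x00};
    assert(!image_looks_bootable(zeroed, sizeof zeroed));
}
```

A real validator would likely need more than this (for example, checking every vector, or per-board rules), which is the "all supported boards" part of the problem.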

We should also add support for detecting the cause of the reset. If it was from the watchdog, the reset should be fast; if we want to reprogram, the application won't boot anyways. If it was a hardware reset, it should take longer, imho, because forcing the user to interact fast is not nice. 20 seconds is excessive, but I believe 10 would be ok. This is subjective though so I might be wrong.
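A minimal sketch of the reset-cause idea, assuming the bootloader reads the MCUSR register early at startup and passes the value in. The bit positions match the ATmega328P (on a real target they come from `<avr/io.h>`); they are redefined here only so the sketch compiles on a host. The function name and the 1-second "fast" value are placeholders, and treating power-on and brown-out resets like a hardware reset is an assumption, not something decided in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* MCUSR reset flag bit positions (ATmega328P layout). */
#define SK_PORF  0 /* power-on reset */
#define SK_EXTRF 1 /* external reset (reset pin/button) */
#define SK_BORF  2 /* brown-out reset */
#define SK_WDRF  3 /* watchdog reset */

#define SK_TIMEOUT_FAST_S 1u  /* placeholder for "fast" */
#define SK_TIMEOUT_SLOW_S 10u /* the 10 seconds proposed above */

/* Pick the bootloader wait time from a saved copy of MCUSR. */
uint8_t bootloader_timeout_s(uint8_t mcusr)
{
    /* A hardware reset wins even if an old watchdog flag is still set,
     * because MCUSR flags accumulate until software clears them. */
    if (mcusr & (1u << SK_EXTRF)) {
        return SK_TIMEOUT_SLOW_S;
    }
    if (mcusr & (1u << SK_WDRF)) {
        return SK_TIMEOUT_FAST_S;
    }
    /* Assumption: power-on, brown-out, or unknown causes get the long wait. */
    return SK_TIMEOUT_SLOW_S;
}

/* Host-side self-check of the sketch. */
void timeout_self_test(void)
{
    assert(bootloader_timeout_s(1u << SK_WDRF) == SK_TIMEOUT_FAST_S);
    assert(bootloader_timeout_s(1u << SK_EXTRF) == SK_TIMEOUT_SLOW_S);
}
```

On hardware the caller would copy MCUSR into a variable and then clear the register, so that the next reset reports only its own cause.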

Right now we have a wdt for the whole bootloader in case it hangs, and a timer for the tftp timeout and the led. As you said, this is the core cause of the weird timeout behaviour. Using only the watchdog is less than optimal, because in case of a timeout or an incomplete transfer, tftp has the ability to request that a packet be sent again, so we do not need to restart the transfer. On the other hand, the serial part relies heavily on the wdt, because the bootloaders that we are using relied on it. I am not sure how we should fix that, and this has been confusing me for some time now because I have been unable to find a solution that I am going to be happy with. Any ideas or a different perspective are, as always, welcome.

per1234 commented 8 years ago

I would like to remove many of the values stored in EEPROM, effectively leaving only the network settings

I agree, better to leave as much of the EEPROM still available to the user as possible.

The version of the bootloader should be in the .text section of the binary, much like how optiboot does it.

I've had that on my to do list but haven't taken more than a quick look at the optiboot implementation yet.

20 seconds is excessive, but I believe 10 would be ok. This is subjective though so I might be wrong.

I agree that 10 seconds is reasonable. Some people are going to be sending the reset command and then the TFTP command manually so they do need a little time. I'm not sure what kind of delay may be necessary to allow using Ariadne over the internet over long distances but it seems that 10 seconds should cover it.

I had the idea of a feature where, if EEPROM_IMG_STAT was set to a certain value, Ariadne would write EEPROM_IMG_OK_VALUE and then immediately exit to the user application. This would allow the user to enable the bootloader only when the reset-to-upload command was received in the user code. However, if I understand you correctly, you're considering removing EEPROM_IMG_STAT and it might not be worth dedicating a byte of EEPROM just for this feature.
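To make the proposal concrete, here is a rough host-side sketch of that check. The EEPROM is simulated with a plain array; the address and byte values are placeholders chosen for illustration, and `EEPROM_IMG_DIRECT_VALUE` and the function name are hypothetical (only `EEPROM_IMG_STAT` and `EEPROM_IMG_OK_VALUE` are names used in this thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EEPROM_IMG_STAT         2u    /* placeholder address */
#define EEPROM_IMG_OK_VALUE     0xEEu /* placeholder value */
#define EEPROM_IMG_DIRECT_VALUE 0xDDu /* hypothetical "boot directly" marker */

/* Returns true if the bootloader should jump straight to the application.
 * When the marker is found it is replaced with EEPROM_IMG_OK_VALUE, as the
 * proposal describes, so the shortcut applies to one reset only. */
bool direct_boot_requested(uint8_t *eeprom)
{
    if (eeprom[EEPROM_IMG_STAT] == EEPROM_IMG_DIRECT_VALUE) {
        eeprom[EEPROM_IMG_STAT] = EEPROM_IMG_OK_VALUE;
        return true;
    }
    return false;
}

/* Host-side self-check of the sketch. */
void direct_boot_self_test(void)
{
    uint8_t eeprom[8] = {0};
    eeprom[EEPROM_IMG_STAT] = EEPROM_IMG_DIRECT_VALUE;
    assert(direct_boot_requested(eeprom));
    assert(eeprom[EEPROM_IMG_STAT] == EEPROM_IMG_OK_VALUE);
}
```

On the real target the array accesses would be EEPROM reads and writes, which is the code-size cost that matters for a bootloader.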

loathingKernel commented 8 years ago

I'm not sure what kind of delay may be necessary to allow using Ariadne over the internet over long distances but it seems that 10 seconds should cover it.

If they are uploading through a Marconi telegram hitting every bit by hand, well, let's just tell them that it is not supported. Ten seconds should be more than enough and let's disregard the nonsense I was thinking when I made it 20 seconds.

However, if I understand you correctly, you're considering removing EEPROM_IMG_STAT and it might not be worth dedicating a byte of EEPROM just for this feature.

It is not so much about conserving space in the EEPROM as about saving space in the bootloader (the code size of the checks alone is enough for implementing IntelHEX to BIN conversion, for example). Also, it makes things less complicated and more transparent to the user. I don't believe we need an extra value to indicate direct boot. The following should be enough:

The 'just wait' part is going to happen anyways if there is no valid program so it pretty much boils down to

(If my assumptions in the previous post are correct)