avrdudes / avrdude

AVRDUDE is a utility to program AVR microcontrollers
GNU General Public License v2.0

Some input files are written incompletely to flash #918

Closed stefanrueger closed 2 years ago

stefanrueger commented 2 years ago

Avrdude does not necessarily write flash correctly when the input file ends in a series of one or more 0xff bytes. This can happen because avrdude's fileio() unconditionally asks avr_mem_hiaddr() to remove the 0xff bytes at the end of the input file, even when auto-erase is disabled with -D or when using a programmer that cannot perform a chip erase, e.g., upload via the popular Arduino optiboot bootloader. avr_mem_hiaddr() also incorrectly forces input for flash to always contain an even number of bytes; that is normally less of a problem, though it can be one.
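
For illustration, here is a minimal sketch (not the actual avrdude source) of the kind of trimming avr_mem_hiaddr() effectively applies to the input; everything above the last non-0xff byte silently drops out of the write:

/* Sketch only: the real avr_mem_hiaddr() works on avrdude's AVRMEM
   structure, but the effect on the write range is the same.        */
static int trimmed_len(const unsigned char *buf, int len) {
  while (len > 0 && buf[len - 1] == 0xff)  /* walk down past trailing 0xff */
    len--;
  return len;          /* only bytes below this point get written */
}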

This sketch-ending-in-ff.hex for an ATmega328P has 2185 bytes, but only 2160 bytes are reported to have been written; in practice it'll probably be 17 pages, i.e., 2176 bytes, still falling short of 9 vital bytes:

$ avr-size sketch-ending-in-ff.hex 
   text    data     bss     dec     hex filename
      0    2185       0    2185     889 sketch-ending-in-ff.hex

$ avrdude -Dqp atmega328p -c arduino -b 115200 -P /dev/ttyUSB0 -U flash:w:sketch-ending-in-ff.hex:i

avrdude: AVR device initialized and ready to accept instructions
avrdude: Device signature = 0x1e950f (probably m328p)
avrdude: reading input file "sketch-ending-in-ff.hex"
avrdude: writing flash (2160 bytes):
avrdude: 2160 bytes of flash written
avrdude: verifying flash memory against sketch-ending-in-ff.hex:
avrdude: 2160 bytes of flash verified

avrdude done.  Thank you.
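
For reference, the arithmetic behind those numbers, assuming the ATmega328P's 128-byte flash page size:

2185 - 2160 = 25 trailing 0xff bytes trimmed by avrdude
ceil(2160 / 128) = 17 pages actually touched, i.e. at most 17 * 128 = 2176 bytes
2185 - 2176 = 9 bytes fall into an 18th page that is never written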

Correcting this bug would be easy --- simply condition the line rc = avr_mem_hiaddr(mem); in fileio.c with if(chiperased) --- were it not for the fact that it's kinda hard for fileio() to figure out whether or not the chip will have been erased before programming commences.

Thoughts? Global variable? Remove the haircut in fileio.c altogether?
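
For illustration, a minimal sketch of what that conditioning could look like in fileio.c, where chiperased is a hypothetical flag that fileio() currently has no way of obtaining (which is exactly the problem):

/* Sketch only: "chiperased" does not exist in avrdude; fileio() cannot
   currently tell whether a chip erase will actually be carried out.  */
if (chiperased)               /* flash known to be blank: trimming is safe */
  rc = avr_mem_hiaddr(mem);   /* drop trailing 0xff from the write range   */
/* otherwise leave rc at the number of bytes read from the input file */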

BTW, the sketch was created using avr-gcc 5.4.0 (tight code) and the following source

#include "config.h"

const unsigned char pff[25] __attribute__((section (".progdata"))) = {
  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff, 0xff, 0xff, 0xff,
  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff,  0xff, 
};

int main() { 
  boardInit(57600, PSTR("test...\n"));
  printf_P(PSTR("Important table beg: 0x%02x\n"), pgm_read_byte(pff));
  printf_P(PSTR("Important table end: 0x%02x\n"), pgm_read_byte(pff+sizeof pff-1));
}

It can be helpful to store data at the top of a large flash memory, particularly when PROGMEM data threaten to flow beyond the bottom 64 kB space. In this case PROGMEM data are not guaranteed to be reachable by *_P() functions. Not all *_P() functions have a *_PF() equivalent in the avr-libc library (which can reach beyond 64 kB). Hence, it makes sense to put all those data at the top of the sketch that only need be processed by *_PF() functions. This leaves (more of) the remaining PROGMEM data in reach of *_P() functions.
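
For illustration, a minimal sketch of that far-flash pattern, assuming a part with more than 64 kB of flash, a recent avr-libc, and a section name (.fardata) plus linker placement that are made up for this example:

/* Sketch only: .fardata and its placement near the top of flash are
   assumptions; pgm_get_far_address() and pgm_read_byte_far() are the
   avr-libc facilities that reach beyond the 64 kB boundary, unlike
   pgm_read_byte() and the other *_P() functions.                    */
#include <avr/pgmspace.h>

const unsigned char table[4] __attribute__((section(".fardata"))) =
  { 0x11, 0x22, 0x33, 0x44 };

unsigned char read_table(unsigned char i) {
  return pgm_read_byte_far(pgm_get_far_address(table) + i);
}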

dl8dtl commented 2 years ago

We've had this discussion before, and I still disagree. If a bootloader is asked for a chip erase, it's obvious it will never actually erase the entire chip, but it's still the obligation of the bootloader to behave as if it had performed a chip erase. It's not really important when it is going to perform the actual (page) erase, but the observable effect must be the same as a chip erase, modulo the bootloader itself, obviously. The background is simple: NOR flash can never write bits from 0 to 1. That transition can only ever happen by an erase operation. Thus, writing cells to 0xff(ff) is always a no-op and can be omitted. (For EEPROM, the AVR sometimes transparently invokes a prior erase operation, but even that's not always automatic, so there are situations where attempts to write 1 bits are just ignored by the backend tool.)
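
The "can only clear bits" behaviour boils down to one line: without an erase, a programmed cell ends up as the bitwise AND of its old content and the written data, which is why writing 0xff never changes anything:

/* Sketch only: effect of programming a NOR flash cell without a prior erase */
unsigned char after_write(unsigned char old_cell, unsigned char data) {
  return old_cell & data;   /* bits can go 1 -> 0, never 0 -> 1 */
}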

stefanrueger commented 2 years ago

What about -D?

The problem is that fileio() thinks the chip is erased, but it isn't necessarily so, because the user (for whatever reason) has asked avrdude not to. A global variable and adding one line of code would help everyone.

dl8dtl commented 2 years ago

For -D it's simply the user's responsibility not to attempt to write anything to non-erased cells. This option makes sense, e.g., to first write a bootloader (without -D, with the usual default erase semantics) and subsequently a first application, where -D ensures the bootloader is not erased again. But that's really not about AVRDUDE; it's about how NOR flash works.

stefanrueger commented 2 years ago

The given real example is an input file that requires 18 pages to be programmed, but only 17 pages get written.

How is this not a bug?

In my view the workings of NOR memory are a red herring.

dl8dtl commented 2 years ago

The programmer does everything it can:

It doesn't. There is no "page erase" ever issued by AVRDUDE, only a chip erase. (Because page erases haven't been part of the Atmel tool command set until recently.) It's the programmer's responsibility to behave as if it had erased the chip. If it doesn't, it's faulty. You simply cannot program any 1-bit. Thus, blocks of 1-bits can be omitted when writing; they would never change any memory content.

stefanrueger commented 2 years ago

The crux of the matter is that avrdude does not send all of the input file to the programmer.

If that were correct behaviour, then out of necessity the programmer would need to erase all of the remaining flash every time, just in case the input file had trailing 0xff bytes that avrdude decided to withhold.

To the best of my knowledge, it is not common practice for programmers to erase the remainder of the flash.

Hence, avrdude should send all of the input file to the programmer unless it can be sure that the chip had been erased.

dl8dtl commented 2 years ago

it is not common practice for programmers to erase the remainder of the flash.

The respective command is called "chip erase". So yes, the programmer is supposed to erase everything.

stefanrueger commented 2 years ago

How does that work with a bootloader?

stefanrueger commented 2 years ago

I mean, would chip erase not destroy the bootloader, too?

dl8dtl commented 2 years ago

I mean, would chip erase not destroy the bootloader, too?

Sure, but then, the expectation is of course always that the bootloader will protect itself – against a chip erase, but also against any input file that attempts to overwrite it (and apart from that, it cannot e.g. modify fuses).

But upon being told to erase the chip, it must erase the application area (and usually also the EEPROM), as its closest possible approximation of erasing the chip.

Things would be different if we started to implement page erases (I think recent Atmel/Microchip tools allow for that): then, I agree, any page of 0xffff in the input file would have to communicate a page erase to the programmer.

stefanrueger commented 2 years ago

[edited to replace the mistakenly written -U with -D, which was meant]

And what if the user specifies -D?

The programmer still would have to erase the remainder of the flash every time avrdude sends a file because it cannot know whether avrdude withheld trailing 0xffs. And do they really do that in practice? In virtually all circumstances this would have little utility and only make the user wait longer for programming to finish.

Bootloaders have a legitimate interest in saving space. Even if a bootloader did implement chip erase, the user could still decide to issue -D.

It would be good if avrdude did not withhold trailing 0xffs from the programmer when the user specifies -D.

What do other people think?

Or has everyone gotten a packet of popcorn for watching the discussion ;)

MCUdude commented 2 years ago

What do other people think? Or has everyone gotten a packet of popcorn for watching the discussion ;)

I'm here for the popcorn! 🍿 Jokes aside, I'm somewhat neutral, and know way less than @dl8dtl does about how AVR programmers actually work.

I'll have to admit, this has never really been a problem I've encountered before.

Optiboot always does what it should when using the -D flag. The bootloader has to do some kind of erase (page or entire flash) because the flash contents always match the input hex file after upload. 0's are turning into 1's somehow.

When uploading using an ISP, it works perfectly even when not using the -e flag before -U. Not sure if this is a bug or a feature though.

mariusgreuel commented 2 years ago

This thread is kind of fun to read. From what I understand, the -D option has serious implications that people have failed to understand. This includes me personally and, sadly, also the Arduino people (they use -D with Optiboot, too).

As I understand Jörg, the -D option must only be used if it is guaranteed that the entire flash has been previously erased. This severely limits its use, for instance to a sequence of programming commands where the first command already included the chip erase (i.e., was run without -D).

I used to interpret the -D option as "if the programmer is capable of doing a page erase, it is OK to skip the chip erase". Now, I knew about the 0xFF optimization in avrdude, but I failed to understand the implications.

That said, it is always frustrating for me, trying to do everything right, only to find out that I walked into a trap that was not obvious to me. Meaning after some years of AVR programming, I would need to change my makefiles :-(.

Now, I am a great fan of software that "just works" also for casual users. From that perspective, I support the idea of defusing this bomb. Disabling the 0xFF optimizations when -D is used seems like a reasonable suggestion that would avoid unpleasant surprises.

The fact that a verification command only verifies 2160 bytes is an undebatable bug to me:

avrdude -Dqp atmega328p -c arduino -b 115200 -P com7 -U flash:v:sketch-ending-in-ff.hex:i

avrdude: AVR device initialized and ready to accept instructions
avrdude: Device signature = 0x1e950f (probably m328p)
avrdude: verifying flash memory against sketch-ending-in-ff.hex:
avrdude: 2160 bytes of flash verified

avrdude: safemode: Fuses OK (E:00, H:00, L:00)

avrdude done.  Thank you.

dl8dtl commented 2 years ago

It's exactly the point: if optiboot claims to act as an STK500 (by implementing its protocol), it has to work the way the STK500 used to work. (Obviously, modulo everything that would try to touch the bootloader area itself.) As the STK500 protocol has chip erase as its only erase option, that command needs to ensure the application part of the flash is erased in order to be compatible. As a consequence, that limits the applicability of the -D option to cases where it is certain the remainder of the flash is still erased (either from the factory, or by a prior chip erase).

I agree though that verification is supposed to verify the entire file range, so yes, applying a shortcut early is a bug.

It might really make sense to think about the EDBG protocol allowing for partial erase operations, at least for modern AVRs (Xmega and beyond), and thus to teach AVRDUDE to handle that. That would also allow for bootloaders to implement this protocol, to benefit from that more modern behaviour. But, so far, nobody did it (neither in AVRDUDE, nor, that I know of, in a respective bootloader implementation).

stefanrueger commented 2 years ago

I'll have to admit, this has never really been a problem I've encountered before.

It only surfaces when one puts data at the end of the application (PROGMEM puts it at the beginning) and if there is a trailing section of 0xff bytes that ends up in a page of its own. Very difficult to debug, because every tool keeps telling you all is hunky-dory.

Optiboot always does what it should when using the -D flag.

And it does what it should even when not specifying -D. Optiboot simply ignores the chip erase command that avrdude sends. Avrdude's arduino programmer uploads the sketch page by page to optiboot. Fun fact: avrdude sends even pages that only consist of 0xff unless they're part of the trailing 0xffs; no optimisation is actually taking place here. And optiboot, as every bootloader worth its salt, carries out a page erase, loads the page into the chip buffer and writes the page to flash, all with the avr SPM opcode.
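
For illustration, roughly the canonical avr-libc page-programming sequence a bootloader like optiboot runs for each received page (optiboot's real code is hand-tuned and differs in detail; interrupt handling is omitted, and a part with at most 64 kB flash and byte addresses are assumed):

#include <avr/boot.h>
#include <avr/io.h>
#include <stdint.h>

static void program_page(uint16_t page_addr, const uint8_t *buf) {
  boot_page_erase(page_addr);                    /* erase the target page */
  boot_spm_busy_wait();
  for (uint16_t i = 0; i < SPM_PAGESIZE; i += 2) /* fill the temp buffer  */
    boot_page_fill(page_addr + i, buf[i] | (buf[i + 1] << 8));
  boot_page_write(page_addr);                    /* write buffer to flash */
  boot_spm_busy_wait();
  boot_rww_enable();                 /* make the application section readable again */
}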

I know Optiboot vvv well, because I wrote my own much smaller bootloaders, called urboot, initially based on optiboot. Mine has typically 256 bytes (flash only) or 384 bytes (flash + EEPROM r/w) but can be as small as 224 bytes (for the ATtiny2313). In contrast, optiboot is 512 bytes.

The really small sizes need a different programmer than avrdude's arduino. That one issues chip erase, parameter get and set commands, software version queries, fuse read commands, etc. Even just ignoring all these costs some 50 bytes to keep the protocol up. Then there is avrdude's signature byte request, which again costs some 30 bytes of code to serve even if only the compile-time constants SIGNATURE_0...2 are sent. I found a truly magnificent side-channel of telling the programmer which chip the bootloader runs on for 0 bytes extra code. And that doesn't require that the signatures of avrdude and avr-gcc match up (something I found out the hard way they don't necessarily do). So, I wrote my own programmer for avrdude (called urclock) that still uses, by and large, the STK500v1 protocol, but significantly reduces the overhead of getting business done.

Small parts don't have bootloader support, but there is @WestfW 's wonderful idea of a vector bootloader (I think he calls it a virtual bootloader). The idea is that the bootloader patches the reset vector to jump to itself while using an otherwise unused ISR vector to jump to the application. Patching while uploading and verifying the sketch costs some 110 bytes in optiboot. However, if you shift the burden of patching to the programmer (which probably runs on a 16+ Gigabyte PC and doesn't even care about megabytes) then implementing this costs 0 (zero) extra bytes in the bootloader. I know, I've counted twice. And if you're unsure whether there is an ISR vector that is truly unused in all applications then you're always free to extend the .vectors section by one last entry for the bootloader (OK, that'll cost you 2-4 bytes).
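
For illustration, a minimal sketch of the programmer-side patching, with made-up names, assuming a part small enough that rjmp reaches all of flash (8 kB or less) and that the application entry address is already known; buf[] is the input file viewed as 16-bit words:

#include <stdint.h>

/* Sketch only: encode an AVR rjmp from word address 'from' to word address 'to' */
static uint16_t rjmp(uint16_t from, uint16_t to) {
  return 0xC000 | ((uint16_t)(to - from - 1) & 0x0FFF);
}

/* Patch the input file before upload: reset vector -> bootloader,
   a spare (assumed unused) vector slot -> application entry.      */
void patch_vectors(uint16_t *buf, uint16_t boot_word, uint16_t app_word,
                   unsigned spare_vec) {
  buf[spare_vec] = rjmp(spare_vec, app_word);
  buf[0]         = rjmp(0, boot_word);
}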

The only other change (apart from adding the urclock programmer code to the project) is that avrdude needs to provide a read-hook function in the programmer definition that is called when an input file is read, so the programmer can patch the vector table before upload. @WestfW's optiboot would equally benefit from the availability of such an avrdude programmer.

Actually, this read-hook might give the programmer, rather than fileio(), control over whether or not the 0xff haircut we are talking about here should be done.
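
For illustration, a deliberately hypothetical sketch of what such a read-hook could look like; neither the field nor the call exist in avrdude, and all names are made up:

/* Sketch only */
struct avrmem;                         /* stands in for avrdude's AVRMEM */

typedef struct programmer_sketch {
  /* ...existing programmer fields... */
  /* called by fileio() after an input file has been read; may patch the
     buffer (vector bootloader) and decide whether to trim trailing 0xff,
     returning the number of bytes to program                            */
  int (*readhook)(struct avrmem *mem, int nbytes);
} PROGRAMMER_SKETCH;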

So, my whole philosophy has been to shift as much effort as possible to the programmer rather than the bootloader. That requires a certain level of co-operation between the programmer and the bootloader.

In that sense this discussion is like a multi-billionaire (16 GByte PC) asking someone just managing (8k AVR) to work for them rather than the other way round. Hang on a second, ... that is probably not the greatest analogy I have ever come up with :)

Over the years I have come to realise that I benefitted a lot from the open-source community and people like @dl8dtl, @WestfW and yourself, so I made a New Year's resolution to give some of my work back to the community.

Publishing the urboot bootloader necessitates that I publish the urclock programmer, too. My preference would be to eventually make this part of the avrdude project rather than one of the many forks that are out there.

Anyway, I deviate...

When uploading using an ISP, it works perfectly even when not using the -e flag before -U.

Feature: Avrdude by default erases flash (either page by page or by issuing a chip erase) unless -D is specified. So the explicit -e isn't needed for flash programming.

stefanrueger commented 2 years ago

And btw, ... most bootloaders wouldn't implement the full monty of STK500. They typically only ever implement a skeleton of what's needed. Otherwise we'd all work with 8k bootloaders which doesn't leave much change in small devices...

dl8dtl commented 2 years ago

I disagree. The entire STK500 fits into an ATmega8535, and that also includes all the HV programming algorithms for all supported chips.

stefanrueger commented 2 years ago

OK - I suspect in this case the ATmega8535 acts as a physical device to program other chips (in collaboration with avrdude) and does nothing else. Bootloaders, in contrast, (often) only provide their own chip with application code and data, and then the chip does something completely different. It is not in the interest of bootloaders to implement everything.

mariusgreuel commented 2 years ago

@stefanrueger The Micronucleus bootloader (for ATtinys) needs to patch the reset vectors while programming. I managed that by hooking into the paged_write() function. Is this similar to what you need to do? https://github.com/avrdudes/avrdude/blob/7e26a153752c9ebf346bd4e5db2e7c9b78a8b3c6/src/micronucleus.c#L342

stefanrueger commented 2 years ago

@mariusgreuel Yes, correct - my code is a little different because, on a large part that requires a jmp to the bootloader, a compiled sketch may still use an rjmp to the application; so yes, by and large this type of thing. Hadn't thought of using the paged_write() hook. How do you verify?

mariusgreuel commented 2 years ago

Micronucleus does not implement reading. I guess one could do something similar in paged_load().

stefanrueger commented 2 years ago

... in which case the user wouldn't get what's actually on the chip. I decided to put a readhook in fileio() to keep what's on the chip transparent to the user.

stefanrueger commented 2 years ago

Micronucleus does not implement reading.

... and ... does it erase the remainder of the flash all the way up to the bootloader, just in case there are trailing 0xff bytes?

MCUdude commented 2 years ago

Now, I am a great fan of software that "just works" also for casual users. From that perspective, I support the idea of defusing this bomb. Disabling the 0xFF optimizations when -D is used seems like a reasonable suggestion that would avoid unpleasant surprises.

I agree with this.

Another way to make everyone happy: would it make sense to add a new flag, -d perhaps, that enables/disables 0xff optimizations? Thoughts?

mariusgreuel commented 2 years ago

would it make sense to add a new flag, -d perhaps, that enables/disable 0xff optimizations

I thought about something like this, too. Turning the optimization off, by default.

It may be possible for avrdude to somehow "know" which programmers are "-D compatible" (for instance via a whitelist), but this is going to be tricky, as you do not necessarily know the real programmer behind a given protocol.

Optiboot simply ignores the chip erase command that avrdude sends

I was pretty sure -e worked with Optiboot, so I just tried it and also checked its code - but yes, you are correct. Bummer!

To me, this means the 0xFF optimization just became obsolete - no matter whether -D is used or not. Obviously, nobody is going to re-flash the bootloader on millions of Arduinos.

stefanrueger commented 2 years ago

A good and workable solution is to piggy-back on -D for "do not optimise 0xff". Rationale below.

I suggest changing avr_mem_hiaddr() itself so that it decides whether or not to carry out the optimisation. As this gives avr_mem_hiaddr() agency over the optimisation decision, I also suggest moving the "is this boot/application/apptable/flash?" match into the function, which simplifies the code in 3 places. I would add to the documentation for -D a phrase to the effect of ...also disables the removal of trailing 0xff sequences from AVR flash reads and input file reads, which can be relevant for programming via bootloaders. How about calling avr_mem_hiaddr(NULL) during parsing of -D to inform it that any subsequent calls should not optimise 0xff? That way we don't need another flag or global variable.
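
For illustration, a minimal sketch of the suggested latch, with an illustrative buffer/length signature instead of avrdude's AVRMEM type:

/* Sketch only */
int mem_hiaddr_sketch(const unsigned char *buf, int len) {
  static int disable_trim;        /* latched while -D is being parsed */

  if (buf == NULL) {              /* the suggested avr_mem_hiaddr(NULL) call */
    disable_trim = 1;
    return 0;
  }
  if (!disable_trim)
    while (len > 0 && buf[len - 1] == 0xff)   /* current behaviour */
      len--;
  return len;                     /* with -D the full input file is kept */
}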

I am happy to do the PR. @dl8dtl?

Rationale

Bootloaders effectively are a distinct type of AVR programmer ("SPM programmer").

SPM programmers have a different abstraction model than SPI programming through an intermediary device. Although they could strictly emulate STK500 programmers, in practice they don't necessarily do that.

This requires the removal of the 0xff optimisation for SPM programmers. It is natural to piggy-back this onto -D, as -D is the option that is routinely specified when programming via a bootloader anyway.

Currently, stk500.c's paged_write() thankfully does not optimise away 0xff-pages in its own right. You can verify this by monitoring data exchange between the PC and the programmer, by looking at the source or by logging stk_send/recv() calls.

The only 0xff optimisation (that is relevant for bootloaders) happens in avr_read() and on input file read in fileio() through a call to avr_mem_hiaddr() that removes trailing 0xff sequences.

While the critical part is to remove the trailing-0xff optimisation in fileio(), there is utility in making -D change the AVR flash readout for the discerning bootloader user: they will want a method to get the full apptable and application code, including any trailing 0xffs, on a download so they can use the downloaded file for a later upload (for boot and the full flash this is less critical).

With this intervention, avrdude can now zap the contents of the flash below the bootloader by uploading an appropriate hex file consisting of only 0xff.

What's the effect of the suggested solution for people who do not program via a bootloader and use -D? The 0xff optimisation is turned off, and some uploads take a fraction longer. If that's critical in a production line, the user could remove the trailing 0xff in the input file. Non-bootloader programming would require that -D can be used reliably exactly once, perhaps after programming the bootloader. So, hopefully the downsides are limited.

mariusgreuel commented 2 years ago

A good and workable solution is to piggy-back on -D for "do not optimise 0xff"

Not sure if I missed something, but that does not sound like a robust solution. Per my understanding, Optiboot currently does not work correctly regardless of whether -D is used or not. Requiring -D in order to disable the 0xFF optimization is just the next trap for people to walk into.

stefanrueger commented 2 years ago

Whether or not optiboot has the right to ignore chip erase commands is a matter of debate. In practice it works, and has done so millions of times. It is edge cases (with real-world use) like the one I demonstrated that don't work.

I don't know how the arduino environment flashes its sketches, but I strongly suspect it issues a -D (as is good practice for flashing chips already containing a bootloader). I know that Sudar's arduino Makefile does this.

And if not, well, at least the user has a remedy, and avrdude has the reasonable excuse of saying well, the chip erase was sent, wasn't it? And the -D documentation tells you to use that option with bootloaders.

If 0xff optimisation were disabled by default, I fear there might be some downsides for non-bootloader users, who would all of a sudden always have to wait for a full download of the application from the chip, and I predict they would complain. That's why I suggested piggy-backing on -D.

Other than that (as someone who rarely uses an intermediate device for programming) I would also be happy to go along with another option, e.g., -d defaulting to either on or off, or with removal of the 0xff optimisation everywhere, or with its removal in fileio() only, etc., as long as it is addressed somehow.

stefanrueger commented 2 years ago

Obviously, nobody is going to re-flash the bootloader on millions of Arduinos.

... and it's likely that something else would have to be dropped from optiboot if the code to erase everything up to the boot section were to be included. How many bytes would that code take? 40-50 bytes, perhaps. Some 10% of the 512 bytes...

stefanrueger commented 2 years ago

So we have the following choices to enable avrdude to send the full input file:

  1. Piggy-back on -D
  2. New option -o (enable 0xff optimisation) and default to optimisation off
  3. New option -d (disable 0xff optimisation) and default to optimisation on
  4. Model SPM programmers to allow for the fact that some of them are unable to carry out chip erase, but can implement a page write with implicit page erase

My ever so slight preference is for actually doing two things: 1 (immediately) and 4 (later on, after careful analysis of how best to implement this abstraction layer).

Also happy with 2 and 4 or 3 and 4 (C precedence rules :)

Doing nothing is not an option for me.

mcuee commented 2 years ago

Not so sure what the best label for this one is; I'll label it as "enhancement" first.

stefanrueger commented 2 years ago

Closed with PR #936