stefanrueger / urboot

Small AVR bootloader using urprotocol
GNU General Public License v3.0
55 stars 8 forks source link

Question: Extra section between bootloader and .version section. #32

Closed mokrafoka closed 1 month ago

mokrafoka commented 1 month ago

Hi,

I'm wondering if there is an easy way to shift the boot loader .text section a bit to make room for about 5 bytes between end of boot-loader's .text and beginning of .version section?

I would like to store some application version related info in this area to be able to find and read it from a hex-dump. Storing this information before boot-loader .text wouldn't work as I don't know where it starts as I build my app code. On the other hand, writing my version info just at flashend - 10 probably won't work as there is a tight fit between .text end and .version, isn't it?

Thus I think shifting the .text down by 5 bytes would be the best option. But ideally without forking the urboot project and fiddling around with urboot-gcc implementation.

I know I can provide the addresses for the .text section manually calling avr-gcc directly, like mentioned in the Makefile:

avr-gcc -DSTART=0x7e00 -DVERSTART=0x7ffc -DRJMPWP=0xcfda -Wl,--section-start=.text=0x7e00 \
-Wl,--section-start=.version=0x7efc ...

But then I don't really understand how -DRJMPWP=0xcfda is calculated.

Could you please give me some hints? Is there something else I should watch for, to not overwrite any boot-loader related pieces?

Thanks in advance, Milosz

stefanrueger commented 1 month ago

@mokrafoka It won't work to move the bootloader a small number of bytes lower, as .text needs to start at the beginning of a flash page (the bootloader occupies an integer number of flash pages). Also note that avrdude -c urclock puts metadata just below the bootloader (that's something that AVRDUDE does and not something that urboot bootloaders do). You could select a bootloader that has 5 bytes free and extend .version with those 5 custom bytes. Check out the discussion that Issue #29 morphed into, and this comment.

mokrafoka commented 1 month ago

Hi @stefanrueger,

Thanks for the fast reply! I the problem you linked is indeed very similar to mine. However what do you mean with 'select a boot loader that has 5 bytes free'? I always build the binary myself. In that case, does it mean that I have to take care that the amount of frills and extra stuff shall not extend the boot-loader size so that flashend - .text + boot-loader-size + 5 (for .version) < 5 (for my section)?

My boot-loader is 238 Bytes large and the .text section starts at 0x7f00. Thus having an atmega328p with flashend at 0x7fff would make enough space for my 5-Byte section at flashend - 10 without disturbing urboot related pieces, right?

Why should I extend the .version section? AFAIK this would shift all the version related bytes which in turn would render avrdude -c urclock unusable. Or I would have to change the boot loader code and add extra padding bytes to the urboot_version array. I would rather refrain from touching the urboot code. In my particular case I would just add my own section 5 bytes before .version.

stefanrueger commented 1 month ago

Have you tried running hexls on the bootloaders you build? The difference between the first two numbers is the free space.

$ hexls atmega328p_amin.hex
252 256 u8.0 w-u-jPra- atmega328p_amin.hex

So this one has 4 free bytes between .text and .version. You can also check by displaying the 5 bytes before the top 6 .version bytes once you burned your bootloader:

avrdude -qq -xid=F.-11.5 -xshowid -curclock
ffffffff03

So this one is unsuitable, indeed (the least significant byte of the id comes first in memory).

You can create an id.hex file with your 5 byte number at the correct address (0x7ff5 in your case) with a separate tool, say srec_cat, and burn the original, unmodified bootloader with enough space (there must be 0xffs where you want your id) together with your id file just so:

avrdude -c ... -p m328p -U original_urboot.hex -U id.hex
mokrafoka commented 1 month ago

Have you tried running hexls on the bootloaders you build? The difference between the first two numbers is the free space.

No I didn't. But now I see this is a very handy tool ;-) Id didn't even notice that the same output is displayed every time I build a boot-loader.

Thank you a lot. Now I have all the info I need. -Milosz

mokrafoka commented 1 month ago

Sorry,

have to open again. It seems avrdude -c urclock keeps the boot-loader safe and doesn't allow to write in this location.

avr-objcopy -O ihex -j .appversion sleepyClient32kHz.elf appversion.hex
avrdude -p atmega328p -c urclock -b9600  -P /dev/ttyUSB0 -V -F   -U flash:w:appversion.hex:a
...
avrdude error: input [0x7ff5, 0x7ff9] overlaps bootloader [0x7f00, 0x7fff]
avrdude done.  Thank you.

Where .appversion is my app-version section at 0x7ff5. A pretty same error occures if I try to write the .appversion together with the rest of my application code. I know I can write it together with the boot-loader via ISP, but then I would have to burn the boot-loader every time I flash my app. For obvious reasons I don't want to to this.

It seems to me there is no (x)option in avrdude to disable this check, right? Is there another possibility to workaround this problem except of setting PROTECTME=0 as make option for the boot-loader (and probably setting PGMWRITEPAGE=0 just to be sure I don't go crazy)?

mokrafoka commented 1 month ago

Hehe, I just realized that PROTECTME is disabled in urboot v7.7. Anyway, I replaced the 1 here:

#undef  PROTECTME               // From v7.7 enforce bootloader self-protection
#define PROTECTME             0

and

-#define UR_TYPE (UR_AFLAG | UR_EFLAG | UR_DFLAG | UR_VFLAGS | UR_UFLAG | UR_QFLAG | UR_CFLAG)
+#define UR_TYPE (UR_AFLAG | UR_EFLAG | UR_DFLAG | UR_VFLAGS | UR_UFLAG | UR_PFLAG | UR_CFLAG)

And rebuild the urboot. Now the boot-loader gets bigger (huh?), and my section wouldn't fit anymore. However adding some frills extended the size even more in a way that .text has been shifted down and I got some free space. But whatever I do, I always end up with something like this: w-u-jPra- or w-u-jpra- Double-checked it trying to flash my extra section and urclock really refuses to overwrite the boot loader. Update: it seems that the 'p|P' is hadcoded in hexls if the urboot version is 7.7.

Is there anything else I could do to disable the overwrite protection?

stefanrueger commented 1 month ago

every time I flash my app

I had not realised you want an app-specific 5 bytes that need to be upload/changed with every new sketch you burn. I understand you want those numbers at a known, fixed address. A good place for these is between the vector table and the code of the application. Include the following in your C/C++ source code:

uint8_t __attribute__((used)) __attribute__((section(".vectors")))
  my_id[5] = {0x10, 0xe1, 0x1c, 0xca, 0xf0};

These numbers will be in flash after the vector table. The m328p has 26 interrupt vectors, 4 bytes each, so you will see these 5 bytes at flash address 104. This will actually cost you 6 bytes as the compiler/linker need code to start at even addresses and they pad the 5 bytes. You will see your number externally through

$ avrdude -qq -c urclock -xshowid -xid=F.104.5
f0ca1ce110

Again the printed id is little endian (ie, high byte is the last of the sequence). Different MCUs have different vector table sizes, but the address is the same for the same MCU.

This technique does not need to faff around with the bootloader. Bootloaders should not overwrite themselves - not good practice and this is bound to leave the project in tears if you try (bricked parts and some such: bricked in the sense that you need to reprogram them with an ISP programmer). The unused space in a bootloader is only ever good for a one-time board-specific id burned at the same time as the bootloader. Bootloader space is defo not suited for data that change with the uploaded apps.

stefanrueger commented 1 month ago

'p|P' is hadcoded in hexls if the urboot version is 7.7

How could hexls know that a different program than urboot.c v7.7 generated the bootloader?

mokrafoka commented 1 month ago

Well, correct me if I'm wrong but:

$fea .= f(!$v77? ($vers & UR_PROTECTME? 'p': '-'): ($vers & UR_PROTECTME? 'P': 'p'));

seems to me that if hexls detects urboot the output is either 'p' or '-' for all versions <7.7 and and 'P' or 'p' for versions >=7.7. (perl is not my strong point, though). Of course, if the input is not urboot hex file then hexls doesn't show anything except of size.

I had a look into the avrdude code and it looks like urclock implementation refuses to flash if the app code overlaps with boot loader. Since in my case the boot loader is in between [0x7f00, 0x7fff], everything I try to write there is blocked, like my .appversion at 0x7ff5. So the PROTECTME bit isn't evaluated anyway.

I just tried something like this:

avr-objcopy -O ihex -j .appversion sleepyClient32kHz.elf appversion.hex
avrdude -p atmega328p -c urclock -b9600  -P /dev/ttyUSB0 -V   -U flash:w:appversion.hex:a -D -xnometadata

avrdude: reading input file appversion.hex for flash
         with 5 bytes in 1 section within [0x7ff5, 0x7ff9]
         using 1 page and 123 pad bytes
avrdude: preparing flash input for device bootloader
avrdude: writing 5 bytes flash ...
Writing | ################################################## | 100% 0.45 s 
avrdude: 5 bytes of flash written

After I removed the checks in urclock code. It looks like the 5 bytes are written, but in the end the data ins't written at all:

./avrdude -p atmega328p -c urclock -b9600  -P /dev/ttyUSB0 -xid=F.-11.5 -xshowid -qq
ffffffffff

And here I cannot wrap my head around. I'll try to figure it out tomorrow...

Btw. @stefanrueger have you ever tried to write some extra bytes between the end of boot loader code and start of the .version section? If yes, how did you do it?

-Milosz

mokrafoka commented 1 month ago

Ops, I've overseen your post about the 'after vector table' placement. Sorry it's late here, I'll try it out tomorrow. Thanks for the tip!

mokrafoka commented 1 month ago

A good place for these is between the vector table and the code of the application.

That's it. For some reason I was biased by the FLASHEND that I didn't even consider the beginning of FLASH. The implementation works as expected and I think now I really have what I need.

Thanks a lot for the support :-)

-Milosz

stefanrueger commented 1 month ago

FLASHEND

Yes, it's tempting to want to use the unused bytes in the bootloader, BUT that's only possible once at burn time of the bootloader. Later, in operation, the bootloaders have to erase a full page before writing it, and guess what happens when that page is the bootloader code that's currently running? Yes, the bootloader now has a gaping hole, a large part of its code is gone, and it is effectively bricked. This is why avrdude -c urclock is at pains to not send data to a bootloader that would overwrite it. Urboot protects itself, but optiboot would happily overwrite itself. Try with avrdude -arduino if you want to have fun. Now, urboot exports a function pgm_write_page() that user applications can utilise to overwrite flash. This is the main reason why urboot wants to protect itself from being overwritten by user application calls to pgm_write_page(). Bricking a bootloader is no fun if the part needs to be soldered out of the PCB in order to access the SPI signals.

One could still write just below the bootloader if the application does not fill all free space. avrdude -c urclock does this to place metadata there (in absence of -c nometadata) to store file name, date and where the application ends and how much free space there is between the end of the application and the metadata area. I call this unused space store. As the top 6 bytes of the bootloader encode how big the bootloader itself is, each application can figure out where the metadata sit and, as a consequence, the start of store and its end. Here is a demo application that shows how this infrastructure can be used to utilise unused space between the application and the metadata.

BTW, if your 5 bytes application version are ASCII and if they only need to be read by humans, then you could consider storing the version string instead of the filename in the metadata using -x title=v3.14 for example. You would see that by

$ avrdude -qq -x showfilename
v3.14

But yes, using storage at the end of the vector table is even simpler, but you have to watch out for changes in the table size.

mokrafoka commented 1 month ago

Yes, it's tempting to want to use the unused bytes in the bootloader, BUT that's only possible once at burn time of the bootloader. Later, in operation, the bootloaders have to erase a full page before writing it, and guess what happens when that page is the bootloader code that's currently running? Yes, the bootloader now has a gaping hole, a large part of its code is gone, and it is effectively bricked. This is why avrdude -c urclock is at pains to not send data to a bootloader that would overwrite it. Urboot protects itself, but optiboot would happily overwrite itself. Try with avrdude -arduino if you want to have fun. Now, urboot exports a function pgm_write_page() that user applications can utilise to overwrite flash. This is the main reason why urboot wants to protect itself from being overwritten by user application calls to pgm_write_page(). Bricking a bootloader is no fun if the part needs to be soldered out of the PCB in order to access the SPI signals.

Correct. I have to admit that I didn't read the boot loader section in the data sheet before I started messing around with the application version. In the end SPM_PAGESIZE is the main blocker I just wasn't aware of.

One could still write just below the bootloader if the application does not fill all free space. avrdude -c urclock does this to place metadata there (in absence of -c nometadata) to store file name, date and where the application ends and how much free space there is between the end of the application and the metadata area. I call this unused space store. As the top 6 bytes of the bootloader encode how big the bootloader itself is, each application can figure out where the metadata sit and, as a consequence, the start of store and its end. Here is a demo application that shows how this infrastructure can be used to utilise unused space between the application and the metadata.

Sure this would be possible as well and in fact I was thinking about it as well as I realized the gap between boot loader end and boot loader version isn't accessible for me. However after your tip regarding end of vector table placement I dropped the Idea very quickly ;-)

BTW, if your 5 bytes application version are ASCII and if they only need to be read by humans, then you could consider storing the version string instead of the filename in the metadata using -x title=v3.14 for example.

Yeah, about something similar I was thinking about as well. However the complexity grows (in my case I wasn't aware of urbootNmetabytes and therefore I would calculate it by myself) and even if this would be done only once in the application runtime the needed ROM would be wasted. So in this case I was much more interested in the idea above.

But then you mentioned the vector table and, puff... all the problems magically vanished ;-) Of course I hope I will never need to adapt the vector table sizes, for whatever weird reasons I currently cannot foresee...

Thanks again!

stefanrueger commented 1 month ago

The vector table size varies from MCU to MCU but is a constant once the MCU is fixed. In a C/C++ application you can use _VECTORS_SIZE instead of 104. If you then switch the MCU (say, from an ATmega328P to an ATmega328PB that has a bigger vector table) the location is correct.

From a Makefile, the command line etc, you need to know which MCU has been soldered onto the board and its vector table size.

Urboot can use this trick to extend the vector table by one entry to avoid that the vector used for vector bootloaders be no longer available for any application using the bootloader. In that case urboot would have to been compiled with VBL_VECT_NUM=-1, see the documentation for this option. But if you've done this you know about it ;) Urboot by default chooses a vector that applications are vvv unlikely to ever need, eg, SPM_READY (only code in bootloader space can use the SPM opcode), so doesn't by default extend the vector table.

In theory a compiler could figure out what the highest used interrupt vector number is and curtail the vector table from that number in order to utilise the unused vector table space for the application. I have never seen a compiler that does that, though.

So, summarising, yes, vector table extension is a nifty trick.

mokrafoka commented 1 month ago

The vector table size varies from MCU to MCU but is a constant once the MCU is fixed. In a C/C++ application you can use _VECTORS_SIZE instead of 104. If you then switch the MCU (say, from an ATmega328P to an ATmega328PB that has a bigger vector table) the location is correct.

From a Makefile, the command line etc, you need to know which MCU has been soldered onto the board and its vector table size.

Using .vectors as a destination section I don't even have to bother with makefile as the section is already defined by avr-gcc tool-chain. I have to know it when reading back the data (from a hex file or directly using avrdude). And in fact my current setup looks something like this:

mcuName="$1"
avrGccPath="$2"

# use system intallation if no explicit path has been provied.
[[ -z "$avrGccPath" ]] && avrGccPath=avr-gcc

cat <<EOF | ${avrGccPath} -mmcu=${mcuName} -o /tmp/vectorSize.o -xc -c - 2>&1 | sed "s@.*[']\?#pragma message: (\(.*\))[']\?@\1@g" | bc | xargs printf "0x%x\n"
#include <avr/io.h>
#define XSTR(x) SSTR(x)
#define SSTR(x) #x
#pragma message XSTR(_VECTORS_SIZE)
EOF

The output is then used to drive either avrdude -c urclock ... -t "dump flash ${vtoffset} ${size}" or screc_cat -crop .... Sure the knowledge about the mcu type used must be present, but IMHO this is obvious. And since I never used arduino and don't plan to, me and every user of my applications will have to be aware of that ;-) And of course the device has to be supported by the avr-libc, which doesn't seem the case for atmega328pb (at least not without manual tweaking).

... In that case urboot would have to been compiled with VBL_VECT_NUM=-1, see the documentation for this option. But if you've done this you know about it ;)

In the past I had to deal with a crude clone of the 328p which did have an additional timer. However to make use of this extra timer I had to hack the vector-table as well. In the end it worked, but something like that usually introduces lots of other problems usually I like to avoid. That said, I hope to never be forced to rely on SPM_READY and thus lave the vector table as expected by urboot by default. OTOH, if I ever will have to rely on SPM_READY then a fixed boot block size is probably going to be a better option.

stefanrueger commented 1 month ago

Thanks for sharing the #pragma trick of the trade. I've so far used gcc -E to get expanded macros as known to gcc but then had to use a type of eval() function to collapse the expressions to a number. #pragma message seems to do that for you. Neat.

mokrafoka commented 1 month ago

Thanks for sharing the #pragma trick of the trade. I've so far used gcc -E to get expanded macros as known to gcc but then had to use a type of eval() function to collapse the expressions to a number. #pragma message seems to do that for you. Neat.

You're welcome ;-)

However I just noticed it's wrong. It should be:

-cat <<EOF | ${avrGccPath} -mmcu=${mcuName} -o /tmp/vectorSize.o -xc -c - 2>&1 | sed "s@.*[']\?#pragma message: (\(.*\))[']\?@\1@g" | bc -l | xargs printf "0x%x\n"
+cat <<EOF | ${avrGccPath} -mmcu=${mcuName} -o /tmp/vectorSize.o -xc -c - 2>&1 | sed "s@.*[']\?#pragma message: \(.*\)[']@\1@g" | { echo $(( $(cat) )); } | xargs printf "0x%x\n"

Because, depending on the selected mcu, some of the macro entries are enclosed in round bracket and sometimes thy aren't. So the gcc output must be evaluated before it's printed.

Another possibility would be this:

cat <<EOF | ${avrGccPath} -mmcu=${mcuName} -o /tmp/vectorSize.o -xc++ -std=c++11 -c - 2>&1 | tail -1 |  sed -e "s@.*'(\(.*\) == 0)'@\1@g" | xargs printf "0x%x\n"
#include <avr/io.h>
static_assert(_VECTORS_SIZE == 0,"");
EOF

But note that it works only for newer compilers. For me it didn't work until gcc-12 as the older versions don't print the evaluated const_expr parameter in static asserts.

Update: even shorter and faster, but doesn't work with gcc-5.4 and older:

cat <<EOF | ${avrGccPath} -mmcu=${mcuName} -o /tmp/vectorSize.o -xc++ -c - 2>&1 | tail -1 | sed "s@.*error: size '-\(.*\)' of .*@\1@g" | xargs printf "0x%x\n"
#include <avr/io.h>
int dummy[_VECTORS_SIZE * -1];
EOF

Both last proposals have the huge advantage over the first one that they do provide a valid output even if _VECTORS_SIZE is only a reference to another macro.