Which parts have a 22-bit PC?

stefanrueger commented 1 year ago

It is clear that parts with flash above 128 KiB need more than 16 bits for their PC. What isn't clear is whether lower-spec'd parts of the same architecture also have a 22-bit PC: for example, an m1280 might well be an m2560 reject with flash errors in the upper half, where an undocumented fuse bit limits flash to 128 KiB.

Why is this relevant? Urboot's SWIO routine can produce bit delays matching a given number of CPU cycles (within reason) but it needs to know whether the part has a 22-bit PC or a 16-bit PC as the respective timings of the rcall and ret opcodes differ.
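
To make the timing dependence concrete, here is a minimal sketch of the cycle budget for one rcall/ret round trip. The cycle counts are the ones the AVR instruction set manual lists for classic (AVRe) cores; other core families use different timings and are not modelled. The helper names are hypothetical and not taken from urboot.

```c
#include <stdint.h>

/*
 * Cycle counts for classic AVRe cores (assumption: other core families
 * such as XMEGA or AVRxt use different timings and are not covered).
 */
static inline uint8_t rcall_cycles(uint8_t pc_bytes) {
  return pc_bytes == 3? 4: 3;   /* rcall: 3 cycles (16-bit PC), 4 (22-bit PC) */
}

static inline uint8_t ret_cycles(uint8_t pc_bytes) {
  return pc_bytes == 3? 5: 4;   /* ret: 4 cycles (16-bit PC), 5 (22-bit PC) */
}

/*
 * Hypothetical helper: cycles that remain to be filled by other
 * instructions when a bit delay of bit_cycles wraps one rcall/ret pair.
 * A negative result means the delay is too short for the round trip.
 */
static inline int16_t cycles_to_pad(uint16_t bit_cycles, uint8_t pc_bytes) {
  return (int16_t) bit_cycles - rcall_cycles(pc_bytes) - ret_cycles(pc_bytes);
}
```

Under these assumptions a round trip costs 7 cycles with a 16-bit PC and 9 cycles with a 22-bit PC, so guessing the PC width wrong shifts every bit delay by two cycles.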

Microchip's AVR instruction set manual does not say which parts have 22-bit PCs, but it asserts that only parts with a 22-bit PC implement the eicall opcode, which uses the EIND register to extend the PC. Now, ATDF and avr-libc differ in their opinion on whether a part has the EIND register, see the table below. That's the closest allusion I found in the documentation as to which parts might have a 22-bit PC.

If you have one of the parts below, @mcuee @MCUdude @SpenceKonde @dl8dtl, please could you run the following function and edit the table below to record whether it returns 3 (ie, 22-bit PC) or 2 (ie, 16-bit PC).

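#include <stdint.h>
#include <avr/io.h>

// Returns the number of bytes rcall pushes: 2 for a 16-bit PC, 3 for a 22-bit PC
uint8_t spwidth(void) {
  uint8_t ret;

  asm volatile(
    "  rcall 1f\n"              // Push the return address and jump to label 1
    "  in %[ret], %[spl]\n"     // SPL after the ret, ie, the original SPL
    "  sub %[ret], r0\n"        // Original SPL minus SPL seen inside the call
    "  rjmp 2f\n"
    "1: in r0, %[spl]\n"        // SPL while the return address is on the stack
    "   ret\n"
    "2:\n"
    : [ret] "=r"(ret): [spl] "I"(_SFR_IO_ADDR(SPL)) : "r0"
  );
  return ret;
}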

Columns 2-4 show whether the ATDF, avr-libc and the datasheet (DS) say the part has an EIND register. [Edited after discussion]

Part	ATDF	avr-libc	DS	Flash size (bytes)	22-bit PC	Comment
ATmega640	✔	✔	✔	0x10000	❌	Error in ATDF, avr-libc and DS
ATmega1280	✔	✔	✔	0x20000	❌	Error in ATDF, avr-libc and DS
ATmega1281	❌	✔	✔	0x20000	❌	Error in avr-libc and DS
ATmega2560	✔	✔	✔	0x40000	✔	Consistent, sane and good
ATmega2561	✔	✔	✔	0x40000	✔	Consistent, sane and good
ATmega256RFR2	✔	✔	✔	0x40000	✔	Consistent, sane and good
ATmega2564RFR2	✔	✔	✔	0x40000	✔	Consistent, sane and good
ATmega8U2	✔	✔	❌	0x02000	❌	Error in ATDF and avr-libc
ATmega16U2	✔	✔	❌	0x04000	❌	Error in ATDF and avr-libc
ATmega16U4	✔	✔	❌	0x04000	❌	Error in ATDF and avr-libc
ATmega32U2	✔	✔	❌	0x08000	❌	Error in ATDF and avr-libc
ATmega32U4	✔	✔	❌	0x08000	❌	Error in ATDF and avr-libc
AT90USB82	✔	❌	❌	0x02000	❌	Error in ATDF
AT90USB162	✔	❌	❌	0x04000	❌	Error in ATDF
AT90USB646	✔	❌	❌	0x10000	❌	Error in ATDF
AT90USB647	✔	❌	❌	0x10000	❌	Error in ATDF
AT90USB1286	✔	❌	❌	0x20000	❌	Error in ATDF
AT90USB1287	✔	❌	❌	0x20000	❌	Error in ATDF

mcuee commented 1 year ago

I think the datasheets should be able to answer this.

1) Section 8.1 of the following datasheet says "The ATmega640/1280/1281/2560/2561 Program Counter (PC) is 15/16/17 bits wide, thus addressing the 32K/64K/128K program memory locations." And as per Section 7.6.2 of the datasheet, they have an EIND register.

I think that means ATmega640/1280/1281 have 16bit PC and have EIND.

https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/DataSheets/ATmega640-1280-1281-2560-2561-Datasheet-DS40002211A.pdf

2) ATmega16U4/32U4 have 16bit PC and no EIND. https://ww1.microchip.com/downloads/en/devicedoc/atmel-7766-8-bit-avr-atmega16u4-32u4_datasheet.pdf

3) ATmega8U2/16U2/32U2 have 16bit PC and no EIND. https://ww1.microchip.com/downloads/en/DeviceDoc/doc7799.pdf

4) AT90USB82/162 have 16bit PC and no EIND. http://ww1.microchip.com/downloads/en/devicedoc/doc7707.pdf

5) AT90USB646/647/1286/1287 have a 16-bit PC, but EIND is unclear. I cannot find an official datasheet on the Microchip website, and the following datasheet is not clear on whether they have an EIND register: Section 33 (register summary) does not mention EIND but Section 34 does. https://www.mouser.sg/datasheet/2/268/doc7593-1369068.pdf

stefanrueger commented 1 year ago

I think that means ATmega640/1280/1281 have 16bit PC and have EIND

Thanks, @mcuee. I don't interpret the data sheet in the same sure-footed way. There is no need to implement EIND and eicall/eijmp if the silicon implementation of the logical 15/16 bit PC is actually a two-byte PC. It could be that the silicon implementation of rcall is still the same between m1280 and m2560 (push 3 bytes on stack and call the subroutine), just that the third byte is always 0.

I would feel much more comfortable if there was an experimental verification. Do you have one of the parts (eg, m1280, m1281, m640) and can run the function above?

dl8dtl commented 1 year ago

The PC is internally indeed "capped" correctly. Regardless of whether the variants are created from the same die, or are actually different silicon, there are internal OTP fuses that trigger certain decisions in the digital code. The actual size of the PC is one of these decisions. You can easily see this even in the compiler: for PCs that are larger than 16 bits, the stack uses one more byte for each CALL (to store the return PC), and the compiler has to cope with that when performing stack parameter offset calculations. You can see there that it properly relies on only devices with more than 128 KiB of flash using three bytes on the stack. ;-) Compile this:

#include <stdarg.h>

extern void doit(int, va_list);

void dosomething(const char *cmd, ...) {
        va_list ap;
        va_start(ap, cmd);

        switch (cmd[0]) {
        case 'A':
                doit(0, ap);
                break;

        default:
                doit(42, ap);
                break;
        }
        va_end(ap);
}

once for an ATmega1281, and once for an ATmega2561, and you'll see the difference.
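
The rule described above (three return-address bytes only for parts with more than 128 KiB of flash) can be written down as a small function. This is only a sketch of the rule as stated in this thread; the function name is hypothetical and it does not query a real device.

```c
#include <stdint.h>

/*
 * Return-address size in bytes implied by the flash size. The PC counts
 * 16-bit words, so a 16-bit PC reaches 64 Ki words, ie, 128 KiB; only
 * larger flash needs a third byte.
 */
static uint8_t pc_bytes_for_flash(uint32_t flash_bytes) {
  return flash_bytes > 0x20000UL? 3: 2;
}
```

Applied to the flash sizes in the table of the first post, this gives 3 for the 0x40000 parts and 2 for everything else.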

stefanrueger commented 1 year ago

stack uses one more byte for each CALL (to store the return PC)

Yes, that's what my little function spwidth() utilises to return 2 for a 16-bit PC and 3 for a 22-bit PC. Lacking an ATmega1281 I cannot run that function, though!

@dl8dtl Neat idea to ask the compiler as a proxy whether it thinks the PC is two or three bytes, but your example compiles the same for both parts, with an error message undefined reference to 'doit'. It does not (as I hoped) give me a compile-time answer to my question. It needs to be compile-time as I don't have the part.

dl8dtl commented 1 year ago

Just compile it to assembly source code only (-Os -S -mmcu=xxx). Here's the respective diff output between both generated assembly files:

% diff -u m1281.s m2561.s 
--- m1281.s     2023-04-11 16:08:15.037631000 +0200
+++ m2561.s     2023-04-11 16:08:08.125990000 +0200
@@ -18,7 +18,7 @@
 /* stack size = 2 */
 .L__stack_usage = 2
        movw r30,r28
-       adiw r30,5
+       adiw r30,6
        ld r26,Z+
        ld r27,Z+
        ld r24,X
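
One way to read the 5 versus 6 in this diff, as a sketch: the function saves two registers (stack size = 2), the return address sits above them, and the first stack argument follows. The helper below is hypothetical and only reproduces that arithmetic; it assumes Y holds a copy of SP after the prologue and that SP points at the next free byte.

```c
#include <stdint.h>

/*
 * Offset from Y to the first argument passed on the stack:
 * +1 because SP (and thus Y) points at the next free byte,
 * then the saved registers, then the return address.
 */
static uint8_t first_stack_arg_offset(uint8_t saved_reg_bytes, uint8_t pc_bytes) {
  return 1 + saved_reg_bytes + pc_bytes;
}
```

With two saved register bytes this yields 5 for a 2-byte return address and 6 for a 3-byte one, matching the two adiw lines.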

stefanrueger commented 1 year ago

@dl8dtl OK, very good. I see, the compiler uses the actual flash size to determine whether a part implements a 2-byte or 3-byte PC in silicon. If the compiler were to get this wrong, then function calls would get stack frames wrong for passing parameters on the stack, and someone would have noticed in the last 20 years or so.
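
For a compile-time answer without reading generated assembly, avr-gcc also exposes its assumption through the built-in macros `__AVR_2_BYTE_PC__` and `__AVR_3_BYTE_PC__`. The sketch below wraps them in a hypothetical function; on a non-AVR compiler neither macro is defined and it reports 0 for unknown.

```c
#include <stdint.h>

/*
 * Return-address width the compiler assumes for the selected -mmcu target.
 * Returns 0 when built with a compiler that defines neither macro.
 */
static uint8_t compiler_pc_bytes(void) {
#if defined(__AVR_3_BYTE_PC__)
  return 3;
#elif defined(__AVR_2_BYTE_PC__)
  return 2;
#else
  return 0;   /* Not an AVR build: width unknown */
#endif
}
```

This reports what the compiler believes, not what the silicon does, so it is a proxy in the same sense as the assembly diff above.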

What about the EIND business? Is it cool for avr-libc or the ATDF files to pretend there is an EIND register when in fact there isn't any? User code might do unnecessary/wrong things thinking they might need to manage EIND or thinking EIND is synonymous with FLASHEND > 0x20000

mcuee commented 1 year ago

I would feel much more comfortable if there was an experimental verification. Do you have one of the parts (eg, m1280, m1281, m640) and can run the function above?

@stefanrueger

I do not have these parts. I am not sure if @MCUdude has them or not.

But I guess this is not necessary any more, right?

MCUdude commented 1 year ago

I have an ATmega1281 I can test with

mcuee commented 1 year ago

Although I do not have the parts, I used the Simulator inside Microchip Studio, and here are the results.

ATmega1281 --> r = 2

ATmega2561 --> r = 3

mcuee commented 1 year ago

What about the EIND business? Is it cool for avr-libc or the ATDF files to pretend there is an EIND register when in fact there isn't any? User code might do unnecessary/wrong things thinking they might need to manage EIND or thinking EIND is synonymous with FLASHEND > 0x20000

@stefanrueger

Do you have a code to test the existence of EIND?

stefanrueger commented 1 year ago

@mcuee What a cool way to find out! And thanks for the confirmation. I am now convinced by @dl8dtl's compilation trick and by your simulator outcome that smaller parts don't use unnecessarily wide PCs.

code to test the existence of EIND

By convention it's a preprocessor "#define", so it would be #ifdef EIND. That's how I generated the table entries in the first post. It's the memory address of the EIND register. The eijmp/eicall opcodes use it, but for small parts the effect wouldn't be visible even if eijmp/eicall was implemented b/c AVR flash that has a power-of-2 size wraps round.
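
The wrap-around argument can be illustrated with a simplified model of the eijmp/eicall target address. This is a sketch under the assumptions stated in the comment above (EIND supplies the address bits above Z, and flash of power-of-2 size wraps round); the function name is hypothetical and nothing here has been checked on hardware.

```c
#include <stdint.h>

/*
 * Word address an eijmp/eicall would reach in this model.
 * Assumption: flash_words is a power of two, so masking with
 * flash_words - 1 models the wrap-around.
 */
static uint32_t eijmp_target(uint8_t eind, uint16_t z, uint32_t flash_words) {
  uint32_t word_addr = ((uint32_t) eind << 16) | z;
  return word_addr & (flash_words - 1);
}
```

In this model a 128 KiB part (0x10000 words) reaches the same address whatever EIND holds, whereas a 256 KiB part (0x20000 words) uses bit 0 of EIND to select the upper half.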

stefanrueger commented 1 year ago

Just released version u7.7 of urboot - thanks for helping out for this issue @mcuee @MCUdude @dl8dtl

stefanrueger / urboot

Which parts have a 22-bit PC? #23