arduino / ArduinoCore-avr

The Official Arduino AVR core
https://www.arduino.cc
1.23k stars 1.05k forks source link

Unnecessary bloat in EEPROM class #452

Open BoannWasHere opened 2 years ago

BoannWasHere commented 2 years ago

Every time EEPROM.get or EEPROM.put is called with a different type it expands to another specialized inlined version of these fussy loops:

template< typename T > T &get( int idx, T &t ){
    EEPtr e = idx;
    uint8_t *ptr = (uint8_t*) &t;
    for( int count = sizeof(T) ; count ; --count, ++e )  *ptr++ = *e;
    return t;
}

template< typename T > const T &put( int idx, const T &t ){
    EEPtr e = idx;
    const uint8_t *ptr = (const uint8_t*) &t;
    for( int count = sizeof(T) ; count ; --count, ++e )  (*e).update( *ptr++ );
    return t;
}

Simpler and more efficient to defer to the universal standard library functions in avr/eeprom.h:

template< typename T > T &get( int idx, T &t ){
    eeprom_read_block(&t, (void*)idx, sizeof(T));
    return t;
}

template< typename T > const T &put( int idx, const T &t ){
    eeprom_update_block(&t, (void*)idx, sizeof(T));
    return t;
}
bxparks commented 2 years ago

With regards to the duplicated template specialization for each type, that was my thought too when I first saw this code. But if you look at the disassembled machine code, the compiler is smart and seems to de-dupe the duplicated specialization.

With regards to eeprom_read_block() and eeprom_update_block(), I see some mention on the internets that these weren't available in early versions of libc-avr.

If you compare the assembly code for the 2 versions of put(), I see some hints that the compiler has more difficulty with optimizing away the seemingly over-engineered EEPtr and EERef abstractions, rather than de-duping the for-loops across different template specializations. But I'm not an expert at reading AVR assembly, so not entirely sure. The difference is only a few dozen bytes of flash, so few people will probably notice. The CPU performance is not an issue since all these functions are dominated by the EEPROM performance.

My impression is that if you really care about this level of optimization, you can go ahead and directly use the libc-avr functions. Portability across different microcontroller platforms will be an issue, but you have to deal with that anyway since there are at least 2 different EEPROM APIs with slightly different semantics (AVR-derived versus ESP-derived).

BoannWasHere commented 2 years ago

the compiler is smart and seems to de-dupe the duplicated specialization

Only for types which are exactly the same size, and even in cases where it can do this, it doesn't beat the eeprom.h functions.

I see some mention on the internets that these weren't available in early versions of libc-avr.

And now they are.