use reinterpret_cast uint8_t[] to uint16_t/uint32_t

Palatis commented 7 years ago

instead of shifting and or-ing the bits. this should gives better performance and less opcodes required.

breaks previous saved data, since avr is big-endian system, low bytes are stored first.

Palatis commented 7 years ago

reinterpret_cast<> will provide faster code, plus,

AVR is basically BE architecture, the current implementation is actually having byte order wrong, resulting weird bit-byte order.

LE machine (higher bits -> lower bits)
[  byte 3  ][  byte 2  ][  byte 1  ][  byte 0  ]
[ 76543210 ][ 76543210 ][ 76543210 ][ 76543210 ]

BE machine (lower bits -> higher bits)
[  byte 0  ][  byte 1  ][  byte 2  ][  byte 3  ]
[ 01234567 ][ 01234567 ][ 01234567 ][ 01234567 ]

current (mix of the two...)
[  byte 3  ][  byte 2  ][  byte 1  ][  byte 0  ] // LE format
[ 01234567 ][ 01234567 ][ 01234567 ][ 01234567 ] // BE format

i think "correcting" them as soon as possible is a good idea, anyway......

Palatis commented 7 years ago

actually, i had a write-through cache for a small portion on the FRAM. usually I read from the memory cache, but write-through the FRAM to make sure that I can restore the cache on boot. I declared a class similar to this:

class cache_layout {
private:
  uint32_t _foo[42];
  uint8_t _crc8;

public:
  void begin() {
    fram.readArray(CONF_ADDR, sizeof(*this), reinterpret_cast<uint8_t *>(this));
    uint8_t calculated_crc8 = _calculate_crc(reinterpret_cast<uint8_t *>(this), sizeof(*this) - 1);
    if (_crc8 != calculated_crc8)
      _init();
  }

  uint32_t getFoo(uint8_t const index) { return _foo[index]; }
  void setFoo(uint8_t const index, uint32_t const value) {
    _foo[index] = value;
    _crc8 = _calculate_crc(reinterpret_cast<uint8_t *>(this), sizeof(*this) - 1);

    fram.writeLong(
      CONF_ADDR + (reinterpret_cast<uint8_t *>(&(_foo[index])) - reinterpret_cast<uint8_t *>(this)),
      _foo[index]
    );
    fram.writeByte(
      CONF_ADDR + (reinterpret_cast<uint8_t *>(&_crc8) - reinterpret_cast<uint8_t *>(this)),
      _crc8
    );
  }

private:
  uint8_t _calculate_crc(uint8_t const * const buffer, uint8_t const length) { /* ... */ }
  void _init() { /* ... */ }
};

as you can see, I only attempt to write sizeof(_foo) + sizeof(_crc8) instead of sizeof(cache_layout) which saves a lot of time since most of the contents remained the same.

especially that I also do dual banking to ensure data integrity even when power-off during write access.

having weird byte-order in the lib limits the ability from directly dump the FRAM content back from using readArray(), and have to break them into readLong(), readWord(), and readByte() on individual data members, which is MUCH slower than one single readArray().

a solution with an FRAM ic integrated is usually more paranoid on security and has less room for tolerances. a faster dump means less system overhead so I can run my business logic sooner, which is always welcome.

Palatis commented 7 years ago

avoiding memory copy is another good idea: https://github.com/Palatis/FRAM_MB85RC_I2C/commit/a2c80db113bf9f7d1defb73e17c1bb3bec3b5b86

Palatis commented 7 years ago

reinterpret_cast<> has another benefit, we can now have

template <typename T>
uint8_t readFrom(uint16_t const address, T & out) {
  return readArray(address, sizeof(data), reinterpret_cast<uint8_t * const>(&data));
}

template <typename T>
uint8_t writeTo(uint16_t const address, T const & data) {
  return writeArray(address, sizeof(data), reinterpret_cast<uint8_t const * const>(&data));
}

to read() / write() all types of data, not just limitted to uint32_t and uint16_t.

sosandroid / FRAM_MB85RC_I2C

use reinterpret_cast uint8_t[] to uint16_t/uint32_t #6