Truncating access for `etl::bitset`

jaskij commented 11 months ago

I find myself with a simple need: to extract a single byte from a fairly large etl::bitset. One big enough that I can not comfortably convert it to an integral variable.

While there are probably workarounds using a bitset while using uint8_t as the underlying buffer, and using etl::bitset::span(), I would prefer a more ergonomic API.

A possible option would be to have a value_truncated<typename T>(), which would return sizeof(T) * CHAR_BIT lowest bits of the bitset. Looking at the existing value<typename T> code, it would probably look something like this:

template <typename T>
ETL_CONSTEXPR14
typename etl::enable_if<etl::is_integral<T>::value, T>::type
  value_truncating(const_pointer pbuffer) const ETL_NOEXCEPT
{
  const size_t COUNT = sizeof(T) * CHAR_BIT;

  T v = T(0);
  uint_least8_t shift = 0U;

  for (size_t i = 0UL; i < COUNT; ++i)
  {
    v |= T(typename etl::make_unsigned<T>::type(pbuffer[i]) << shift);
    shift += uint_least8_t(Bits_Per_Element);
  }

  return v;
}

Combined with existing bitshift operators, this would allow nice, optimized, access to specific parts of a larger bitset.

jwellbelove commented 11 months ago

I'll look into that. Would you expect integrals to be extracted at byte offsets only, or arbitrary offsets into the bitset?

jwellbelove commented 11 months ago

e.g.

etl::bitset<256, uint32_t> bits;
int8_t c = bits.value<int8_t>(19); // Get an int8_t representation of the 8 bits from bit 19 to 26.

jaskij commented 11 months ago

Would you expect integrals to be extracted at byte offsets only, or arbitrary offsets into the bitset?

The use case it came up in was Modbus coil/discrete encoding/decoding, which actually needs arbitrary indices.

And would a byte boundary be meaningful? The one meaningful boundary I can think of is width of the interior storage type (so four-byte in your uint32_t storage example). When backing the bitset with uint32_t, an 8-bit boundary seems as meaningful as a 9-bit one.

Also: thank you for maintaining this library, and for reacting quickly to issues.

jwellbelove commented 11 months ago

Byte boundaries could result in more optimal code.

I'm going to be away for the next three weeks (climbing in Spain) so I may not be able to do much work on this until I get home again.

jaskij commented 11 months ago

Byte boundaries could result in more optimal code.

Since checking for byte boundaries is very cheap, maybe it would be worth the effort of writing both versions, and using one or the other at runtime? OTOH, that increases code size. Decisions, decisions.

I'm going to be away for the next three weeks (climbing in Spain) so I may not be able to do much work on this until I get home again.

Enjoy your vacation! That's more important.

jwellbelove commented 10 months ago

I'm experimenting with a solution for this. There are two new member functions.

template <typename T>
ETL_CONSTEXPR14
T extract(size_t position, size_t length) const

template <typename T, size_t Position, size_t Length>
ETL_CONSTEXPR14
T extract() const ETL_NOEXCEPT

T must be an integral type. position/Position is the index of the least significant bit and length/Length is the number of bits required.

e.g. A bitset containing the bit pattern 0x12345678 Extract an 8 bit uint8_t with the lsb at index 11.

00010010001101000101011001111000
.............^......^

extract<uint8_t>(11, 8) == 0x8A

jaskij commented 10 months ago

Looking at the API, would you consider defaulting length to sizeof(T) * CHAR_BIT? It seems like it would be a good default, at least for the use case I've mentioned.

Also, it would probably be better to actually express the integral constraint in code itself, although it would probably require adding a new macro.

jwellbelove commented 10 months ago

The default length is a good idea.

The integral constraint is built-in with an enable_if. I didn't show it for the sake of clarity.

jaskij commented 10 months ago

The API looks good to me then.

I thought a little about extraction to, say, std::output_iterator or something, but in the end it unnecessarily expands the API surface for no real gain - there isn't much difference between looping inside ETL and the user looping themselves.

jwellbelove commented 10 months ago

BTW rather than using the C style sizeof(T) * CHAR_BIT there is etl::integral_limits<T>::bits available.

jwellbelove commented 10 months ago

I now have a working version of this. The naïve implementation is pretty simple (build the value by iterating through the bitset, testing each bit individually), but I've spent some time creating a version that extracts the data optimally. There is currently a certain amount of duplicate code in the specialised classes, so the next step is to gather the common functionality into a common base class/static functions.

jaskij commented 10 months ago

Great! Thank you!

jwellbelove commented 5 months ago

Added 20.38.11

ETLCPP / etl

Truncating access for `etl::bitset` #774