ilanschnell / bitarray

efficient arrays of booleans for Python

bytereverse documentation #202

Closed kitterma closed 1 year ago

kitterma commented 1 year ago

This isn't an issue with the code; I think the behavior is reasonable, but it should be documented. If the bitarray isn't byte aligned, it looks like the array is filled to the byte boundary with ones and then reversed. The now-hidden bits (hidden because they are past the end of the bitarray) are stored, so that if the bitarray is reversed again, the original value is restored:

>>> import bitarray
>>> e = bitarray.bitarray('101100')
>>> e.bytereverse()
>>> print(e)
bitarray('110011')
>>> e.bytereverse()
>>> print(e)
bitarray('101100')

I'm not submitting a recommended change to the documentation, because I'm not sure what it should be. I do think there should be something in the bytereverse documentation that describes the behavior of the function for partial bytes.

kitterma commented 1 year ago

Although, now that I have done more testing, it looks like the undefined bits are sometimes zero.

ilanschnell commented 1 year ago

The .bytereverse() method reverses the bit order in each byte of the buffer. It does not treat pad bits as zeros, nor does it set them to zero. Therefore, two consecutive calls to this method do not change anything. Note that this would not be the case if the pad bits were set to zero in between. For example, calling .tobytes() sets the pad bits to zero:

>>> from bitarray import bitarray
>>> a = bitarray('101100')
>>> a.bytereverse()  # buffer byte is 00001101 (actually ??001101)
>>> a.tobytes()   # set the two pad bits (01) to zero - the buffer is now 00001100
b'\x0c'
>>> a.bytereverse()
>>> a             # now we no longer have the initial bitarray
bitarray('001100')

Without the call to .tobytes(), two consecutive calls to .bytereverse() (as in your example) leave the bitarray unchanged.
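To make the byte-level arithmetic concrete, here is a minimal sketch in plain Python that models the single buffer byte as an integer. It does not use bitarray itself; the helper names reverse_byte and zero_pad_bits are hypothetical, and it assumes a big-endian bitarray whose two pad bits start out as 00.

```python
def reverse_byte(b: int) -> int:
    """Reverse the bit order of one byte (0..255)."""
    return int(f"{b:08b}"[::-1], 2)


def zero_pad_bits(b: int, nbits: int) -> int:
    """Clear the 8 - nbits trailing (low-order) pad bits of the last byte.

    Assumption: big-endian bit order, so pad bits are the low-order bits.
    """
    pad = 8 - nbits
    return b & (0xFF << pad) & 0xFF


# '101100' plus two pad bits (assumed to be 00) stored in one byte
buf = 0b10110000

once = reverse_byte(buf)          # 0b00001101: the old data bits now sit in the pad positions
twice = reverse_byte(once)        # 0b10110000: the original byte is back

cleared = zero_pad_bits(once, 6)  # 0b00001100: pad bits (01) zeroed, as .tobytes() did in the session above
lossy = reverse_byte(cleared)     # 0b00110000: the first six bits are now 001100, not 101100

for name, value in [("once", once), ("twice", twice),
                    ("cleared", cleared), ("lossy", lossy)]:
    print(f"{name:8s}{value:08b}")
```

Reversing a byte twice restores it for every byte value, which is why leaving the pad bits alone is enough to make two consecutive calls a no-op. Information is lost only when the pad bits are overwritten between the two calls.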

kitterma commented 1 year ago

Thanks. Maybe this change to the documentation then:

For each byte in byte-range(start, stop) reverse the bit order in-place. The start and stop indices are given in terms of bytes (not bits). Also note that this method only changes the buffer; it does not change the endianness of the bitarray object. If the bitarray object length is not byte aligned, zero value pad bits are implicitly added before the bytereverse is performed, but the length of the bitarray is not changed. Bit values that were reversed beyond the end of the bitarray are stored, so a second bytereverse will restore the original values.

That's probably too long, but I'm not sure how to be less verbose.

ilanschnell commented 1 year ago

I just added the sentence "Padbits are left unchanged such that two consecutive calls will always leave the bitarray unchanged." to the docstring. I also added a test to ensure that two consecutive calls to .bytereverse() leave the bitarray unchanged.

kitterma commented 1 year ago

Thanks. Sounds good.

Scott K