Closed aeddins-ibm closed 3 days ago
Strictly, this isn't "endianness", it's the bit-ordering convention (endianness is byte order). Fwiw (I'm not on the primitives team), I think that the terminology here corresponds to that of numpy.packbits
.
Thanks, I didn't know that. I had commonly heard the bit-ordering referred to as endianness (like here, here). I just edited the above to not use the word "endianness" and will try to clean up my terminology going forward.
Terminology aside, I think (?) there is still an error in the docstring with whether "big" or "small" would have the MSB at index array[..., 0]
.
As Jake pointed out, the order
here seems to match how numpy's packbits()
orders bits. From the numpy docs:
If I'm reading this right, "big" here puts the MSB at position 0 (LSB at the end). So AFAICT order="big"
has the opposite usage of big in "big endian". This seems like an unfortunate clash of conventions that will confuse people (clearly it confused me), but I see it's probably better to match the packbits
convention here, so the docstring is then correct. Closing this.
Perhaps at some point we could update the kwarg to be less ambiguous, like MSBF = True/False
for "most significant bit first".
Yeah, we've often (and probably still do...) use "endianness" to refer to bit-ordering conventions in Qiskit, but it's not strictly correct terminology to do so.
I believe the terminology here is still roughly consistent with the formal definition of endianness. Big endian stores the most-significant byte at the lowest memory address (offset 0), then the next most at position 1, etc. Little endian stores the least significant byte at memory offset 0 and so on.
Where this gets confusing is that within a byte, we always write bitstrings with the most-significant bit on the left (which is index 0 if we use a programming string that will print the way we read it), but our intuition from mathematical place value says that the "index" $n$ of each bit should refer to the value by $2^n$, which implies a bit-labelling with the least-significant bit labelled as 0 (so the right-most one). My rough feeling is that the difference between "how we read it" and "how we typically label things from left to right" is where people tend to mix the terms (I know I used to too).
So practically, if bstring = '01011'
is a key in a counts
dictionary, then "little endian" (if we allow describing bit ordering this way) still means the little bit (LSB) is at the end of bstring
(bstring[-1]
).
But bstring
is then understood to be reversed with respect to the order of the bits in each byte, i.e. the bit at memory address n
in a byte corresponds to bstring[::-1][n]
(assuming just 1 byte). And this matters for bitpacking.
Thanks for the explanations. Someday I will understand this with 0 sign errors, instead of just with an even number of sign errors...
Environment
What is happening?
https://github.com/Qiskit/qiskit/blob/25c054251f50871ff9ad1dd5d3a7f2de2b2436fc/qiskit/primitives/containers/bit_array.py#L189-L198
Normally, I'd expect "big" would mean element 0 is the least significant bit, and element -1 (the end) is the most significant bit. So AFAICT this docstring has the two switched.
May be good to confirm that the output of
from_bool_array()
matches the expected endianness too.How can we reproduce the issue?
See above
What should happen?
See above
Any suggestions?
No response