The docs on bitwise operations at https://docs.python.org/3/library/stdtypes.html#bitwise-operations-on-integer-types include the caveated sentence:
> Negative numbers are treated as their 2’s complement value (this assumes that there are enough bits so that no overflow occurs during the operation).
This sentence isn't correct now that integers are always arbitrary length. The bitwise inversion will never overflow, and is instead calculated as "-(n+1)" rather than literally flipping bits in the representation: https://docs.python.org/3/reference/expressions.html#unary-arithmetic-and-bitwise-operations
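A quick REPL check of that identity (added here for illustration):

>>> ~5
-6
>>> ~-6
5
>>> all(~n == -(n + 1) for n in range(-1000, 1000))
True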
Hi, I'm working on this issue.
> This sentence isn't correct now that integers are always arbitrary length.
It's not really clear what that line in the docs means for Python 2, either: if values x and y both fit in an int, then so do `~x`, `x|y` and `x&y`. We already assume in the Python source that the underlying representation is two's complement (no padding bits, no trap representation, etc.), so there aren't any complications from platforms where the C representation is ones' complement or sign-magnitude.
> It's not really clear what that line in the docs means for Python 2, either
Ah, I guess it still kinda sorta applies for the left-shift operator, though even then, Python has promoted the result to long for many versions now. So the only "overflow" that's really relevant on Python 2 is "overflow" from int to long in the case of left shift.
Added 2.7 to the list of affected versions.
Clearly my 2's-complement arithmetic is incredibly rusty, as for some reason I was thinking `~(-sys.maxint-1)` could overflow, but no, the answer to that is just `sys.maxint` :)
I think the simplest fix to make the docs "not wrong" would be to just delete the part in parentheses.
Beyond that, I'm not quite sure how to concisely describe the actual behaviour, but I think the mention of "2's complement" isn't especially helpful in its current form, since we don't give a clear sense of how the translation from an arbitrary length integer to a suitable 2's complement form is handled.
For `~n`, the most concise explanation is the arithmetic equivalent: it is always implemented as `-(n+1)`.

Similarly, for `x << n` and `x >> n`, they're now exactly equivalent to `x * 2 ** n` and `x // 2 ** n`, without any overflow checking or internal representation qualification (as even in Python 2.x, left-shift will automatically promote to a long when needed).
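For instance (a quick illustrative check, with arbitrarily chosen values):

>>> x, n = -37, 3
>>> x << n == x * 2 ** n
True
>>> x >> n == x // 2 ** n
True
>>> ~x == -(x + 1)
True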
For `x | y` and `x & y`, things are a little more subtle, since that's where the internal 2's complement representation comes into play, but you can't just write out the formal definition of 2's complement at the Python level and get the same answer as is given by the binary operators:
>>> -5 & 5
1
>>> -5 | 5
-1
>>> (~-5 + 1) & 5 # Effectively '5 & 5'
5
>>> (~-5 + 1) | 5 # Effectively '5 | 5'
5
>>> -5 | (~5+1) # Effectively '-5 | -5'
-5
>>> -5 & (~5+1) # Effectively '-5 & -5'
-5
The problem here is that the sign bits of the operands matter a great deal, since they affect the sign extension in the conversion to the 2's complement form, but that detail gets lost if the conversion is done prior to the bitwise binary operator invocation.
One way to get the same results as the interpreter-level algorithms is to use a 2's complement bit length of `1 + max(x.bit_length(), y.bit_length())`, so the equivalent operations become:
>>> bin(0b1011 & 0b0101) # -5 & 5 -> 1 in 4 bit 2's complement
'0b1'
>>> bin(0b1011 | 0b0101) # -5 | 5 -> -1 in 4 bit 2's complement
'0b1111'
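A minimal sketch of that approach (the helper names `to_tc`/`from_tc` are invented for this illustration, not stdlib functions):

```python
def to_tc(n, width):
    # Reducing n modulo 2**width gives its width-bit two's complement pattern.
    return n & ((1 << width) - 1)

def from_tc(pattern, width):
    # Reinterpret the top bit of the pattern as a sign bit.
    return pattern - (1 << width) if pattern & (1 << (width - 1)) else pattern

def tc_and(x, y):
    width = 1 + max(x.bit_length(), y.bit_length())
    return from_tc(to_tc(x, width) & to_tc(y, width), width)

def tc_or(x, y):
    width = 1 + max(x.bit_length(), y.bit_length())
    return from_tc(to_tc(x, width) | to_tc(y, width), width)

assert tc_and(-5, 5) == (-5 & 5) == 1
assert tc_or(-5, 5) == (-5 | 5) == -1
```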
So perhaps a helpful change to make would be to move the note about negative numbers to a numbered footnote in the table, and state that the bitwise binary operations are semantically equivalent to calculations using two's complement in a bit-width of `1 + max(x.bit_length(), y.bit_length())`.
So on re-reading the docs, I think we're misinterpreting this part:
> this assumes that there are enough bits so that no overflow occurs during the operation
One way to think of `|` and `&` (and `~` and `^`, too):

1. Find a positive integer n such that both x and y can be represented *without overflow* in n-bit two's complement.
2. Do the computation x | y (or x & y, x ^ y, ~x, as appropriate) in n-bit two's-complement arithmetic, giving an n-bit two's complement result that we re-interpret as a signed integer in the usual way.
I think the "so that no overflow occurs" refers to choosing n sufficiently large in the first step above. Note that it doesn't matter what value of n we choose to use, so long as it's large enough: evaluating 5 & -17 will work just as well using 8-bit two's complement as using 23-bit two's complement; we'll get the same result either way.
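A quick illustration of that width-independence (the helper `and_in_width` is made up for this demo):

>>> def and_in_width(x, y, width):
...     mask = (1 << width) - 1
...     r = (x & mask) & (y & mask)        # compute in width-bit two's complement
...     return r - (1 << width) if r >> (width - 1) else r
...
>>> and_in_width(5, -17, 8)
5
>>> and_in_width(5, -17, 23)
5
>>> 5 & -17
5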
(I personally tend to find it easier to think in terms of the infinite 2-adic representation, which is essentially what you get by extending the 0 or 1 sign bit leftwards into an infinite string of 0s or 1s.)
Adding Tim Peters to the nosy, since I suspect (without actually having checked) that this is his language.
I find the model in terms of `bit_length` hard to understand. You have to understand what `bit_length` returns, and why you added 1. `bit_length` is awkward for negative numbers: it only uses the absolute value, which would give off-by-one problems with negative values, so I guess you added 1 to compensate.
I understand the bitwise operations as using two’s complement extended to an unlimited width, so that negative values have a series of ones for the most-significant bits. I presume this is what your “2-adic representation” is. Having this spelled out may have helped when I was learning Python.
Right, and that's why I think we're better off focusing on the arithmetic explanations wherever they apply. The problem is that for "x | y" and "x & y" there's no avoiding discussing the 2's complement representation.
Martin, would you find the reference to `bit_length()` in the current PR easier to follow if it had a second follow-up sentence like the one below:
===
Bitwise binary operations are semantically equivalent to calculations using 2's complement in a bit-width of `1 + max(x.bit_length(), y.bit_length())`. This choice of bit-width ensures there is sufficient space for the absolute value of both operands, while also providing space for an explicit sign bit (representing the conceptually infinite series of zeros or ones at the left of a 2's complement value).
===
That retains the precision of the currently suggested definition (for the benefit of language implementors), but also spells out the rationale for that definition (the "1 +" is for the sign bit, while the abs() is implicit in the fact that bit_length() assumes 2's complement and hence doesn't allow space for an explicit sign bit).
Mentioning “sufficient space” is nice because it relieves concerns about overflows and running out of bits. But “sufficient space for the absolute value” seems tricky if not misleading, because two’s complement does not directly record the absolute value, and extreme values like -128 need fewer bits in two’s complement (7 + sign) than with a separate magnitude (8 + sign).
What about:
“Bitwise operations have the same result as calculations using two’s complement with a bit-width large enough to avoid overflows.”
I’m not sure that a precise definition is necessary, but I would say a bit-width k must be chosen such that -2**(k-1) <= x < 2**(k-1) for all operands x (and for the left shift x << n, the width must be k + n).
> “Bitwise operations have the same result as calculations using two’s complement with a bit-width large enough to avoid overflows.”
That sounds fine to me, but then, the original wording sounds fine to me now that I know how to read it. :-) The main issue here is making it clear that in "avoid overflow", we're talking about overflow incurred in representing a value in two's complement in the first place, as opposed to overflow of the operation itself.
I'd go with something like the following (which evolved by successive refinement from Martin's suggestion):
"Each bitwise operation has the same result as though carried out in two's complement using a bit-width that's large enough to represent the inputs."
> I presume this is what your “2-adic representation” is.
Yes, exactly. Without invoking 2-adic numbers, which is a bit of overkill, there's a natural one-to-one correspondence between
(a) integers, and (b) (singly) infinite bit strings, extending infinitely to the left, in which either all but finitely many bits are zero, or all but finitely many bits are one.
In the domain (b), the bitwise operations have their obvious bitwise meanings; translating via the correspondence gives us the corresponding definitions of the bitwise operations on (a).
For the correspondence: going from (a) to (b): take an integer n, then for each k >= 0 reduce n modulo 2^k to get a length-k bit string. Now it's easy to see that the length-k bit strings are all compatible with one another, in the sense that they all agree with each other when right-aligned, so you naturally get an infinite-length bit string that's eventually either all 1s (for negative n) or all zeros (for nonnegative n).
Going back from (b) to (a): it's not hard to convince yourself that the map above is one-to-one and onto, but then you miss out on a beautiful description of the inverse map: given an infinite bit-string indexed as below, with b_0 the least significant bit:
... b_{k+1} b_k b_{k-1} ... b_2 b_1 b_0
we simply map that bit string to the integer
n = sum_{i>=0} b_i * 2**i
Of course the RHS of the above is an infinite sum, so a priori doesn't make sense. If almost all the bits are zeros, it becomes a finite sum and everything's okay. If almost all the bits are ones, you can either (i) introduce the 2-adic topology on the integers and point out that it still converges in that topology, so that everything has a sound mathematical footing, or (ii) just use the "trick" that 1 + 2 + 4 + 8 + 16 + ... = -1, which more-or-less amounts to the same thing.
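The correspondence is easy to poke at from the REPL, since bit i of n is just `(n >> i) & 1` (the helper below is purely illustrative):

>>> def bits(n, count=8):
...     return [(n >> i) & 1 for i in range(count)]  # least-significant bit first
...
>>> bits(6)
[0, 1, 1, 0, 0, 0, 0, 0]
>>> bits(-1)      # all but finitely many bits are one
[1, 1, 1, 1, 1, 1, 1, 1]
>>> bits(-6)
[0, 1, 0, 1, 1, 1, 1, 1]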
FWIW I find Mark’s suggestion pretty good:
> “Each bitwise operation has the same result as though carried out in two's complement using a bit-width that's large enough to represent the inputs.”
I like Mark's phrasing as well. For precision, I'd still like to give an exact algorithmic formulation of what "large enough" means in this context, though.
Something like:
Each bitwise operation has the same result as though carried out in two's complement using a bit-width that's large enough to represent the inputs. ("Large enough" for this purpose is ``1 + max(x.bit_length(), y.bit_length())``, with the extra bit being needed to handle sign extension appropriately.)
To answer the old accusation ;-), no, this isn't my wording. I _always_ explain that Python's integer bit operations act as if the integers were stored in 2's-complement representation but with an infinite number of sign bits. That's all. That provides insight.
For example, then it's dead obvious that `-1 == ~0` (both an infinite solid string of 1 bits); that for any integer `i`, `-1 ^ i == ~i` (both flip each bit in `i`); and that for any positive integers `i` and `j` it's necessarily the case that `-i ^ -j` is non-negative (because the infinite strings of sign bits cancel out).
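A quick spot check of those identities (illustrative only):

>>> -1 == ~0
True
>>> all(-1 ^ i == ~i for i in range(-100, 100))
True
>>> all(-i ^ -j >= 0 for i in range(1, 50) for j in range(1, 50))
True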
The reference manual is under no obligation to explain how to _implement_ this illusion, and I don't think it's helpful to try. People here are struggling to explain how to pick a number of bits "big enough" to make it all work out on a case by case basis, but the single answer "infinity" is big enough to apply in all cases ;-)
I think we have a fairly different notion of what clarity means here. I have no mental concept whatsoever of how to do two's complement arithmetic with an infinite number of sign bits (I learned most of what I know about two's complement by working with fixed-point 16-bit and 32-bit microprocessors), so the infinite-bits explanation provides me with very little useful insight. By contrast, I can readily cope with the notion of storing a single extra sign extension bit beyond the minimum required to hold the operands' two's complement representations.
That said, I do like the idea of using infinite precision arithmetic as the formal definition of the intended language semantics, which would lead to wording like the following:
=================
Each bitwise operation has the same result as though carried out in two's complement with an infinite number of sign bits. In practice, performing the calculation with one extra sign extension bit (a bit-width of `1 + max(x.bit_length(), y.bit_length())`) is sufficient to get the expected result.
=================
Nick, that seems a decent compromise. "Infinite string of sign bits" is how Guido & I both thought of it when the semantics of longs were first defined, and others in this report apparently find it natural enough too. It also applies to all 6 operations in the table as-is.
It appears that a bit-width of ``1 + max(x.bit_length(), y.bit_length())`` only applies as-is to 3 of the 6 (`~` has only one operand, while the bit length of the RHS doesn't matter for `<<` and `>>`). Provided that's clarified, I'd only suggest inserting "at least" before "one extra sign extension bit" and after "a bit-width of". That's a bridge between the "infinite" and "fixed-albeit-variable-width" views: "plus 1" is the smallest approximation to infinity that works, but anything at least that large works too.
The restriction of the footnote to `x & y`, `x | y`, and `x ^ y` was going to come from the fact that only those rows in the table will reference the new note. However, it likely makes sense to repeat the relevant expressions in the footnote as well, since that makes it clearer what `x` and `y` refer to in the second sentence.
Latest proposal:
=================
The results of `x | y`, `x ^ y`, and `x & y` are calculated as though carried out in two's complement with an infinite number of sign bits. In practice, performing the calculation with at least one extra sign extension bit (a working bit-width of `1 + max(x.bit_length(), y.bit_length())` or more) is sufficient to get the expected result.
=================
Well, all 6 operations "are calculated as though carried out in two's complement with an infinite number of sign bits", so I'd float that part out of the footnote and into the main text. When, e.g., you're thinking of ints as bitstrings, it's essentially useless to think of `n >> 13` as being equivalent to `n // 2**13`.
Quick: what's `~0 >> 13`? Well, `~0` is an infinite string of 1 bits, so shifting it right by any finite number of bits doesn't change it. That, mathematically, floor(~0 / 8192) = -1 is only interesting if you're thinking of `~0` as being an integer instead.

And, if you are, _then_ you need to know that the infinite string of 1 bits `~0` represents is viewed as being a 2's-complement representation of -1. But so long as you're sticking to the bitstring view, the "2's complement" part is irrelevant to anything these 6 operations do.
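Both views do agree, of course:

>>> ~0 >> 13
-1
>>> ~0 // 2 ** 13
-1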
Indeed, in the very earliest versions of Python, longs (but not ints!) were viewed as being 1's-complement infinite bitstrings, but "infinite string of sign bits" was just as applicable to what these operations did then. The meaning of what these operations compute _as bitstrings_ is independent of how bitstrings are mapped to and from integers. When longs changed from 1's-comp to 2's-comp only the latter changed; code using longs as bitstrings wasn't affected.
So, in all, there's quite a bit of background these telegraphic docs are glossing over. You (Nick) don't seem to ever think of them as being bitstrings, but what the "bitwise operators" do was driven by the bitstring view. That some of them can also be defined by arithmetic (+ - * /) is secondary. It may well take more text to get that all across than is suitable here, though.
OK, that makes sense to me. Given that, there'd be two changes proposed.
Change 1:
=====================
Bitwise operations only make sense for integers. The result of bitwise operations is calculated as though carried out in two's complement with an infinite number of sign bits.
=====================
Change 2 would add a new footnote (4) to the table for the `|`, `^`, and `&` entries that reads:
=====================
Performing these calculations with at least one extra sign extension bit in the internal representation (a working bit-width of `1 + max(x.bit_length(), y.bit_length())` or more) is sufficient to get the same result as if there were an infinite number of sign bits.
=====================

The wording for change 1 looks fine to me.
For change 2, the mention of the internal representation is misleading, since the internal representation of (long) integers in current CPython is effectively sign-magnitude, and so there are some shenanigans to make the operations behave *as though* the internal representation were some form of two's complement [1]. The previously proposed wording (in msg321679) (with the "infinite sign bits" extracted into the main text as Tim suggests) looks fine to me.
Ya, Mark's got a point there. Perhaps `s/the internal/a finite two's complement/`?
Ah, "the internal representation" was meant to refer a hypothetical representation, rather than literally to CPython's actual implementation, but now that you point it out, I agree my wording is ambiguous. I like Tim's suggested replacement:
\=====================
1 + max(x.bit_length(), y.bit_length()
or more) is sufficient to get the same result as if there were an infinite number of sign bits.
\=====================
- Performing these calculations with at least one extra sign extension bit in a finite two's complement representation (a working bit-width of
1 + max(x.bit_length(), y.bit_length()
or more) is sufficient to get the same result as if there were an infinite number of sign bits.
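As a sanity check on that wording, here's a brute-force verification over small operands (illustrative only; `via_finite_tc` is an invented name):

```python
import operator

def via_finite_tc(op, x, y, extra=1):
    # Evaluate op in two's complement with `extra` sign extension bits
    # beyond max(x.bit_length(), y.bit_length()).
    width = extra + max(x.bit_length(), y.bit_length())
    mask = (1 << width) - 1
    r = op(x & mask, y & mask) & mask
    return r - (1 << width) if r >> (width - 1) else r

for op in (operator.and_, operator.or_, operator.xor):
    for x in range(-64, 64):
        for y in range(-64, 64):
            assert via_finite_tc(op, x, y) == op(x, y)
            assert via_finite_tc(op, x, y, extra=7) == op(x, y)  # "or more"
```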
LGTM
Seems good to me. I've made the changes in the PR.
@CuriousLearner, does the PR also include Nick's first suggested change? Here:
"""
\===================== Bitwise operations only make sense for integers. The result of bitwise operations is calculated as though carried out in two's complement with an infinite number of sign bits. \===================== """
Hey Tim,
> @CuriousLearner, does the PR also include Nick's first suggested change? Here:
>
> """
> =====================
> Bitwise operations only make sense for integers. The result of bitwise operations is calculated as though carried out in two's complement with an infinite number of sign bits.
> =====================
> """
I think it was then discussed to keep this line as:

"""
===========
Bitwise operations only make sense for integers. Negative numbers are treated as their 2's complement value.
===========
"""
Does this need to be changed?
Here is the link of the PR: https://github.com/python/cpython/pull/1691/files#diff-7498e907ba97646df434a0eb583c6909
Nick suggested two changes on 2018-07-15 (look above). Mark & I agreed about the first change, so it wasn't mentioned again after that. All the rest has been refining the second change.
Oh, yes, I think I missed the first point earlier. Thank you! I made the changes.
New changeset b4bc5cab82e6855e4ebc33ba0b669ddffad30fb3 by Nick Coghlan (Sanyam Khurana) in branch 'master': bpo-29710: Clarify documentation for Bitwise binary operation (GH-1691) https://github.com/python/cpython/commit/b4bc5cab82e6855e4ebc33ba0b669ddffad30fb3
Mark & Tim: thanks for the discussion and clarifications above, I learned some new things myself!
Sanyam: thanks for the patch, and for your patience while we figured out what we wanted the new wording to actually say :)
(Technically the backport PRs are still in progress, but I'll track that through the GitHub notifications)
New changeset 8764a6ffda896af4586f07b55d7df916f86dd9b0 by Miss Islington (bot) in branch '3.7': bpo-29710: Clarify documentation for Bitwise binary operation (GH-1691) https://github.com/python/cpython/commit/8764a6ffda896af4586f07b55d7df916f86dd9b0
New changeset 3100b7e710dccdcfbc6991ea7e8985a1881d42e6 by Miss Islington (bot) in branch '3.6': bpo-29710: Clarify documentation for Bitwise binary operation (GH-1691) https://github.com/python/cpython/commit/3100b7e710dccdcfbc6991ea7e8985a1881d42e6
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['type-feature', '3.7', 'docs']
title = 'Incorrect representation caveat on bitwise operation docs'
updated_at =
user = 'https://github.com/ncoghlan'
```
bugs.python.org fields:
```python
activity =
actor = 'miss-islington'
assignee = 'docs@python'
closed = True
closed_date =
closer = 'ncoghlan'
components = ['Documentation']
creation =
creator = 'ncoghlan'
dependencies = []
files = []
hgrepos = []
issue_num = 29710
keywords = ['patch']
message_count = 33.0
messages = ['288890', '288893', '288899', '288901', '288998', '293759', '294090', '294091', '305616', '305618', '305620', '305642', '307471', '307482', '307485', '321652', '321677', '321679', '321682', '321683', '321754', '321756', '321996', '322237', '322239', '322242', '322245', '322246', '322248', '322525', '322526', '322579', '322580']
nosy_count = 8.0
nosy_names = ['tim.peters', 'mark.dickinson', 'ncoghlan', 'docs@python', 'martin.panter', 'wolma', 'CuriousLearner', 'miss-islington']
pr_nums = ['1691', '8508', '8509']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29710'
versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']
```