Closed 243bf073-7a9a-4861-97f6-492d27152381 closed 1 year ago
Bitwise operators have inconsistent behavior when acting on bool values: (Python 3.7.4)
# "&" works like "and"
>>> True & True
True
>>> True & False
False
>>> False & False
False
# "|" works like "or"
>>> True | True
True
>>> True | False
True
>>> False | False
False
# "~" does not work like "not"!
>>> ~True
-2
>>> ~False
-1
The result of this is that a user might start working with "&" and "|" on bool values (for whatever reason) and it will work as expected. But then, when adding "~" to the mix, things start to break.
The proposal is to make "~" act like "not" on bool values, i.e. ~True will be False; ~False will be True.
I'm not sure if this has any negative impact on existing code. I don't expect any, but you can never know. If there is no objection to this change, I can even try to implement it myself and submit a patch.
For instances of `int`, `~` does bitwise negation (with the usual two's-complement with an infinite number of bits model that Python uses for all bitwise operations on arbitrary-precision integers).
And rightly or wrongly, `True` and `False` are instances of `int`, so it should be possible to use `True` almost anywhere you'd usually use `1`, with no change in behaviour. The proposed change would give us `True == 1` but `~True != ~1`.
So I think we're stuck with the current behaviour.
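The substitutability argument above can be checked directly. This is a minimal sketch using only built-ins; the helper name `invert` is made up for illustration, and it silences the `DeprecationWarning` that newer Python versions may emit for `~` on a bool (the deprecation is proposed later in this thread).

```python
import warnings


def invert(value):
    """Apply ~ to value, ignoring any DeprecationWarning for bool operands."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return ~value


# For Python ints of any size, bitwise inversion obeys ~x == -x - 1.
for x in (-3, -1, 0, 1, 2, 10**30):
    assert invert(x) == -x - 1

# bool is a subclass of int, with True == 1 and False == 0 ...
assert isinstance(True, int)
assert True == 1 and False == 0

# ... so equal inputs currently give equal outputs.
assert invert(True) == invert(1) == -2
assert invert(False) == invert(0) == -1

# Both results are non-zero and therefore truthy, which is what the
# issue title 'bool(~True) == True' refers to.
assert bool(invert(True)) is True
assert bool(invert(False)) is True
```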
Given a time machine, this could arguably be "fixed" by making `True` equal to `-1` rather than `1`... But absent that time machine, I'd expect some amount of breakage from the proposed change.
It's worth noting that NumPy's `bool_` type does do this:
>>> import numpy as np
>>> ~np.bool_(True)
False
>>> ~np.bool_(False)
True
But `np.bool_` doesn't have the same "is-a" relationship with integers:
>>> np.bool_.__mro__
(<class 'numpy.bool_'>, <class 'numpy.generic'>, <class 'object'>)
IOW, -1 from me.
Looks like this is essentially a duplicate of bpo-12447
+0 This proposal may be worth re-considering. I've seen the problem arise in practice on multiple occasions. I suspect that it will continue to give people trouble.
Right now, a bool is-a int that 1) only has two singleton instances equal to zero and one, 2) has a different repr, and 3) has the & | and ^ operations redefined to return instances of bool.
I think we could also override the ~ operation. That would be a Liskov violation, making bools slightly less substitutable for ints, but it does so in a way that is intuitive and likely to match what a user intends when inverting a bool.
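The three properties listed above can be observed from plain Python. The following is a small sketch, not taken from the thread; the variable names are only for illustration.

```python
# 1) bool is a subclass of int with exactly two instances, equal to 0 and 1.
assert issubclass(bool, int)
assert bool(1) is True and bool(0) is False  # always the same two objects
assert True == 1 and False == 0

# 2) bool has its own repr.
assert repr(True) == "True" and repr(1) == "1"

# 3) &, | and ^ between two bools return a bool ...
and_type = type(True & False)
or_type = type(True | False)
xor_type = type(True ^ True)
assert and_type is bool and or_type is bool and xor_type is bool

# ... but mixing in a plain int, or using arithmetic, falls back to int.
mixed_type = type(True & 1)
sum_type = type(True + True)
assert mixed_type is int and sum_type is int
```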
See also the discussion started by Antoine here:
https://mail.python.org/pipermail//python-ideas/2016-April/039488.html
-1 from me too.
Making True == -1 looks interesting, but it has drawbacks.
> Making True == -1 looks interesting, but it has drawbacks.
Yes, please ignore that part of my post. :-) It shouldn't be considered seriously until a time machine turns up (and probably not even then).
My main worry with the proposed change is accidental breakage from the change in meaning. I've so far failed to find any examples of real-world functions that could/would be broken - the closest I've come is floating-point bit-pattern manipulation functions (constructing a bit-string from a sign, exponent and significand, where it's quite natural to treat the sign both as an "is_negative" boolean and as a 0-or-1 integer). But that case didn't involve a `~sign` at any point, so it doesn't count.
Still, I have a nagging suspicion that such a function will turn up if we make this change.
Having ~True *not* be the same as ~1 feels like a bigger surprise to me than having ~True not be False; it breaks my simple mental model that bools always behave like ints in numeric contexts.
There's also the minor annoyance that there isn't currently an obvious safe way to convert an integer-like thing to an actual int, to make sure that bools do the right thing in a numeric context. operator.index *ought* to be that obvious way, but it leaves bools untouched.
>>> import operator
>>> operator.index(True)
True
Mark, isn't `int()` the obvious way "to convert an integer-like thing to an actual int"?
>>> int(True)
1
>>> int(False)
0
For the rest, I'm -True on making ~ do something magical for bools inconsistent with what it does for ints. "is-a" is a promise.
For reference, the link to the Python-Ideas discussion in Mailman 3:
> isn't `int()` the obvious way "to convert an integer-like thing to an actual int"?
Well sorta, except that it's too lenient, letting in strings, floats, Decimal instances and the like. The strings aren't so much of an issue - it's more the silent truncation with the floats and Decimals that's problematic.
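The leniency described here is easy to reproduce. Below is a rough sketch contrasting `int()` with `operator.index()`; the helper `strict_index` is a hypothetical name introduced only for this example. What `operator.index()` returns for a bool is deliberately not asserted, since the thread only shows it for Python 3.7.

```python
import operator
from decimal import Decimal

# int() accepts strings and silently truncates non-integral numbers.
assert int("42") == 42
assert int(3.9) == 3
assert int(Decimal("2.9")) == 2


def strict_index(value):
    """Return operator.index(value), or None if value is not integer-like."""
    try:
        return operator.index(value)
    except TypeError:
        return None


# operator.index() only accepts objects that implement __index__.
assert strict_index(7) == 7
assert strict_index(3.9) is None
assert strict_index("42") is None
assert strict_index(Decimal("2.9")) is None
```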
Okay, we'll just continue to tell users "you're holding it wrong" ;-)
It may have been a mistake to directly support | & and ^. That implies ~ should work. The fallback is to use "not" but that looks weird and has the wrong operator precedence:
(not a) ^ (not b & c)
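The precedence point can be illustrated with the standard `ast` module. This is only a sketch: the expressions and the variable names `a` and `b` are chosen for the example and are not from the thread.

```python
import ast

# `not` binds more loosely than &, so `not a & b` means `not (a & b)`.
loose = ast.parse("not a & b", mode="eval").body
assert isinstance(loose, ast.UnaryOp) and isinstance(loose.op, ast.Not)
assert isinstance(loose.operand, ast.BinOp)

# `~` binds more tightly than &, so `~a & b` means `(~a) & b`.
tight = ast.parse("~a & b", mode="eval").body
assert isinstance(tight, ast.BinOp) and isinstance(tight.op, ast.BitAnd)
assert isinstance(tight.left, ast.UnaryOp) and isinstance(tight.left.op, ast.Invert)

# The two readings of `not a & b` really do give different answers.
a, b = False, False
without_parens = not a & b    # not (False & False) -> True
with_parens = (not a) & b     # True & False -> False
assert without_parens is True
assert with_parens is False
```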
I don't agree that "~" doesn't "work". If people are reading it as "not", they're in error. The Python docs say `~x` means "the bits of x inverted", and that's what it does. There's no sense in which it was _intended_ to be logical negation, no more in Python than in C (C has the distinct unary prefix "!" operator for truthiness negation, and, as you note, Python has "not").
It's educational ;-) to learn how analogies between ints and bools can break down. For logical negation in the sense they want here, it's not "~x" they want but "~x & 1" - that is, if someone insists on using an inappropriate operator.
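A short check of the "~x & 1" spelling follows. The helper name `masked_invert` is invented for this sketch, and the warning filter is there only because newer Python versions may warn about `~` on a bool. The `flag ^ True` alternative at the end is not from the thread; it is included because, unlike `~x & 1`, it returns a bool.

```python
import warnings


def masked_invert(flag):
    """Logical negation spelled with bitwise operators: ~flag & 1."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return ~flag & 1


# The masked result compares equal to `not flag` for both bools.
for flag in (False, True):
    assert masked_invert(flag) == (not flag)

# The result is a plain int (0 or 1), not a bool.
assert type(masked_invert(True)) is int

# XOR with True also flips a bool, and stays within bool.
flipped_true = True ^ True
flipped_false = False ^ True
assert flipped_true is False
assert flipped_false is True
```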
BTW, I should clarify that I think the real "sin" here was making bool a subclass of int to begin with. For example, there's no sane reason at all for bools to support division, and no reason for a distinct type not to define "~" any way it feels like. Advertised inheritance is usually a Bad Idea ;-)
\<snark> We could extend bool with shades of grey that close the 2-bit, signed set over the complement: {-2, -1, 0, 1}. For example, the bitwise complement of False could be RealNews (-1, 0b11) and the bitwise complement of True could be FakeNews (-2, 0b10). The bool() value of some built-in objects could be declared as RealNews or FakeNews by decree of the steering committee. For other projects this would have to be subject to opinion, which will probably lead to endless internal debates and flame wars.
In a boolean context, FakeNews would be falsey and RealNews would be truthy. For bitwise operations, we have the following:
~False -> RealNews
~True -> FakeNews
False & FakeNews -> False
False & RealNews -> False
True & FakeNews -> False
True & RealNews -> True
RealNews & FakeNews -> FakeNews
False | FakeNews -> FakeNews
False | RealNews -> RealNews
True | FakeNews -> RealNews
True | RealNews -> RealNews
RealNews | FakeNews -> RealNews
False ^ FakeNews -> FakeNews
False ^ RealNews -> RealNews
True ^ FakeNews -> RealNews
True ^ RealNews -> FakeNews
RealNews ^ FakeNews -> True
\</snark>
:-D
Essentially we've got two competing desires:
Given that & | and ^ are closed under bools, it would be nice for ~ to be closed as well. NOT isn't a reasonable alternative because of its operator precedence.
Given that bool is a subclass of int, its operations should give results equivalent to what you would get for ints.
When reopening this, my thought was that the first desire should win based on practicality-beats-purity. In the context of bools, the current ~ operator violates user expectations and there isn't a reasonable alternative that has the correct precedence.
But in the face of opposition to the idea, I am willing to just let it die. In the scheme of things, it isn't important.
[Raymond]
> Given that & | and ^ are closed under bools [...]
So maybe the right fix is to change that fact? I'm not sure what the value of having True & True return True rather than 1 is, beyond misleading people into thinking that bitwise operators "just work" as logical operators on bools. Having True & True give 1 would send a clearer message that "yes, this works, but only because of the bool-is-an-int relationship, and it's not the right way to do logical operations".
Does anyone know what the rationale was for having & and | on bools return bools in the first place?
> Does anyone know what the rationale was for having & and | on bools return bools in the first place?
Besides the fact that they can be defined on bool in compatible way and these operators often are used for booleans?
It was in the initial version of PEP-285 and I have not found any questions or discussion about it.
It never occurred to me that making b&b and b|b return bool would be considered a bad thing just because ~b is not a bool. That's like complaining that 1+1 returns an int rather than a float for consistency with 1/2 returning a float.
Because bool is embedded in int, it's okay to return a bool value that compares equal to the int from the corresponding int operation. Code that accepts ints and is passed bools will continue to work. But if we were to make `~b` return `not b`, that makes bool not embedded in int (for the sake of numeric operations).
Take for example
    def f(a: int) -> int:
        return ~a
I don't think it's a good idea to make f(0) != f(False).
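To make the concern concrete, here is a sketch of what the divergence would look like. Since `bool` itself cannot be changed or subclassed from user code, the example uses a hypothetical `int` subclass, `LogicalFlag`, whose `~` performs logical negation; it stands in for the proposed behaviour and is not part of Python. The `f` below is the function from the comment above, with a warning filter added because newer Python versions may warn about `~` on a bool.

```python
import warnings


def f(a: int) -> int:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return ~a


class LogicalFlag(int):
    """Hypothetical int subclass whose ~ means logical negation."""

    def __invert__(self):
        return LogicalFlag(not self)


# Today, bool is embedded in int: equal arguments give equal results.
assert 0 == False
assert f(0) == f(False) == -1

# With logical ~, an argument equal to 0 no longer behaves like 0.
flag = LogicalFlag(0)
assert flag == 0
assert f(flag) == 1
assert f(flag) != f(0)
```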
> Because bool is embedded in int, it's okay to return a bool value *that compares equal to the int from the corresponding int operation*.
Agreed that it's okay, but I'd like to understand why it's considered desirable. What use-cases benefit from having `x | y` give `True` or `False` rather than `1` or `0` when `x` and `y` are bools? Is the intent that `x & y` and `x | y` provide shorter ways to spell `x and y`, `x or y`, or (as I think Serhiy's suggesting) is this about catering to people coming from other languages and expecting `&` and `|` to be the right operations for doing logic with bools?
From my integer-centric point of view, | and & are bitwise integer operations, not logical operations; they only happen to apply to bool because a bool is an int, but they're not natural boolean operations (in exactly the same way that +, -, *, etc. aren't natural boolean operations). "and" and "or" seem the "one obvious way to do it" for logical operations on bools; I don't think I understand why anyone would want to use | and & on bools to get another bool, instead of just using `or` and `and`.
> > Because bool is embedded in int, it's okay to return a bool value *that compares equal to the int from the corresponding int operation*.
> Agreed that it's okay, but I'd like to understand why it's considered desirable. What use-cases benefit from having `x | y` give `True` or `False` rather than `1` or `0` when `x` and `y` are bools? Is the intent that `x & y` and `x | y` provide shorter ways to spell `x and y`, `x or y`, or (as I think Serhiy's suggesting) is this about catering to people coming from other languages and expecting `&` and `|` to be the right operations for doing logic with bools?
> From my integer-centric point of view, | and & are bitwise integer operations, not logical operations; they only happen to apply to bool because a bool is an int, but they're not natural boolean operations (in exactly the same way that +, -, *, etc. aren't natural boolean operations). "and" and "or" seem the "one obvious way to do it" for logical operations on bools; I don't think I understand why anyone would want to use | and & on bools to get another bool, instead of just using `or` and `and`.
For one thing, you can override `&` and `|` but you can't override `and` and `or`.
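As an illustration of that difference, here is a sketch with a made-up `Condition` class (hypothetical, loosely in the style of query builders) that overloads `&` and `|` through `__and__` and `__or__`. The `and` and `or` keywords have no such hooks: they test the truthiness of the left operand and return one of the operands unchanged.

```python
class Condition:
    """Hypothetical condition object that builds an expression string."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Condition(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Condition(f"({self.text} OR {other.text})")


x = Condition("age > 18")
y = Condition("country = 'NL'")

# & and | call the overloaded methods and build a new object.
both = x & y
either = x | y
assert both.text == "(age > 18 AND country = 'NL')"
assert either.text == "(age > 18 OR country = 'NL')"

# `and` / `or` cannot be overloaded. Both operands are truthy here
# (no __bool__ or __len__ is defined), so `and` returns the right
# operand and `or` returns the left one, untouched.
assert (x and y) is y
assert (x or y) is x
```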
Probably when we introduced bool we should have thought harder about it, but I don't think we should change anything at this point, so I'm not sure whether it's worth trying to uncover the original deep motivations (probably they weren't so deep).
s/book/bool/
I would like to reconsider this topic.
From a user point of view:
- `~` on bool is commonly seen as negation by users because it works like this in numpy
- [...] `~` to work on bool as logical operations
- `~` on `bool` behavior is inherited from `int`, but I suppose this inheritance relation is not clear to many users
As a result, chances are high that users use `~` on `bool` with the incorrect expectation of boolean logic - Not so great. Can we do anything better than the current state?
Proposal: Deprecate `~` on `bool` and raise an error in the future.
On the downside, this would still violate Liskov's Principle, though arguably less badly than changing to another result with logical-operation semantics. It may break the very few use cases where people intentionally want the bitwise operation on the underlying int representation, but they could move from `~val` to the more explicit `~int(val)`.
Overall, we may consider accepting this breakage to make people aware of the likely incorrect usage.
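The two explicit spellings a user would migrate to can be sketched as follows. The function names are hypothetical. `warns_on_invert` simply reports what the running interpreter does; its result for a bool depends on the Python version, so it is printed rather than asserted.

```python
import warnings


def bitwise_not(val):
    """Explicit bitwise inversion of the underlying int: ~int(val)."""
    return ~int(val)


def logical_not(val):
    """Explicit logical negation."""
    return not val


assert bitwise_not(True) == -2
assert bitwise_not(False) == -1
assert logical_not(True) is False
assert logical_not(False) is True


def warns_on_invert(val):
    """Report whether ~val emits a DeprecationWarning on this interpreter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ~val
    return any(issubclass(w.category, DeprecationWarning) for w in caught)


# Plain ints are unaffected by the proposal.
assert warns_on_invert(1) is False
# For bools the answer depends on the Python version in use.
print("~True warns:", warns_on_invert(True))
```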
> Proposal: Deprecate `~` on `bool` and raise an error in the future.
Yeah, that sounds reasonable. Do you want to submit a PR that does this in 3.12?
Great! I'm happy to provide a PR, but it will be some days before I get to it.
That's fine!
GitHub really needs a feature "ping me on this issue in N days/weeks/months" :-)
@timhoffm The deadline for getting this into 3.12 is in about a month.
Thanks for the reminder. I may get around to this next week.
@iritkatriel the PR #103487 is ready. Can I still help with anything to get this into 3.12?
> I don't think I understand why anyone would want to use | and & on bools to get another bool, instead of just using `or` and `and`.
I think I've had cases where I wanted a bool and needed both operands evaluated, so `or` and `and` wouldn't have worked.
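A minimal sketch of that situation: `or` short-circuits and skips the second call, while `|` evaluates both operands and still yields a bool. The `check` helper and the `calls` list are invented for the example.

```python
calls = []


def check(name, result):
    """Record that the check ran, then return its (boolean) result."""
    calls.append(name)
    return result


# `or` stops as soon as the outcome is known: the second check never runs.
calls.clear()
lazy = check("first", True) or check("second", False)
lazy_calls = list(calls)
assert lazy is True
assert lazy_calls == ["first"]

# `|` evaluates both operands, and on two bools the result is a bool.
calls.clear()
eager = check("first", True) | check("second", False)
eager_calls = list(calls)
assert eager is True
assert eager_calls == ["first", "second"]
```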
Further discussion: https://discuss.python.org/t/bool-deprecation/62232
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['interpreter-core', 'type-feature', '3.7']
title = 'bool(~True) == True'
updated_at =
user = 'https://github.com/tomerv'
```
bugs.python.org fields:
```python
activity =
actor = 'gvanrossum'
assignee = 'none'
closed = True
closed_date =
closer = 'rhettinger'
components = ['Interpreter Core']
creation =
creator = 'tomerv'
dependencies = []
files = []
hgrepos = []
issue_num = 37831
keywords = []
message_count = 24.0
messages = ['349452', '349459', '349461', '349477', '349478', '349480', '349482', '349484', '349487', '349488', '349489', '349493', '349494', '349495', '349508', '349509', '349522', '349628', '349656', '349661', '349707', '349709', '349723', '349724']
nosy_count = 7.0
nosy_names = ['gvanrossum', 'tim.peters', 'rhettinger', 'mark.dickinson', 'serhiy.storchaka', 'eryksun', 'tomerv']
pr_nums = []
priority = 'normal'
resolution = 'duplicate'
stage = 'resolved'
status = 'closed'
superseder = '12447'
type = 'enhancement'
url = 'https://bugs.python.org/issue37831'
versions = ['Python 3.7']
```