Open aturon opened 9 years ago
Whatever happened to this? =P
(is it still backwards compatible?)
We document:
https://doc.rust-lang.org/1.34.0/std/primitive.bool.html
If you cast a bool into an integer, true will be 1 and false will be 0.
Casting here likely refers to the as operator.
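For reference, the documented cast can be checked with a minimal sketch like the one below. The function names are made up for illustration; the second function uses the From<bool> impl on u8 from the standard library, which gives the same result as the as cast.

```rust
/// Converts a `bool` to `u8` with the `as` operator, as described in the docs.
pub fn bool_to_u8(b: bool) -> u8 {
    b as u8
}

/// The same conversion through the standard `From<bool>` impl on `u8`.
pub fn bool_to_u8_from(b: bool) -> u8 {
    u8::from(b)
}

// Note: the reverse cast (`1u8 as bool`) is rejected by the compiler;
// an integer has to be compared against zero instead, e.g. `x != 0`.
```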
However, as far as I can tell, we currently make no promise about the memory representation of bool (other than size_of being 1 byte). We may want to keep it that way, for https://github.com/rust-lang/rfcs/pull/954.
If bool is a primitive rather than an enum, we can (more easily) special-case its casting behavior in the language.
Wouldn't the casting behaviour and guaranteed size_of still be possible with repr(u8)? (even though that might be a problem with #954)
#[repr(u8)]
enum bool {
false = 0,
true = 1,
}
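A rough sketch of that idea, written so that it compiles today: since true and false are keywords, the stand-in below uses the hypothetical names MyBool, False and True. It only illustrates that a repr(u8) enum gets the same cast results and a one-byte size; it says nothing about how the compiler would treat a real bool enum.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an enum-based `bool` (names are illustrative only).
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MyBool {
    False = 0,
    True = 1,
}

/// Casting a fieldless enum with `as` yields its explicit discriminant.
pub fn my_bool_to_u8(b: MyBool) -> u8 {
    b as u8
}

/// With `repr(u8)` the enum occupies exactly one byte.
pub fn my_bool_size() -> usize {
    size_of::<MyBool>()
}
```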
The problem with representing bool as an enum is that it's either backwards-compatible (bool, true, false) or it conforms to the CamelCase naming convention for enums defined in RFC #430 (Bool, True, False), not both.
@gorilskij I don't see how this is a big issue. Obviously we have to preserve true and false and be inconsistent with CamelCase. How is this that bad though?
@KrishnaSannasi I would tend to agree, both for compatibility and because bool is a basic enough type to warrant a lower case name. It's not bad, just unpleasantly inconsistent, especially in a language as relatively polished as Rust. It's not nice to have a builtin type that, were you to write it yourself, would raise a compiler warning.
There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it. Logical operations have a numerical operation feeling to them as well, as evidenced by some of the notation (&& ~= *, || ~= +, ...), not to mention the fact that all those operations, as well as the if/else construct, depend on the bool type. It would be possible to do logic without all of that, but with how fundamentally useful it is, I think it (bool and friends) deserves its special status in Rust, as in the vast majority of other languages.
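To make the "&& ~= *, || ~= +" analogy concrete, here is a small sketch that checks it over all four input combinations. The helper names are invented for this example, and the addition is clamped to 1 because 1 + 1 has no counterpart in a one-bit value.

```rust
/// `a && b` viewed as multiplication on {0, 1}.
pub fn and_as_mul(a: bool, b: bool) -> u8 {
    (a as u8) * (b as u8)
}

/// `a || b` viewed as addition on {0, 1}, clamped to 1.
pub fn or_as_add(a: bool, b: bool) -> u8 {
    ((a as u8) + (b as u8)).min(1)
}

/// Checks the analogy exhaustively for every pair of inputs.
pub fn analogy_holds() -> bool {
    let values = [false, true];
    for &a in &values {
        for &b in &values {
            let and_expected = (a && b) as u8;
            let or_expected = (a || b) as u8;
            if and_as_mul(a, b) != and_expected || or_as_add(a, b) != or_expected {
                return false;
            }
        }
    }
    true
}
```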
As an example to illustrate what I mean, Haskell implements Bool as a true 2-variant type and defines the infix operators (&&, ||, ...) in the standard way to go along with it (note that logical not is called not, not !, to maintain consistency because it's a prefix operator, though Haskell does still bend the rules for prefix - but that's beside the point). This, I believe, is nice and idealistic but not suited to a more practical language like Rust.
TL;DR I think making bool an enum changes nothing from the implementation perspective and goes half-way in a direction I'm not sure we should be going from the purity perspective.
@gorilskij
There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it
I would argue that a u1 is just a 2-variant enum, and that any numeric properties it has are just semantics for us, but don't really need to be encoded into its representation.
Also a bool can be represented as @oberien showed earlier,
#[repr(u8)]
enum bool {
false = 0,
true = 1
}
and then special casing && and ||. The other operators (like !) can be implemented normally.
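As a sketch of what "implemented normally" could look like, the block below implements ! through std::ops::Not for a hypothetical MyBool enum (the name and CamelCase variants are placeholders, since true and false are keywords). It also implements the non-short-circuiting & and | through BitAnd and BitOr; there is currently no trait for && or ||, which is why those would still need special casing.

```rust
use std::ops::{BitAnd, BitOr, Not};

/// Hypothetical stand-in for an enum-based `bool`.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MyBool {
    False = 0,
    True = 1,
}

impl Not for MyBool {
    type Output = MyBool;

    /// Logical negation, implemented as an ordinary trait method.
    fn not(self) -> MyBool {
        match self {
            MyBool::False => MyBool::True,
            MyBool::True => MyBool::False,
        }
    }
}

impl BitAnd for MyBool {
    type Output = MyBool;

    /// Eager (non-short-circuiting) conjunction.
    fn bitand(self, rhs: MyBool) -> MyBool {
        match (self, rhs) {
            (MyBool::True, MyBool::True) => MyBool::True,
            _ => MyBool::False,
        }
    }
}

impl BitOr for MyBool {
    type Output = MyBool;

    /// Eager (non-short-circuiting) disjunction.
    fn bitor(self, rhs: MyBool) -> MyBool {
        match (self, rhs) {
            (MyBool::False, MyBool::False) => MyBool::False,
            _ => MyBool::True,
        }
    }
}
```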
Note that there is movement to make && and || overloadable like all of the other operators. If that lands, then we can implement && and || normally for bool as well.
One more advantage to this is that it reduces the number of keywords in Rust, because bool, true and false no longer have to be keywords.
edit: bool is not a keyword, my bad
Perhaps you're right. I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition. What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.
What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.
Well, if and else are kinda special. We could define that
if $cond {
$block
}
gets desugared to
match $cond {
bool::true => $block,
bool::false => (),
}
And similarly for if $cond { $block_t } else { $block_f } we could desugar to a match, this way it is still clear what is happening.
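The proposed desugaring can be approximated in today's Rust, where bool is still a primitive, by matching on the true and false patterns directly. This is only a hand-written equivalence with made-up function names; the bool::true path from the proposal does not exist.

```rust
/// `if`/`else` used as an expression.
pub fn describe_with_if(cond: bool) -> &'static str {
    if cond { "yes" } else { "no" }
}

/// The same logic written as a `match` on the two boolean values.
pub fn describe_with_match(cond: bool) -> &'static str {
    match cond {
        true => "yes",
        false => "no",
    }
}

/// `if` without `else`: the missing branch evaluates to `()`.
pub fn count_with_if(cond: bool) -> u32 {
    let mut n = 0;
    if cond {
        n += 1;
    }
    n
}

/// The `match` form spells out the implicit `()` branch.
pub fn count_with_match(cond: bool) -> u32 {
    let mut n = 0;
    match cond {
        true => {
            n += 1;
        }
        false => (),
    }
    n
}
```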
I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition
I don't think an edition can do this, it seems like too big of a breaking change.
Alternately, if $cond could be desugared to if let bool::true = $cond, which is more consistent and also works for match guards (although if let match guards are not actually implemented yet).
edit: Actually, that's a bad idea because it conflicts with the proposal to make let an expression.
I understand that if and if/else can be treated as pure sugar but they are nonetheless (together with while which, as far as I can see, can't be desugared without recursion) very fundamental constructs in the language and very fundamentally depend on the bool type. Making the type an ordinary enum seems somehow dirty, as it would lose its intrinsic specialness yet retain a lot of special treatment; making it, as far as I can tell, the only enum "allowed" (without compiler complaint) to have a lowercase name seems like more of the same. I like reducing keywords and removing special behavior as much as anyone but I think this proposal does that too superficially.
if, while, for and co. are already desugared to loop + match.
“Language items” are not dirty, they are often necessary. For example for loops need to know about the IntoIterator and Iterator traits, as well as the Option enum.
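To illustrate the loop + match point, here is a hand-written approximation of what for and while expand to. It is not the compiler's actual lowering, and the function names are invented for the example, but it shows why for needs IntoIterator, Iterator and Option, and that while can be expressed without recursion.

```rust
/// Sums a slice with an ordinary `for` loop.
pub fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Roughly the same loop written by hand with `loop` + `match`.
pub fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match Iterator::next(&mut iter) {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}

/// Counts iterations of a `while` loop.
pub fn countdown_with_while(mut n: u32) -> u32 {
    let mut steps = 0;
    while n > 0 {
        n -= 1;
        steps += 1;
    }
    steps
}

/// The `while` loop rewritten as `loop` + `match` on the condition.
pub fn countdown_desugared(mut n: u32) -> u32 {
    let mut steps = 0;
    loop {
        match n > 0 {
            true => {
                n -= 1;
                steps += 1;
            }
            false => break,
        }
    }
    steps
}
```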
I think there is no real blocker to making bool an enum. We can make it work. But I also don’t see much benefits, so it may not be worth the bother.
But I also don’t see much benefits, so it may not be worth the bother.
I don't see any benefits other than "it looks cleaner" which I'd argue it doesn't due to, at least, the naming convention inconsistency.
No-one is suggesting we change the name. Using an enum instead of a bool would simplify some special-casing in the compiler, which would be an advantage.
I was talking about the issue purely from a user's point of view, I have no knowledge of compiler internals (or experience in this area for that matter). Just out of curiosity, how would switching to an enum representation help? It seems to me that a primitive type would be easier to write special cases for.
(btw, if this is not the place for such discussion, tell me, I'll move it somewhere else)
@gorilskij as @varkor said, it would reduce special-casing in the compiler, so it would make the compiler simpler. Special casing only makes things more complex, not the other way around. For some things the complexity is justified, but with bool we could easily make it less special and thus simplify the compiler. This makes it easier to maintain and contribute to.
Oh, sorry, I misread it as "it would make it simpler to implement special casing." Thanks for clarifying.
@varkor What's the way forward on this? It seems like it's a good idea, given that it would reduce the special-casing in the compiler.
@jhpratt: there was a brief discussion on Discord about it a while ago. You'd probably want to talk to @oli-obk or @eddyb about it if you wanted to pursue it, who both had some ideas about it.
After a slight bit of discussion on Reddit, a couple things came up. Would the true and false keywords need to be removed? Intuitively, I don't think it would be possible to implement an enum bool { false, true } unless they weren't keywords (as otherwise raw identifiers would be necessary). Also, would removing keywords be backwards compatible?
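For what it's worth, the raw-identifier route can be sketched in current Rust. The enum name lower_bool is hypothetical (chosen to avoid shadowing the primitive), and the allow attribute silences the non_camel_case_types lint that the lowercase names would otherwise trigger.

```rust
/// Hypothetical lowercase boolean enum; `r#false` and `r#true` are raw
/// identifiers, needed because `false` and `true` are keywords.
#[allow(non_camel_case_types)]
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum lower_bool {
    r#false = 0,
    r#true = 1,
}

/// Maps the primitive `bool` onto the hypothetical enum.
pub fn lower_bool_from(b: bool) -> lower_bool {
    if b {
        lower_bool::r#true
    } else {
        lower_bool::r#false
    }
}
```

Every use site has to spell out the r# prefix, which is the inconvenience the comment alludes to.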
I'll (finally) reach out on Discord to them to see if they have any ideas.
We can keep the keywords and have them refer to the variants of the enum. There might be some diagnostics changes to work out, but I don't think it'd be a problem.
I think it's mostly compiler-internal work that is blocking this. I tried it once and it's not super simple to do. Loads of code expects to know that something is a bool or make something a bool, and for this we right now have a very convenient way (tcx.types.bool). With this PR that would have to change to tcx.type_of(tcx.require_lang_item(lang_items::BoolEnum)). While we can totally abstract this to tcx.mk_bool(), it's still not the zero cost thing we had before and I think there were other inconveniences, too. So maybe there could be a preliminary PR that just removes tcx.types.bool in preparation of this change, so we can judge some of the fallout.
I have doubts that this would "reduce special casing in the compiler"; bool is indeed special and built-in for the compiler in many senses.
I suspect it would be much easier to offload e.g. ! to the library (as an empty enum) than bool.
I'd think the similarities appear only in the niche layout stuff, so maybe some future layout code would be simpler if it could optimize some bools as enum discriminants.
At present, we always permit field references, which imposes alignment constraints and makes the layout less interesting. We pack enum discriminants into leftover space because one cannot take a separate reference to the discriminants. I'd think composite types could conceivably restrict references to specific fields with some notation like !ref T or #[no_ref] T or #[packed] T.
As an example, size_of::<(bool, bool)>() = 2 while size_of::<(!ref bool, !ref bool)>() = 1, and ideally even size_of::<(!ref Option<bool>, !ref Option<bool>)>() = 1, but you cannot call &self methods on a !ref Option<bool> without first destructuring. You might conceivably treat Copy types even more special here, not sure. It's likely folks would prefer bitfield notations like in C for this anyways.
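The sizes that exist today can be observed with a short sketch like this one. The !ref notation above is hypothetical and is not modelled here, and the Option<bool> result reflects what current rustc does with the unused bit patterns of bool rather than something this example can promise.

```rust
use std::mem::size_of;

/// Each `bool` field is separately addressable, so a pair takes two bytes.
pub fn pair_size() -> usize {
    size_of::<(bool, bool)>()
}

/// In current rustc, `None` is stored in a spare bit pattern of the `bool`
/// byte, so no extra discriminant byte is needed.
pub fn option_bool_size() -> usize {
    size_of::<Option<bool>>()
}

/// Two such options still need one addressable byte each.
pub fn option_pair_size() -> usize {
    size_of::<(Option<bool>, Option<bool>)>()
}
```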
At first blush anything like this sounds like an internal compiler concern, so not an issue worth considering right now.
Layout considerations are completely orthogonal to whether bool becomes an enum or not. If bool keeps not being an enum, this is just a few lines of code in a very encapsulated piece of code that would be affected for layout changes.
I think that 2014 or earlier would have been a good time for a change like this.
At this point doing it without breaking things would take non-trivial care while the benefits as far as I can tell are entirely theoretical. It would feel nice to have fewer special cases in the language, but is there any concrete benefit to users?
I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.
I would personally like to avoid considering whether bool
should become a library type right now. I would also like to avoid making a formal decision that it should never become a library type. I'm personally happy with indecision for the foreseeable future. :)
@SimonSapin What's the harm in leaving this open, in case someone wanted to at least try it to see the admittedly uncertain benefits? That's what I was going to do, but @Centril said it would likely conflict with other ongoing work.
it would likely conflict with other ongoing work
It sounds like you answered your own question?
If this were a formal RFC submitted now, what should the Motivation section contain?
From the template:
Why are we doing this? What use cases does it support? What is the expected outcome?
The conflict with ongoing work would be temporary (iirc he said if-let chains), as the bits it would touch would eventually be completed. My interpretation of your original comment was a permanent (or at least indefinite) declination of the (pseudo-)RFC.
I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.
This is a compiler implementation detail and doesn't affect the language. The only possible user-facing change this could have is changing diagnostics, but these are easily special-cased anyway. While I don't think this ought to be a priority, if someone wants to try changing it to see whether the consequences for code clarity are advantageous, there's no reason to dissuade them (other than their time perhaps being better placed with more useful features).
Why not use just 1 bit for bool like ziglang?
https://ziglang.org/documentation/0.9.1/#packed-struct
"bool fields use exactly 1 bit."
use just 1 bit for bool
If that were the case, then you wouldn't be able to safely take a reference to the boolean. Note that the section you are referring to is titled packed struct.
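A minimal sketch of the trade-off, assuming a hand-rolled PackedFlags type (the name and API are invented for this example): eight flags fit in one byte, but the accessors have to return and accept bool by value, because there is no address at which a &bool to a single bit could point.

```rust
/// Eight boolean flags packed into a single byte (illustrative type).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct PackedFlags(u8);

impl PackedFlags {
    /// Reads the flag at `index` (0..8) by value; returning `&bool` is not
    /// possible because individual bits are not addressable.
    pub fn get(self, index: u8) -> bool {
        assert!(index < 8, "flag index out of range");
        ((self.0 >> index) & 1) == 1
    }

    /// Writes the flag at `index` (0..8) by setting or clearing its bit.
    pub fn set(&mut self, index: u8, value: bool) {
        assert!(index < 8, "flag index out of range");
        if value {
            self.0 |= 1 << index;
        } else {
            self.0 &= !(1 << index);
        }
    }
}
```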
https://ziglang.org/documentation/0.9.1/#boolToInt
@boolToInt(value: bool) u1
Converts true to @as(u1, 1) and false to @as(u1, 0).
If the value is known at compile-time, the return type is comptime_int instead of u1.
If you are responding to me, it’s not clear what the argument/point of the response is.
You asked why a Boolean can’t (universally?) be represented as a single bit. The answer is that you can’t take a reference to a bit (memory addresses only have byte granularity) and values in Rust can generally be referenced.
You’ll need to clarify what point your response is trying to make.
For what it's worth, Rust actually does use the i1 LLVM type to represent booleans. Bools only taking one byte within structs is a packed structs issue, so not relevant to this issue.
Currently, bool is not an enum, but there's no strong reason for that state of affairs. Making it an enum would increase overall consistency in the language.
This is a backwards-compatible change (depending on the specifics) and has been postponed till after 1.0.
See https://github.com/rust-lang/rfcs/pull/330.