rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0

Make `bool` an `enum` #348

Open aturon opened 9 years ago

aturon commented 9 years ago

Currently, bool is not an enum, but there's no strong reason for that state of affairs. Making it an enum would increase overall consistency in the language.

This is a backwards-compatible change (depending on the specifics) and has been postponed till after 1.0.

See https://github.com/rust-lang/rfcs/pull/330.

Centril commented 5 years ago

Whatever happened to this? =P

(is it still backwards compatible?)

SimonSapin commented 5 years ago

We document:

https://doc.rust-lang.org/1.34.0/std/primitive.bool.html

If you cast a bool into an integer, true will be 1 and false will be 0.

Casting here likely refers to the as operator.

However as far as I can tell we currently make no promise about the memory representation of bool (other than size_of being 1 byte). We may want to keep it that way, for https://github.com/rust-lang/rfcs/pull/954.

If bool is a primitive rather than an enum, we can (more easily) special-case its casting behavior in the language.
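The two documented properties mentioned in this comment can be checked directly. The following is a minimal sketch; the helper names `bool_to_u8` and `bool_size` are made up for illustration and are not part of any library.

```rust
use std::mem::size_of;

/// Hypothetical helper: converts a `bool` to an integer with the `as` operator.
/// The documentation quoted above says `true` becomes 1 and `false` becomes 0.
pub fn bool_to_u8(b: bool) -> u8 {
    b as u8
}

/// Hypothetical helper: reports the size of `bool` in bytes,
/// which the comment above states is 1.
pub fn bool_size() -> usize {
    size_of::<bool>()
}
```

Note that this only exercises the `as` cast and `size_of`; it says nothing about which bit patterns are valid in memory, which is the part the comment suggests leaving unspecified.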

oberien commented 5 years ago

Wouldn't the casting behaviour and guaranteed size_of still be possible with repr(u8)? (even though that might be a problem with #954)

#[repr(u8)]
enum bool {
    false = 0,
    true = 1,
}

gorilskij commented 5 years ago

The problem with representing bool as an enum is that it's either backwards-compatible (bool, true, false) or it conforms to the CamelCase naming convention for enums defined in RFC #430 (Bool, True, False), not both.

RustyYato commented 5 years ago

@gorilskij I don't see how this is a big issue. Obviously we have to preserve true and false and be inconsistent with CamelCase. How is this that bad though?

gorilskij commented 5 years ago

@KrishnaSannasi I would tend to agree, both for compatibility and because bool is a basic enough type to warrant a lower case name. It's not bad, just unpleasantly inconsistent, especially in a language as relatively polished as Rust. It's not nice to have a builtin type that, were you to write it yourself, would raise a compiler warning.
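To make the warning concrete, here is a small sketch of a user-defined enum with lowercase names. The type `boolean`, its variants `no` and `yes`, and the helper `is_yes` are hypothetical names chosen so they do not clash with the built-in `bool`, `true` and `false`. Without the `allow` attribute, rustc's `non_camel_case_types` lint would warn about the type name and both variant names.

```rust
// Hypothetical user-defined type with lowercase names.
// Removing the `allow` attribute makes rustc emit `non_camel_case_types`
// warnings for `boolean`, `no` and `yes`.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum boolean {
    no,
    yes,
}

/// Hypothetical helper: maps the user-defined enum back to the built-in `bool`.
pub fn is_yes(b: boolean) -> bool {
    b == boolean::yes
}
```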

There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it. Logical operations have a numerical feeling to them as well, as evidenced by some of the notation (&& ~= *, || ~= +, ...), not to mention the fact that all those operations, as well as the if/else construct, depend on the bool type. It would be possible to do logic without all of that, but given how fundamentally useful it is, I think it (bool and friends) deserves its special status in Rust, as in the vast majority of other languages.

As an example to illustrate what I mean, Haskell implements Bool as a true 2-variant type and defines the infix operators (&&, ||, ...) in the standard way to go along with it (note that logical not is called not, not ! to maintain consistency because it's a prefix operator, though Haskell does still bend the rules for prefix - but that's beside the point). This, I believe, is nice and idealistic but not suited to a more practical language like Rust.

TL;DR I think making bool an enum changes nothing from the implementation perspective and goes half-way in a direction I'm not sure we should be going from the purity perspective.

RustyYato commented 5 years ago

@gorilskij

There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it

I would argue that a u1 is just a 2-variant enum, and that any numeric properties it has are just semantics for us, but don't really need to be encoded into its representation.

Also a bool can be represented as @oberien showed earlier,

#[repr(u8)]
enum bool {
    false = 0,
    true = 1
}

and then special casing && and ||. The other operators (like !) can be implemented normally.

Note that there is movement to make && and || overloadable like all of the other operators. If that lands, then we can implement && and || normally for bool as well.
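As a rough illustration of the split described here, the sketch below uses a hypothetical `Bool` enum (CamelCase, because `true` and `false` are keywords and cannot be used as variant names in user code today). `!` is implemented through the ordinary `std::ops::Not` trait, while short-circuiting "and" has to be spelled as a method taking a closure, because `&&` cannot currently be overloaded. The method name `and` is an assumption made for this example.

```rust
use std::ops::Not;

// Hypothetical stand-in for the proposed `enum bool`.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bool {
    False = 0,
    True = 1,
}

// `!` needs no special casing: it is an ordinary trait implementation.
impl Not for Bool {
    type Output = Bool;

    fn not(self) -> Bool {
        match self {
            Bool::False => Bool::True,
            Bool::True => Bool::False,
        }
    }
}

impl Bool {
    /// Short-circuiting "and": `rhs` is only evaluated when `self` is `True`.
    /// This mimics what `&&` does for the built-in `bool`.
    pub fn and(self, rhs: impl FnOnce() -> Bool) -> Bool {
        match self {
            Bool::True => rhs(),
            Bool::False => Bool::False,
        }
    }
}
```

With `repr(u8)` and explicit discriminants, `Bool::True as u8` is 1 and the type occupies one byte, which matches the casting and size behavior discussed earlier in the thread.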

One more advantage to this is that it reduces the number of keywords in Rust, because bool, true and false no longer have to be keywords.

edit: bool is not a keyword, my bad

gorilskij commented 5 years ago

Perhaps you're right. I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition. What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.

RustyYato commented 5 years ago

What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.

Well, if and else are kinda special. We could define that

if $cond {
    $block
}

gets desugared to

match $cond {
    bool::true => $block,
    bool::false => (),
}

And similarly for if $cond { $block_t } else { $block_f } we could desugar to a match, this way it is still clear what is happening.
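The equivalence can already be observed with today's built-in `bool`, since `true` and `false` are valid patterns. This is only a sketch of the idea, not a description of how the compiler actually lowers `if`; the function names are made up for the example.

```rust
/// `if`/`else` written the usual way.
pub fn with_if(cond: bool) -> &'static str {
    if cond {
        "yes"
    } else {
        "no"
    }
}

/// The same logic as an exhaustive `match` over the two values of `bool`.
pub fn with_match(cond: bool) -> &'static str {
    match cond {
        true => "yes",
        false => "no",
    }
}
```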

I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition

I don't think an edition can do this, it seems like too big of a breaking change.

comex commented 5 years ago

Alternately, if $cond could be desugared to if let bool::true = $cond, which is more consistent and also works for match guards (although if let match guards are not actually implemented yet).

edit: Actually, that's a bad idea because it conflicts with the proposal to make let an expression.

gorilskij commented 5 years ago

I understand that if and if/else can be treated as pure sugar, but they are nonetheless (together with while which, as far as I can see, can't be desugared without recursion) very fundamental constructs in the language and very fundamentally depend on the bool type. Making the type an ordinary enum seems somehow dirty, as it would lose its intrinsic specialness yet retain a lot of special treatment. Making it, as far as I can tell, the only enum "allowed" (without compiler complaint) to have a lowercase name seems like more of the same. I like reducing keywords and removing special behavior as much as anyone, but I think this proposal does that too superficially.

CryZe commented 5 years ago

if, while, for and co. are already desugared to loop + match.
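For `while`, the loop + match shape can be written by hand without recursion. The sketch below shows the two forms side by side; it illustrates the idea only and is not the compiler's actual lowering. Function names are hypothetical.

```rust
/// Counts up to `limit` with an ordinary `while` loop.
pub fn count_with_while(limit: u32) -> u32 {
    let mut n = 0;
    while n < limit {
        n += 1;
    }
    n
}

/// The same loop written as `loop` + `match` on the boolean condition.
pub fn count_with_loop_match(limit: u32) -> u32 {
    let mut n = 0;
    loop {
        match n < limit {
            true => {
                n += 1;
            }
            false => break,
        }
    }
    n
}
```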

SimonSapin commented 5 years ago

“Language items” are not dirty, they are often necessary. For example for loops need to know about the IntoIterator and Iterator traits, as well as the Option enum.
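The dependency of `for` on those three items is visible when the loop is expanded by hand. The following is an approximate sketch of that expansion, with hypothetical function names; the real desugaring differs in details.

```rust
/// Sums the items with an ordinary `for` loop.
pub fn sum_with_for(items: Vec<i32>) -> i32 {
    let mut total = 0;
    for x in items {
        total += x;
    }
    total
}

/// Roughly the same loop, spelled out with `IntoIterator`, `Iterator` and `Option`.
pub fn sum_desugared(items: Vec<i32>) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(items);
    loop {
        match Iterator::next(&mut iter) {
            Some(x) => total += x,
            None => break,
        }
    }
    total
}
```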

I think there is no real blocker to making bool an enum. We can make it work. But I also don’t see much benefit, so it may not be worth the bother.

gorilskij commented 5 years ago

But I also don’t see much benefit, so it may not be worth the bother.

I don't see any benefits other than "it looks cleaner" which I'd argue it doesn't due to, at least, the naming convention inconsistency.

varkor commented 5 years ago

No-one is suggesting we change the name. Using an enum instead of a bool would simplify some special-casing in the compiler, which would be an advantage.

gorilskij commented 5 years ago

I was talking about the issue purely from a user's point of view, I have no knowledge of compiler internals (or experience in this area for that matter). Just out of curiosity, how would switching to an enum representation help? It seems to me that a primitive type would be easier to write special cases for.

(btw, if this is not the place for such discussion, tell me, I'll move it somewhere else)

RustyYato commented 5 years ago

@gorilskij as @varkor said, it would reduce special casing in the compiler, so it would make the compiler simpler. Special casing only makes things more complex, not the other way around. For some things the complexity is justified, but with bool we could easily make it less special and thus simplify the compiler. This makes it easier to maintain and contribute to.

gorilskij commented 5 years ago

Oh, sorry, I misread it as "it would make it simpler to implement special casing." Thanks for clarifying.

jhpratt commented 4 years ago

@varkor What's the way forward on this? It seems like it's a good idea, given that it would reduce the special-casing in the compiler.

varkor commented 4 years ago

@jhpratt: there was a brief discussion on Discord about it a while ago. You'd probably want to talk to @oli-obk or @eddyb about it if you wanted to pursue it, who both had some ideas about it.

jhpratt commented 4 years ago

After a slight bit of discussion on Reddit, a couple things came up. Would the true and false keywords need to be removed? Intuitively, I don't think it would be possible to implement an enum bool { false, true } unless they weren't keywords (as otherwise raw identifiers would be necessary). Also, would removing keywords be backwards compatible?

I'll (finally) reach out on Discord to them to see if they have any ideas.

varkor commented 4 years ago

We can keep the keywords and have them refer to the variants of the enum. There might be some diagnostics changes to work out, but I don't think it'd be a problem.

oli-obk commented 4 years ago

I think it's mostly compiler-internal work that is blocking this. I tried it once and it's not super simple to do. Loads of code expects to know that something is a bool or to make something a bool, and for this we currently have a very convenient way (tcx.types.bool). With this PR that would have to change to tcx.type_of(tcx.require_lang_item(lang_items::BoolEnum)). While we can totally abstract this to tcx.mk_bool(), it's still not the zero cost thing we had before, and I think there were other inconveniences, too. So maybe there could be a preliminary PR that just removes tcx.types.bool in preparation for this change, so we can judge some of the fallout.

petrochenkov commented 4 years ago

I have doubts that this would "reduce special casing in the compiler"; bool is indeed special and built-in for the compiler in many senses. I suspect it would be much easier to offload e.g. ! to the library (as an empty enum) than bool.

burdges commented 4 years ago

I'd think the similarities appear only in the niche layout stuff, so maybe some future layout code would be simpler if it could optimize some bools as enum discriminants.

At present, we always permit field references, which imposes alignment constraints and makes the layout less interesting. We pack enum discriminants into leftover space because one cannot take a separate reference to the discriminants. I'd think composite types could conceivably restrict references to specific fields with some notation like !ref T or #[no_ref] T or #[packed] T.

As an example, size_of::<(bool, bool)>() = 2 while size_of::<(!ref bool, !ref bool)>() = 1, and ideally even size_of::<(!ref Option<bool>, !ref Option<bool>)>() = 1, but you cannot call &self methods on a !ref Option<bool> without first destructuring. You might conceivably treat Copy types even more special here, not sure. It's likely folks would prefer bitfield notations like in C for this anyways.
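Only the first half of that example can be checked today, because `!ref`, `#[no_ref]` and the single-byte results are hypothetical notation from this comment, not valid Rust. The sketch below measures the current sizes; the function names are made up for illustration. On current compilers `Option<bool>` typically fits in one byte thanks to niche optimization, but that is an observation rather than something this sketch asserts.

```rust
use std::mem::size_of;

/// Size in bytes of a pair of `bool`s. Each field must be separately
/// addressable, so each one occupies its own byte.
pub fn pair_size() -> usize {
    size_of::<(bool, bool)>()
}

/// Size in bytes of `Option<bool>`. The compiler may store `None` in an
/// unused bit pattern of the `bool` (a "niche") instead of adding a tag byte.
pub fn option_bool_size() -> usize {
    size_of::<Option<bool>>()
}
```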

At first blush anything like this sounds like an internal compiler concern, so not an issue worth considering right now.

oli-obk commented 4 years ago

Layout considerations are completely orthogonal to whether bool becomes an enum or not. If bool keeps not being an enum, this is just a few lines of code in a very encapsulated piece of code that would be affected for layout changes.

SimonSapin commented 4 years ago

I think that 2014 or earlier would have been a good time for a change like this.

At this point doing it without breaking things would take non-trivial care while the benefits as far as I can tell are entirely theoretical. It would feel nice to have fewer special cases in the language, but is there any concrete benefit to users?

I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.

Centril commented 4 years ago

I would personally like to avoid considering whether bool should become a library type right now. I would also like to avoid making a formal decision that it should never become a library type. I'm personally happy with indecision for the foreseeable future. :)

jhpratt commented 4 years ago

@SimonSapin What's the harm in leaving this open, in case someone wanted to at least try it to see the admittedly uncertain benefits? That's what I was going to do, but @Centril said it would likely conflict with other ongoing work.

SimonSapin commented 4 years ago

it would likely conflict with other ongoing work

It sounds like you answered your own question?

SimonSapin commented 4 years ago

If this were a formal RFC submitted now, what should the Motivation section contain?

From the template:

Why are we doing this? What use cases does it support? What is the expected outcome?

jhpratt commented 4 years ago

The conflict with ongoing work would be temporary (iirc he said if-let chains), as the bits it would touch would eventually be completed. My interpretation of your original comment was a permanent (or at least indefinite) declination of the (pseudo-)RFC.

varkor commented 4 years ago

I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.

This is a compiler implementation detail and doesn't affect the language. The only possible user-facing change this could have is changing diagnostics, but these are easily special-cased anyway. While I don't think this ought to be a priority, if someone wants to try changing it to see whether the consequences for code clarity are advantageous, there's no reason to dissuade them (other than their time perhaps being better placed with more useful features).

igotfr commented 2 years ago

why not use just 1 bit for bool like ziglang? https://ziglang.org/documentation/0.9.1/#packed-struct "bool fields use exactly 1 bit."

shepmaster commented 2 years ago

use just 1 bit for bool

If that were the case, then you wouldn't be able to safely take a reference to the boolean. Note that the section you are referring to is titled packed struct.

igotfr commented 2 years ago

use just 1 bit for bool

If that were the case, then you wouldn't be able to safely take a reference to the boolean. Note that the section you are referring to is titled packed struct.

https://ziglang.org/documentation/0.9.1/#boolToInt @boolToInt

@boolToInt(value: bool) u1

Converts true to @as(u1, 1) and false to @as(u1, 0).

If the value is known at compile-time, the return type is comptime_int instead of u1.

shepmaster commented 2 years ago

If you are responding to me, it’s not clear what the argument/point of the response is.

You asked why a Boolean can’t (universally?) be represented as a single bit. The answer is that you can’t take a reference to a bit (memory addresses only have byte granularity) and values in Rust can generally be referenced.
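When single-bit storage is wanted, it can be built by hand on top of a byte, at the cost of accessing flags by value rather than by reference. The sketch below is one way to do that under stated assumptions: the `Flags` type and its `get`/`set` methods are hypothetical names for this example, not a standard library API.

```rust
/// Hypothetical container packing eight boolean flags into one byte.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Flags(u8);

impl Flags {
    /// Returns the flag at `index` (0..8) by value.
    /// There is no way to hand out a `&bool` pointing at a single bit.
    pub fn get(self, index: u8) -> bool {
        assert!(index < 8, "index out of range");
        ((self.0 >> index) & 1) == 1
    }

    /// Sets or clears the flag at `index` (0..8).
    pub fn set(&mut self, index: u8, value: bool) {
        assert!(index < 8, "index out of range");
        if value {
            self.0 |= 1 << index;
        } else {
            self.0 &= !(1 << index);
        }
    }
}
```

The whole struct is still one addressable byte, so `&Flags` works as usual; only the individual bits lose their addresses.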

You’ll need to clarify what point your response is trying to make.

Kixiron commented 2 years ago

For what it's worth, Rust actually does use the i1 LLVM type to represent booleans. Bools only taking one bit within structs is a packed structs issue, so not relevant to this issue.