masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0
139 stars 15 forks source link

Introduce a Val::Bool type #157

Closed masak closed 8 years ago

masak commented 8 years ago

I'm seriously considering whether we shouldn't just add a Val::Bool type to 007 before v1.0.0, just so we won't have to make sacrifices based on backwards compatibility later.

Currently, 007 does 0 and 1 instead of false and true, like so many systems before it (C comes to mind, early versions of Python, Perl 5 has "" and 1...)

$ bin/007 -e='for [1 != 1, 1 == 1] -> v { say(v) }'
0
1

Most of the time, we're more interested in the concepts "falsy" and "truthy" than we are in actual True and False values, but... it seems that languages tend to land on having an explicit boolean type as they mature. (Java, Python, Perl 6, JavaScript (from the start)...)

The reason is probably something like "it's nice". Especially when you print things, it's nicer to see those words rather than the rather arbitrary numbers 0 and 1.

The reason I want to do this before v1.0.0 is mostly so that we don't have to make compromises we'd prefer not to make, for example, letting boolean values numify back to 0 and 1.

Python is actually a funny case here — it makes bool a subtype of int.

>>> isinstance(True, int)
True
>>> issubclass(bool, int)
True

This is not just odd from my point of view — how many things do I expect to do with integers that I can't really do with booleans? — but it also feels like building damage from an old backwards-compatibility thing right into the type system.

I would love it if we could implement our Bool type in 007 as an enum type. (Ditto None.) But I'm willing to get Bool first and enums later, if that turns out to be easier.

vendethiel commented 8 years ago

Perl 6 also somewhat makes Bool subtypes of Int, by virtue of being enums.

masak commented 8 years ago

Indeed.

$ perl6 -e 'say True ~~ Int'
True

In fact, now that I think about it, the claim Bool has to being an integer type hinges on booleans being quite close to bits. This is further confounded by the fact that we're discussing this on a computer medium, were booleans happen to be implemented as bits. Bits are more clearly an integer type, or at least a type where arithmetic operations and boolean operations sort of fuse.

I think my argument is made stronger by the above, though. Bool is the type that emerges when one starts to see clearly enough not to be confounded by the dual arithmetic/logical nature of bits.

One use case we'll lose if we make Bool not a numeric type is something like this:

say(array[bool]);     # prints either array[0] or array[1]

But that's not a big loss, and I think I'd rather see something like that expressed through an explicit ?? !! construction, or similar.

vendethiel commented 8 years ago

I don't remember where it was, but I remember someone suggesting a truthy value would return 42 to prevent such cuteisms..:)

masak commented 8 years ago

It was Randal Schwartz many years ago on the p6l mailing list, IIRC.

His sudden complaining about that particular detail has stayed with me, too. It's entirely possible that this issue wouldn't exist without that event. :smile:

Analyzing the whole thing from a mathematical perspective: there is a surjective morphism from Int to Bool, but because it's not injective, there's nothing that makes the reverse mapping True -> 1 more distinguished/preferable than merlyn's True -> 42, or many BASIC dialects' True -> -1.

masak commented 8 years ago

or many BASIC dialects' True -> -1.

In fact, if you're into the two's complement number format, then True -> -1 is the mapping that makes the most sense, because then -1 is the number whose binary representation has all its bits set. (Just like 0 has all its bits unset.)

Not that I enjoy the thought of True numifying to -1 any more than to 1, but still...

masak commented 8 years ago

Idle question: even given all that's been said above about distaste for Bool as a numeric type — should it have a built-in ordering? Should False < True work and evaluate to True?

I'm tempted to conservatively go with "no, it shouldn't work", at least until we have an answer that fits into all enum types.

masak commented 5 years ago

I keep being fascinated by the question of how int-y booleans should be. In my opinion, there are two forces applying to the problem at the same time: purity (booleans are their own thing, and not related to integers) and convenience (sometimes we bloody well want to convert False to 0 and True to 1, like in the Iverson bracket).

I got to thinking about this because (a) I just wanted to do this in Lua in PICO-8, and (b) I found this SO question. I don't like the automatic conversion (in most languages), but I also find I don't like the lack of it (in Lua) and the subsequent hacks to get it.

masak commented 2 years ago

In EWD1070, Dijkstra points out two interesting things:

I don't know, I'm still fascinated by the interplay here. A true bikeshedding topic.

masak commented 7 months ago

Yeah, I keep coming back to this, like a moth to a flame.

George Boole's invention of "the boolean domain" was a radical one, and only obvious in retrospect; maybe not even in retrospect.

What Boole did was ask "what if truth values are somewhat like numbers?". I haven't read his original texts, but I don't think he would have balked at thinking about false as 0 and true as 1.

It's a truism that real numbers are the spoiled sibling in the family of algebraic structures. Extrapolated from counting numbers and rational numbers, they played an important role in calculus/analysis (and therefore physics) for hundreds of years, and they have by far the strongest claim to being a "natural" algebraic structure for physical reality. (Compared to, say, p-adics, surreal numbers, or even complex numbers (which are a strong contender).)

Booleans are a semiring, which is to say, they have some algebraic properties we know and love from the reals. (Semiring; due to this term being ambiguously used, sometimes also called rig (a ring without the "n" for negation, get it?) or a dioid (two monoids; the additive one and the multiplicative one).) Importantely, the distributive law says that the two monoids are sensibly related and not just doing their own thing.

But the Boolean algebra also has properties we don't find in the reals. For one thing, both "addition" (||) and "multiplication" (&&) are idempotent, which means that they are saturating: whereas you can keep adding a positive number or multiplying a positive number and get higher and higher results, when you "or" or "and" with the same Boolean value N times, you get the same as what you got after the first time.

Secondly, there's a De Morgan duality going on between true/false, between ||/&&, etc. In the end, this means that for Booleans, not only does multiplication distribute over addition, but also vice versa. I remember that in one of his texts, Dijkstra especially makes a point of this, that a too-narrow focus on the Booleans as a numeric rather than logical structure makes it easy to miss this duality and additional distributivity property. (I think he also argued that this means that ∧ (logical and) shouldn't bind tighter than ∨ (logical or); this makes sense to me, but it's not clear that we made the wrong choice in modern languages with && being tighter than ||, and both being short-circuiting.)

So, where does that leave us? I think this is a good way forward: