NULL/nil/none/void/... value considered harmful (a "billion dollar mistake" of Tony Hoare)

This post is written with XL as a conceptual language in mind. In some places it might sound different, but I've chosen that terminology to be clear about the intent. The concepts I'm presenting are generic, though, and apply equally well to XL as a conceptual language.

I believe everybody is familiar with the "billion dollar mistake" (as Tony Hoare calls it). I saw a bunch of recent commits (https://github.com/c3d/xl/commit/c3dde14bd8f7274909278194ca545b735cd09e0e , https://github.com/c3d/xl/commit/68b258dbf121e78a94c06bc71084476ac0d91a72 , https://github.com/c3d/xl/commit/3da52b42c4c5ebbe335483e22e09fe45ad1fdb93 , https://github.com/c3d/xl/commit/11ed0bda32bd4c7fcbad4c56a65c185d80f2788a ) that make use of nil.

I'd like to argue against using a nil value. The question, though, is "how to deal with such interfaces"?

My answer is "use ephemeral optionals". First, "optionals" are a well-known concept (Haskell's Maybe, Rust's Option, V's ?, Zig's ...). Despite all the pros (dead simple to reason about even in highly parallel systems, no stack unwinding as with exceptions, universal in that they can be used for errors or alternative values of any kind, etc.), they bring some serious usability issues.

Namely, an optional "reifies" the fact of whether it's a value or an error, and thus feels like a "primed bomb" you're tossing around. So what does "ephemeral" mean? Just one thing: the primed bomb can't be assigned to anything. In other words, it can only be returned, and it has to be dealt with directly at the call site.

Does that still sound too inflexible? Not anymore, if you provide some sugar covering the most common use case, namely propagating (returning) the optional because I don't want to handle it here yet. V does this perfectly: you just write ? after a function call and it propagates. And to handle the ephemeral optional, you just append or { print( err ) } to the function call to "catch" the optional and work with the data the callee has put into it (under the hood it is a struct holding arbitrary data).
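For readers more familiar with Rust than V, here is a rough analogue of those two pieces of sugar. It is only a sketch: `parse_port`, `load_port` and `load_port_or_default` are hypothetical names invented for illustration, and Rust's `Result` is an ordinary storable value, so it does not enforce the "ephemeral" restriction proposed here.

```rust
use std::num::ParseIntError;

// Hypothetical callee: it may fail, and the caller has to deal with that.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

// Propagation: `?` hands the error straight back to our own caller,
// much like a trailing `?` after a call in V.
fn load_port(text: &str) -> Result<u16, ParseIntError> {
    let port = parse_port(text)?;
    Ok(port)
}

// Handling at the call site: roughly the role of V's `or { ... }` block,
// which inspects the error and supplies a replacement value.
fn load_port_or_default(text: &str) -> u16 {
    parse_port(text).unwrap_or_else(|err| {
        eprintln!("bad port {:?}: {}", text, err);
        8080 // arbitrary fallback chosen for this example
    })
}
```

In both callers the fallible result is consumed in the same expression that produced it, which is the discipline an ephemeral optional would make mandatory.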

All in all, I'd like XL to get rid of a "lonely" nil value completely and instead use this ephemeral optionals trick together with e.g. sum types (aka tagged unions aka variants). Note there should definitely exist a nil type (a type can't be assigned to anything at run time because it's a type :wink: and thus doesn't appear at run time), e.g. for tree structures, but no lonely nil value. So a function returning the sum type T | nil (the value is either of type T or of type nil; note that here nil is not lonely but "bound" to T) will be something totally different from ?T (aka "optional T": the value is either of type T or an ephemeral error).

Note one can easily have ?(T | nil). Take, for example, a function which shall return the leftmost child of the leftmost tree node at level 5: if the tree depth is 4 or less, it'll return an error; otherwise it'll return a value of type "sum type of T and nil", which can be assigned to a variable, unlike a lonely nil (this assumes it's impossible to define a sum type with fewer than two types, otherwise one could again create the generic nil bomb by defining a sum type consisting of only the nil type). In other words, nil as a value actually exists under the hood, but always as a "special" case of some more important value, thus providing an explicit compile-time guarantee that all such nil values under the hood will be handled and will not leak anywhere.
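The ?(T | nil) shape can be sketched in Rust as a `Result` wrapped around an `Option`. Everything here is an assumption made for illustration: the `Node` type models only the left link, `TooShallow` is a made-up error type, and `levels` counts steps down from the root rather than fixing the number 5.

```rust
// Hypothetical tree node; `Option<Box<Node>>` plays the role of `Node | nil`.
struct Node {
    value: i32,
    left: Option<Box<Node>>,
}

// Made-up error: the tree ended after `reached` steps down the left spine.
#[derive(Debug, PartialEq)]
struct TooShallow {
    reached: usize,
}

// Walk `levels` steps down the left spine, then report that node's left child.
// Outer `Result` = the optional (an error if the tree is too shallow);
// inner `Option` = the storable `T | nil`.
fn leftmost_child_at(root: &Node, levels: usize) -> Result<Option<i32>, TooShallow> {
    let mut node = root;
    for reached in 0..levels {
        match node.left.as_deref() {
            Some(next) => node = next,
            None => return Err(TooShallow { reached }),
        }
    }
    Ok(node.left.as_ref().map(|child| child.value))
}
```

The caller sees three distinct outcomes: an error (tree too shallow), a successful "no child", and a successful child value. Only the last two can be stored once the error has been dealt with.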

Somewhat related: a nil value should never be used to designate an all-encompassing "alternative value". Not having a nil value at all (but only a nil type) is a simple measure to ensure this. Thus, for example, all the I/O functions in one of the commits I refer to above should return some optional instead of nil.

Of course, this ephemeral optionals mechanism is totally agnostic of any exception-like, effect-like, Dylan-like or other mechanism for dealing with alternative computational branches. So there is no need to do anything on that side.

Any thoughts on this?

@dumblob, thanks for the insightful remarks.

First, please do not reach any conclusion on the design of the language based on any particular commit, in particular right now as I am refactoring tons of things 😊😊

Second, the general topic of error handling is described in the language design document here: http://c3d.github.io/xl/#error-handling. It is quite insufficient, but the general idea is I believe very similar to what you suggested, including the notation for what I called fallible types. The example given in the documentation for square roots is:

    R : real? := sqrt X

Where real? is a shortcut for real or error.

Ultimately, the intended return value for most I/O functions will be ok (I don't like that name, still thinking about a better one), defined as nil?, which is itself nil or error. So either the function returns nothing, or it returns an error. I am not far enough to support that type yet, so for now it is nil, but that's just a transient state.

So let me focus on the points where I see a divergence compared to what you wrote.

1. In XL, nil is not a null pointer, so the language itself avoids the billion dollar mistake. There is a nil type and a nil value. The nil type contains a single value, nil, which is a storage-less value. In XL, Option<T> would be T or nil, but that does not imply a null pointer, except as a possible optimization when T is represented by a non-null address.
2. What may have confused you is that there was a recent optimization to unify in the parse tree nil with a null pointer, precisely because the parse tree follows rules which make the optimization above valuable, notably with respect to how XL represents symbol tables. Don't read too much into these changes.
3. I understand the benefits you see in not "reifying" error values, but in the case of XL, the issues you point out are handled by a special evaluation rule which states that all but the last elements in a sequence must evaluate as either nil or an error value. If it evaluates as nil, then the next statement is evaluated. Otherwise, the error is returned. You can see this as Rust's ? being implicit based on the type of the returned value. An example of use is given for reading a value from a file.

So in short, I believe that the XL error type, combined with the special evaluation rule, has the benefits of your ephemeral optionals, but instead of ephemerality coming from a restriction ("you can only return them"), it comes from a positive feature that programmers can use ("if you don't deal with an error, it is returned, and the type system checks that someone catches it, but you can also easily store it if you want").
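As a rough Rust analogue of that evaluation rule (a sketch only, not XL semantics): `Status` stands in for XL's ok, i.e. nil?, and `step` and `run` are hypothetical names. The difference is that Rust needs an explicit `?` on each statement, whereas the XL rule described above would apply it implicitly.

```rust
// Stand-in for XL's `ok` (that is, `nil?`): success carries no data,
// failure carries an error. Using `String` as the error is an assumption.
type Status = Result<(), String>;

// Hypothetical step that records its name and fails on request.
fn step(log: &mut Vec<&'static str>, name: &'static str, fails: bool) -> Status {
    log.push(name);
    if fails {
        Err(format!("{} failed", name))
    } else {
        Ok(())
    }
}

// Every statement but the last yields "nothing or an error"; the first
// error ends the sequence and becomes the result of `run`.
fn run(log: &mut Vec<&'static str>, fail_second: bool) -> Status {
    step(log, "open", false)?;
    step(log, "write", fail_second)?;
    step(log, "close", false)
}
```

When the second step fails, the third never runs and the caller receives the error. Nothing stops the caller from storing that `Status` in a variable, which matches the "you can also easily store it" part of the argument.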

I hope that I addressed the essence of your remarks. If I missed some subtle aspect of it, let me know.

Thank you for the thorough answer. XL is so cool and unique that I'll need to think it through. I'm really excited so far, though!

I'll start my thinking process with implications of "but you can also easily store it if you want" in the context of XL (in other languages this is what causes all the problems - hidden nil, if-err-then-else boilerplate, not seeing in the code that something might be actually an error and thus unwillingly not handling it, the extent to which the compiler enforces handling of errors, etc.).

Second thing I'll think about is the "in XL nil value (not nil type) can be used alone instead of always with some other value" and its implication on the use of the nil type in XL.

First question - is the := notation just a syntactic sugar for "immutable is" (i.e. identical with the only difference that the identifier can't be "redefined" any more in the current nor any nested context)? If not, what are the differences? Depending on the answer to the previous question, I might also ask how could the compiler prevent one using is instead of := and vice versa?

Second question is how do I ensure the function I'm currently implementing is a pure one if nil is being automatically propagated breaking the purity?

Third question is whether a variable holding an error value is being immediately automatically propagated or first at the place where it's being used (either being written to or read from)?

If the latter, then what's the behavior if I'll pass a real? variable to a function accepting only real?

I understand it's the first case according to your description in point (3) in your comment https://github.com/c3d/xl/issues/46#issuecomment-864176500 above. But bear with me - I'll be loud with my thoughts :wink:.

Fourth question: could you (dis)approve my understanding that nil value fundamentally exists because it's impossible to make a function returning just the type error alone without any additional "meaningful type" (e.g. because error type is always an error and can never have the meaning of "no error") - thus the desire to represent the "non-error" result (because an error type can't convey this information) even for functions which don't return anything (but may fail - thus error).

In other words XL treats nil as always a non-error information. The question is whether there is any sensible use case to store the information "hey, that function was a procedure and ran successfully" (i.e. allow assignment of nil to a variable)? I'd say there is no need for it as this is an implicit concept in synchronous execution of code. It's implied by the fact that the next statement/expression executes.

This assumes nil value is not coercible with any other value (in other words that nil value truly represents the concept of void/nothing). Not even coercible with true/false nor integer nor enum nor anything else.

And because it doesn't provide any useful information, I'd rather disallow it for safety reasons (one might get easily fooled e.g. by blindly saving the result of the call and passing it further along thinking it's useful information, leading just to confusion).

From my understanding nil value is good just for returning it (because of the special evaluation rule), but not storing/assigning it nor for passing it to some function.

Are these observations correct? Could you correct them if not? Thanks!

> First question - is the := notation just a syntactic sugar for "immutable is" (i.e. identical with the only difference that the identifier can't be "redefined" any more in the current nor any nested context)? If not, what are the differences? Depending on the answer to the previous question, I might also ask how could the compiler prevent one using is instead of := and vice versa?

Frankly, I'm still thinking about the semantics there.

The part that is pretty much settled is that is is intended to be the definition of a constant. It is also a declaration, and like any XL declaration, is visible from the code that precedes it in the same scope. Notice that you wrote "can't be redefined anymore in the current nor any nested context", and the part "nor any nested context" is still being evaluated, but likely to be true. In other words, you probably won't be able to have two nested for loops both defining a loop variable named I. End of digression.

By contrast to is, := is an assignment, but it does not declare the variable. It overwrites what is on the left, which cannot be constant and must already exist. And that's where the troubles start. The problems being a) how you declare something that is variable and b) how you deal with ownership.

For problem a), the current thinking is to use the type annotation notation to declare variables, i.e. X:integer. There are problems with that, e.g. I'd prefer if a parameter X:integer was not variable unless explicitly marked as out or in out, so now X:integer has a different meaning in a parameter list and freestanding. Much like C, but I still don't like it. A second issue is whether we require initialization, e.g. X:integer := 0, or whether there is a default value if not given. A third issue is that I like the Go approach of using := for initialization and = for assignment, except that I don't like their notations. I have toyed with the idea of <- for assignment and := for variable declaration. Or with the idea that the first time a name is used in a scope, it implicitly declares the variable, but that would be so error-prone! Or var X is 0; X := 1. And so on. So many possible choices.

For problem b), there are at least two different semantics, copy for something like X := 0.3 and move for something like MyPicture := PictureFromFile "foo.png". The current documentation suggests that :+ would force copy, :< would force move, and := would be the best choice for the type. I wrote that documentation, but I really don't like it much, so still looking for better choices. Another option would be move X := Y if you want to force a move.

Bottom line, what distinguishes := and is is whether the value is constant or not. That's how the compiler can help you figure out errors.

That was your first question. On to the next ones ;-)

> Second question is how do I ensure the function I'm currently implementing is a pure one if nil is being automatically propagated breaking the purity?

nil is not automatically returned or propagated, only error is.

A function is pure if it has no side effect. This is what you are declaring when you mark a form with the function sugar. A compiler is encouraged to emit a diagnostic if you write this:

    function IsEven(N) as boolean is
        print "Testing if ", N, " is even"       // Warning: function has side effect
        N mod 2 = 0

The way to detect this is to check if the things you call within a function also have a function sugar attached to the declaration.

> Third question is whether a variable holding an error value is being immediately automatically propagated or first at the place where it's being used (either being written to or read from)?

It's really a matter of how you consume it.

If you have E is error "Ooops", then:

    print "E=", E

will print the error, because print takes error as a possible overloaded type.

On the other hand, E + 1 will cause a compile-time type mismatch, since there is no form that adds an integer and an error. That is, unless you add code like:

    E:error + 1 is 2

Same if you write sin(sqrt(X)) and X can be negative. That is a static type error, and you have to either address the possible error value from sqrt, or add an error-propagating sin by adding a declaration like the following to the standard overloads:

    sin E:error as error is E

If you do that, the type for sin(sqrt(X)) becomes real or error (aka real?).
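For comparison, here is a minimal Rust sketch of the same idea: a separate sine that accepts the fallible type and passes an error through untouched. `DomainError`, `checked_sqrt` and `sin_fallible` are hypothetical names, and Rust has no overloading, so the error-propagating variant needs its own name.

```rust
// Made-up error type for a square root outside its domain.
#[derive(Debug, PartialEq)]
struct DomainError;

// Hypothetical fallible square root, standing in for an XL `sqrt`
// whose result type is `real?`.
fn checked_sqrt(x: f64) -> Result<f64, DomainError> {
    if x < 0.0 {
        Err(DomainError)
    } else {
        Ok(x.sqrt())
    }
}

// Counterpart of `sin E:error as error is E`: apply sine to a value,
// hand an error back unchanged.
fn sin_fallible(r: Result<f64, DomainError>) -> Result<f64, DomainError> {
    r.map(f64::sin)
}
```

The composed expression `sin_fallible(checked_sqrt(x))` has type `Result<f64, DomainError>`, the analogue of real or error.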

> If the latter, then what's the behavior if I'll pass a real? variable to a function accepting only real?

It should not compile. You need to handle the error. You can brute-force it with something like sin(try sqrt(X) catch 0.0), which will return sin(0.0) in case of error.
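The brute-force fallback has a close Rust counterpart, sketched below under the assumption of a hypothetical `checked_sqrt` that reports negative input as an error. Calling `.sin()` directly on the `Result` would be rejected by the compiler, which mirrors the "it should not compile" behavior described above.

```rust
// Hypothetical fallible square root (an XL `sqrt` returning `real?`).
fn checked_sqrt(x: f64) -> Result<f64, String> {
    if x < 0.0 {
        Err(format!("square root of negative number {}", x))
    } else {
        Ok(x.sqrt())
    }
}

// Rough equivalent of `sin(try sqrt(X) catch 0.0)`: replace any error
// with 0.0 before the value reaches a function that only accepts a real.
fn sin_sqrt_or_zero(x: f64) -> f64 {
    checked_sqrt(x).unwrap_or(0.0).sin()
}
```

The fallback silently discards the error, so it is only appropriate where 0.0 is a meaningful substitute.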

> I understand it's the first case according to your description in point (3) in your comment https://github.com/c3d/xl/issues/46#issuecomment-864176500 above. But bear with me - I'll be loud with my thoughts 😉.

At the moment, none of this is implemented, and implementing it often reveals design errors.

> Fourth question: could you (dis)approve my understanding that nil value fundamentally exists because it's impossible to make a function returning just the type error alone without any additional "meaningful type" (e.g. because error type is always an error and can never have the meaning of "no error") - thus the desire to represent the "non-error" result (because an error type can't convey this information) even for functions which don't return anything (but may fail - thus error).

Actually, nil exists because I need a terminating type that represents "no data" in a variety of cases. Notably in generic code, it may make sense to write Nothing: nil, for example in a data structure.

> In other words XL treats nil as always a non-error information. The question is whether there is any sensible use case to store the information "hey, that function was a procedure and ran successfully" (i.e. allow assignment of nil to a variable)? I'd say there is no need for it as this is an implicit concept in synchronous execution of code. It's implied by the fact that the next statement/expression executes.

That observation is true for procedure evaluation. But consider the expression []. That's a block with what inside? The XL answer is that what is inside is nil.

> This assumes nil value is not coercible with any other value (in other words that nil value truly represents the concept of void/nothing). Not even coercible with true/false nor integer nor enum nor anything else.

Nothing is coercible in XL without you giving an implicit conversion operator (something like X:integer as real is ...). It would probably be bad practice to convert any data type to nil, although nothing in the language precludes it.

> And because it doesn't provide any useful information, I'd rather disallow it for safety reasons (one might get easily fooled e.g. by blindly saving the result of the call and passing it further along thinking it's useful information, leading just to confusion).

This can be useful in the case of generic code. Consider the following:

    Call(Callee, Args) is Callee(Args)
    Call(print, "Hello, sin(3)=", Call(sin, 3.0))
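The same point can be sketched in Rust, where the unit value `()` plays a role similar to nil: a generic forwarding helper has to be able to return whatever the callee returns, including "nothing". The names `call`, `record` and `demo` are hypothetical, and `record` appends to a vector instead of printing so that the effect is observable.

```rust
// Generic forwarding helper, mirroring `Call(Callee, Args) is Callee(Args)`.
fn call<A, R>(callee: impl FnOnce(A) -> R, args: A) -> R {
    callee(args)
}

// Hypothetical procedure: works through a side effect and returns `()`.
fn record(sink: &mut Vec<String>, line: String) {
    sink.push(line);
}

// Nested use, as in `Call(print, ..., Call(sin, 3.0))`: the inner call yields
// a number, the outer call yields `()`, and `call` treats both alike.
fn demo() -> Vec<String> {
    let mut sink = Vec::new();
    let y = call(f64::sin, 0.0);
    let nothing: () = call(|line| record(&mut sink, line), format!("sin(0)={}", y));
    let _ = nothing; // storable, but carries no information
    sink
}
```

If the "nothing" result could not exist as a value, `call` would need a second definition for procedures, which is the generic-code case referred to above.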

> From my understanding nil value is good just for returning it (because of the special evaluation rule), but not storing/assigning it nor for passing it to some function.
>
> Are these observations correct? Could you correct them if not? Thanks!

I think they apply to the non-generic procedure case.

That being said, nil in an interface is rarely seen. The procedure and to syntactic sugar make it disappear.
