dotnet / csharplang

The official repo for the design of the C# programming language
11.05k stars 1.01k forks source link

Proposal: never/bottom type #538

Open orthoxerox opened 7 years ago

orthoxerox commented 7 years ago

Previously discussed in https://github.com/dotnet/roslyn/issues/1226.

A new type that has no instances and is a subtype of every other type can be introduced, called never or bottom. It will be the official return type of the throw expression and will allow transforming return, break, continue and goto into expressions (see https://github.com/dotnet/csharplang/issues/176). This will allow them in all expression contexts where never is combined with another type (&&, ||, ??, ?:) or discarded (expression statements). Instantiating variables/fields of type never should be impossible.

It can either be implemented as a real CLR type (well, as real as void) or as a compiler trick (never-returning methods are actually void with a NeverReturnsAttribute).

MI3Guy commented 7 years ago

It would be really nice to be able to use never with variance. Then we could use IReadOnlyList as any type (or just class types?). This would, of course require CLR support.

HaloFour commented 7 years ago

@MI3Guy

I don't understand how "never" could work in generic containers? Would IReadOnlyList<never> indicate an empty list or something?

MI3Guy commented 7 years ago

@HaloFour

That's the idea. It would represent an empty list. It would only be possible to use never in covariant/out positions, of course. I tend to use it in Scala with ADTs/sum types for cases that represent errors. Perhaps not the most necessary feature as you could introduce a type parameter, but it more directly expresses intent and can avoid allocating duplicate objects as a single instance can be shared.

HaloFour commented 7 years ago

@MI3Guy

Java manages the same without a "never" type. That only really works in the JVM due to generic type erasure and the lack of generics over primitives. With the CLR IReadOnlyList<string> and IReadOnlyList<int> are fundamentally different and the latter is not permitted to participate in variance as a result. Between reference types they are more equivalent but I believe that's considered a JIT implementation detail and not one that can necessarily be relied upon.

orthoxerox commented 7 years ago

@MI3Guy could you please give an example? @gafter has mentioned variance in the original issue as well, but I find it hard to wrap my mind around the idea.

MI3Guy commented 7 years ago

@orthoxerox For instance:

public interface IResult<out T> { }
public sealed class ResultValue<T> : IResult<T> {
    // ...
    public T Value { get; }
}
public sealed class ResultError : IResult<never> {
    public string ErrorMessage { get; }
}

You could also add TError and make ResultValue specify never as that type.

HaloFour commented 7 years ago

Well, you also have that pesky issue of classes not allowed to be variant, only delegates and interfaces.

MI3Guy commented 7 years ago

True. I'll make it an interface.

DavidArno commented 7 years ago

Is there a reason why never would be needed here, rather than just using void (and thus, potentially allowing any statement to be treated as a void-returning expression, not just throw, break etc,)?

DavidArno commented 7 years ago

@MI3Guy,

I'm clearly missing something over your use of never in generics as the following can be done already:

public sealed class ResultError : IResult<object>
{
    public string ErrorMessage { get; }
}

but - presumably - that isn't what you are trying to achieve in your example.

MI3Guy commented 7 years ago

In that case, ResultError can only ever be used as an IResult<object>. It couldn't be used as a IResult<string> (or whatever class you wanted to use). You could, of course, declare ResultError with a type parameter.

public sealed class ResultError<T> : IResult<T> { ... }

But as I stated before, the use of never states more directly that ResultError does not actually contain a T and also having to specify the type parameter may result in more instances being created.

DavidArno commented 7 years ago

In your example, you have made IResult<> covariant. Make it contravariant and then IResult<object> can be used in place of IResult<string>:


public interface IResult<in T> { }
public sealed class ResultValue<T> : IResult<T>
{
    // ...
    public T Value { get; }
}
public sealed class ResultError : IResult<object>
{
    public string ErrorMessage { get; }
}

_ = new List<IResult<string>> {new ResultError()};

But again, I guess I'm just missing the point.

MI3Guy commented 7 years ago

Switching to contravariant makes ResultValue invalid. The Value property uses T in a covariant position. The never type is essentially the opposite of the object type.

DavidArno commented 7 years ago

The never type is essentially the opposite of the object type.

Oh! "A new type that has no instances and is a subtype of every other type". Got there in the end. Yes, I now get how it would work with covariant types. That's really nice: I could really use that! 😀

I'm not sure that bottom is a good name therefore as that implies the lowest base type (and being a biologist first, programmer second, I will always struggle to picture roots being at the top of a tree). So never works better IMO.

orthoxerox commented 7 years ago

Well, bottom is its official name in type theory, but since we call futures tasks I don't mind it being called never.

gafter commented 7 years ago

/cc @cston

JoergWMittag commented 7 years ago

Since @DavidArno stumbled over this, maybe it would be a good idea to make it more explicit:

A bottom type is a type which has

The bottom type represents a computation that doesn't return, e.g. an infinite loop, an error, an exit, etc.

Contrast this with a unit type which has

The unit type represents a computation that returns nothing. (In many languages, the unit type is written simply as the 0-tuple type and the unit value is written as an empty tuple, and in fact, it is isomorphic to an empty tuple: an empty tuple conveys no useful information, and it has only one value, or, all empty tuples are the same and indistinguishable.)

  unit bottom
returns nothing doesn't return at all
value singleton no values
represents side-effect non-termination / abnormal termination

Here's a question: what is the type of an empty list?

In Haskell, an empty list has the polymorphic ("generic" in C♯) type [a] (that's IEnumerable<A> in C♯). The empty list ([], pronounced "Nil") is a polymorphic value. But, in C♯, values can't be generic, only static types of references can.

So, second try: IEnumerable<object>, then? Well, at first glance it seems to make sense. But! We surely should be able to assign the empty list to a variable of type IEnumerable<string>, no? After all, an empty list is a list of (no) strings. However, IEnumerable<T> is covariant in T (as it should be, we don't want to, and in fact can't make it covariant), which means IEnumerable<object> is a supertype of IEnumerable<string>. Bummer.

If we want to be able to assign the empty list to a list of strings, we need something that is a subtype of string. If we want to be able to assign the empty list to a list of integers, we need something that is a subtype of integers. And so on … we need something that is a subtype of T for all T: and that's bottom.

In Scala, bottom is called Nothing, which captures this particular usage nicely. What is the type of the empty list, i.e. a list of nothing? Well, in Scala, it's List[Nothing].


ChangeLog:

Sorry, @svick, @yaakov-h, @HaloFour for the confusion, I used List more as a generic (hah!) example, I didn't specifically mean the List BCL class. I replaced it with IEnumerable<T> now, which is the real equivalent to a Haskell list anyway (being (potentially) lazy).

lachbaer commented 7 years ago

@JoergWMittag Nice explanation! ,Nothing in VB is null in C#. List<null> would make sense, wouldn't it? Or are there situation where the null keyword as a null/bottom type would conflict?

svick commented 7 years ago

@JoergWMittag System.Collections.Generic.List<T> can't be covariant (even if I ignore the fact that no class can be covariant in .Net), because it's mutable. It's not a good counterpart to Haskell's [a].

Though it could (in theory, again no class can be variant) work with FSharpList<T>, or possibly ImmutableList<T>.

yaakov-h commented 7 years ago

Why can't you have List<nothing> just because it's mutable? If nothing has zero values, you can never add an item to the list, so it will always be empty...

HaloFour commented 7 years ago

@yaakov-h

Variance in the CLR is enforced at the type level, not the method level. A List<T> supports read and write operations so it is inherently invariant. There's no facility to allow variance by forbidding specific method calls, like there is in Java or the like.

svick commented 7 years ago

@yaakov-h You could have List<nothing>, but you couldn't use variant conversions on it, which is what @JoergWMittag was talking about.

Consider:

List<nothing> listOfNothing = new List<nothing>();

List<MemoryStream> listOfStream = listOfNothing;
listOfStream.Add(new MemoryStream());

List<string> listOfString = listOfNothing;
Console.WriteLine(listOfString[0]);
mattwar commented 7 years ago

Can someone explain how having a bottom/never type would benefit the language and its users?

orthoxerox commented 7 years ago

@mattwar

  1. It will allow to explicitly mark functions that never return. This will potentially allow CLR to reuse the frame on the call stack if someone writes (or automatically rewrites) their programs in CPS. It will also allow the compiler to find additional unreachable code.

  2. It will make break, continue, throw etc expresssions' treatment by the compiler straightforward.

mattwar commented 7 years ago

@orthoxerox Are functions that are known not to return common enough that we'd need special language syntax to call them out? As for your #2, the compiler can model these branch expressions without the need of introducing new language keywords.

gafter commented 7 years ago

@mattwar There are a number of such methods in the BCL, but that is not the principal motivation for this feature. It is more motivated by generics scenarios where there is a top type (object) which is very useful with generics, thank you very much, but there is no corresponding bottom type (this proposal). Also it would be particularly useful to expand the set of expression-based constructs to include those traditionally done using statements.

mattwar commented 7 years ago

@gafter assume I'm a user of generics, and writing a generic method or class. How is a bottom type going to affect the code I write? Where does it show up?

JoergWMittag commented 7 years ago

@mattwar: Assume you write some sort of immutable container. Think IEnumerable<out T>, IReadOnlyList<out T>, or a potential future addition to the BCL of IOptional<out T>. Especially for IOptional<out T>, you want to have an empty value, usually called None. You don't want to have multiple empty values, i.e. an empty None<int>, None<string>, None<Foo>.

If you want a single value that you can assign to a reference of type IOptional<int>, IOptional<string>, IOptional<Foo>, etc. then that value must be of a type that is a subtype of IOptional<T> for all T. Currently, you cannot have such a value. (Except null, which doesn't really have a type, but if it did have a type then its type would be a subtype of all reference types.) However, if you had:

class None : IOptional<nothing> {}

where nothing is our hypothetical bottom type, then a singleton instance of None could be used as a single unique value that denotes the absence of an optional value.

Similary, if you want to have a single empty list to which you can polymorphically prepend an int or a string or a Foo to get back an IMyImmutableList<int>, IMyImmutableList<string>, or IMyImmutableList<Foo>, then you want your singleton empty list to be of type IMyImmutableList<nothing>.

Or, put it another way: null is occasionally useful, albeit badly implemented. The "type" of null is a bottom type (for reference types). So, clearly a bottom type is useful. Exposing it to the programmer means that you can write null-like abstractions.

Are functions that are known not to return common enough that we'd need special language syntax to call them out? As for your #2, the compiler can model these branch expressions without the need of introducing new language keywords. [bold emphasis mine]

Actually, the whole point of this proposal is to make it possible for library authors to implement these abstractions so that you don't have to introduce "special language syntax" or "new keywords". We just want to add one type to the BCL, and one rule to the type system.

Think about it this way: we do have "special language syntax" and a keyword because of the lack of a Unit type, namely the void keyword for methods that don't return a useful value. The lack of a proper Unit type also leads to code duplication (see Func vs. Action). We want to avoid the same mistake with a bottom type.

whoisj commented 7 years ago

So this is akin to Rust's -> ! notation, which basically says "when you call this function, processing terminates?

DavidArno commented 7 years ago

@JoergWMittag,

Your example shows what I'm beginning to feel is a serious flaw to this idea. My first thought on how this feature could be used was for the framework to contain:

public static readonly IEnumerable<never> empty = ...;

as a single occurrence of all empty collections, lists etc. That was fine, so then I thought about the idea of a single none for all Option<T>'s, but I just couldn't work out how that could be coded. The reason why is that I wasn't introducing a new interface. Option<T> makes sense as a struct, but even if it's a class, most implementations would be immutable. Creating an interface for a single immutable type seems like a code smell to me. So I really don't like IOption<out T>. It's too high a price to pay just to save on a few extraneous instances of none.

I've not really looked into the reason why only interfaces can declare variance, but to my feeling is that this feature will have a vanishingly small set of use cases unless variance on structs and classes can be introduced.

JoergWMittag commented 7 years ago

@whoisj: I am not sure. Is ! a bottom type in Rust? Chapter 7.2 Subtyping of the Language Reference makes no reference of ! and neither does the section on Function Types in chapter 7.1 Types – in fact, I couldn't find any mention of it anywhere in the entire reference.

The section on Diverging Functions in chapter 3.2 Functions of The Rust Programming Language mentions it, but doesn't explicitly say that it is a bottom type either. The last paragraph says: "A diverging function can be used as any type" but it is not quite clear what that means exactly in terms of typing rules.

It is not quite clear to me whether ! is a proper bottom type. Can I create a singleton instance of ImmutableList<!> and assign it to a variable of type ImmutableList<T> for any and all T?

I don't think so. The section on subtyping says that Rust doesn't really have subtyping (except for lifetimes), and the section on diverging functions says that ! is "special syntax", which is precisely what this proposal does not want. That's what we have with void for methods that return no useful value and it's a mess. (Rust and C♯ are kind-of duals in this respect: Rust has a proper unit type, C♯ doesn't, instead it has special syntax and semantics for void. This proposal is about adding a proper bottom type, Rust instead has special syntax and semantics for !.)

DavidArno commented 7 years ago

So in answer to my question on why variance is only on interfaces, I found this SO answer by Eric Lippert. As Eric says:

  1. Contravariance on a class makes no sense as a write-only object isn't much good for anything,
  2. Covariance on classes didn't make much sense back in the days of C# 2 when generics were introduced as immutable data types were rarely ever used in C# back then.
  3. These days things are very different and lots of us create immutable generic types so covariant classes make a lot of sense now days.

So in my view at least, it makes total sense to add support for covariant classes (and structs? Haven't found any details on whether that's possible yet) at the same time as adding the never type. Then this becomes a really nice feature.

whoisj commented 7 years ago

@JoergWMittag Rust's ! is rather useful for the compiler to understand that a given function will terminate the process. Very similar to throw in C#, except it's at the function boundary.

After reading @gafter explanation, ! and bottom are not the same thing, though there is some potential overlap. 😕

mattwar commented 7 years ago

I'm still getting the impression that this feature is being requested out of some sense of completeness with regards to a type system and not for any practical purpose.

JoergWMittag commented 7 years ago

@mattwar: It's always hard to speculate about the future, and what might or might not happen if some type did or didn't exist. Some numbers from Scala: in the standard library, there are 37 instances of Nothing being used in a generic type (i.e. SomeGenericType[Nothing]), 9 instances of it being used as a return type of a method, 8 instances of it being used as the parameter type of a lambda, 2 instances of it being used as the return type of a lambda, and 4 instances of it being used as a lower type bound. @gafter already mentioned that there are existing methods in the BCL that could benefit from returning bottom.

Also note that "some sense of completeness" and "practical purpose" are not disjoint. If you have a bottom type, then you know that you always have a lower type bound, no matter what. This can not only simplify the type system, it can more importantly simplify algorithms operating on types, which in turn translates into simplifying code within the compiler. For example, when computing a lower bound, you can remove all guards about not finding one, because you know there's always one. It is a rare thing that you can remove complexity by adding a feature!

HaloFour commented 7 years ago

@JoergWMittag

I'd like to see those cited use cases that are specific to the CLR. I don't trust the numbers from Scala (or Java) because their generic implementation is drastically different which enables different scenarios.

Without practical use cases that would work with the CLR it's really hard to judge the usefulness of this proposal. I don't think you could have a List<nothing> in C# and expect it to behave anything like a singleton empty list like you can in Java/Scala.

DavidArno commented 7 years ago

@HaloFour & @mattwar,

Some examples of how this type could be used:

  1. Environment.Exit and the like naturally return Never, ie they don't return. Likewise, one might have a method who's sole job is to act as a formatting wrapper around a throw. By changing the signature of such methods from void M() to Never M() the compiler and analysers can know that these methods do not return and can treat code that follows as unreachable.
  2. By having break, continue, goto, throw and the void version of return all be able to act as expressions of type Never, the current limited "throw expression" feature could be expanded to have any of those terms act as part of any expression. For examples of how this could be used, see Proposal: return, break and continue expressions.

However, I find myself struggling to reconcile @gafter's claim that this feature is mainly "motivated by generics scenarios" with finding anyway that this feature could be used with types in the BCL. Even if covariance were added to classes and structs, all the obvious benefactors are either directly mutable, or they implement interfaces that promise mutability. Even value tuples have been made mutable, so even they couldn't benefit from this either.

If covariance rules are relaxed and yet another whole new set of immutable collection types were added to the BCL and yet another equivalent to .net standard were created that promoted these types over historic collections and the "mutable by default" mantra were scrapped, then this feature could be important for .NET generics. But it would be easier to u-turn an oil tanker in a paddling pool than to get Microsoft to make such a radical change. So I think @gafter wrong. This feature will likely only show up in user-defined generics and would mainly be used for points 1 & 2 above.

HaloFour commented 7 years ago

Sure, both of those examples are useful, but I don't believe that either require CLR changes. As demonstrated by C#7, the compiler can already consider control statements as expressions. I don't see why that couldn't be expanded to return, break or continue in the same manner. As for non-returning methods like Environment.Exit, why couldn't an attribute suffice?

I believe that there are compelling use cases here, but not with never on its own. Changes to variance at both the CLR and compiler level could enable the Java-like scenarios. But beyond that I don't see the big advantage here. If I'm missing something I welcome enlightenment.

DavidArno commented 7 years ago

@HaloFour,

Completely agree on all those points. I started off really excited by this idea, but then my enthusiasm has waned as every scenario for its use (save for things that could be achieved without Never) has proved problematic.

alrz commented 7 years ago

A more general alternative to never type is https://github.com/dotnet/csharplang/issues/105 if its static analysis affect control flow.

void JustThrow() ensures false
{
  throw ...;
}

The ensures clause dictates that the followed expression must be statically proved to be true when the control leaves the method. So, ensures false means that the control must not leave the method.

Then, the compiler can detect unreachable code or produce appropriate warnings based on metadata emitted by method contracts.

JustThrow();
// unreachable code

The extended static analysis here allows to imitate a "conditionally never" type, for example,

void Assert(bool condition) ensures condition
{
  if(!condition)
    throw ...;
}

The compiler makes sure that the control does not leave the method if it's not statically proved that condition is true, therefore an invocation of Assert(false) causes the subsequent code to be unreachable.

Furthermore, non-nullable reference types are also a special case of method contracts, providing a concise syntax for requires param != null, etc.

whoisj commented 7 years ago

@DavidArno @alrz

Environment.Exit and the like naturally return Never, ie they don't return. Likewise, one might have a method who's sole job is to act as a formatting wrapper around a throw. By changing the signature of such methods from void M() to Never M() the compiler and analysers can know that these methods do not return and can treat code that follows as unreachable.

Hence my query above about the relationship between bottom and Rust's ! return type. This seem eminently useful, especially for code analysis, as it conveys the actual meaning.

DavidArno commented 7 years ago

@whoisj, Having now read up more on both, I feel we can say that one of the purposes of a bottom type is to achieve the same as -> ! in Rust. It has other uses too, though.

But as @HaloFour and @alrz say, that feature could be achieved just through compiler changes, rather than the CLR change that Never would require.

svick commented 7 years ago

@alrz I would be wary of tying contracts to the definition of unreachable code.

With contracts, it's often not clear what exactly can the static analysis infer. I think that unreachable code, and especially definite assignment (which is related), should be very clearly defined. It should not depend on the quality of the static analysis of contracts.

alrz commented 7 years ago

It should not depend on the quality of the static analysis of contracts.

It actually does. Currently subsumption checking is limited to switch statements. A smarter flow analysis would detect such cases in other places too.

if (x is IEnumerable<char>) return; 
if (x is string) { /* unreachable code (i.e. expression is always false) */ }

This could be done as part of warning waves and it certainly helps to find bugs sooner than later.

dougmill commented 7 years ago

Here's another use case I'm surprised to see no mention of: when you just don't care what the type parameter is because the specific code involved does not directly interact with the methods that use the generic parameter. In many cases this can be handled by making your class or method generic and passing the buck to whatever calls it, but that doesn't work when the same object needs to interact with multiple different variants.

For an example from actual code in an active project, I have a Dictionary that has a covariant type for the keys, and I use it strictly for looking up the values by key - and other code may call this with objects of arbitrary different variants. With it being covariant, I can handle this like so:

Dictionary<CovariantType<object>, ValueType>

If my key type were contravariant instead, I have no corresponding option. I can work around it by defining a non-generic parent interface, but that is not always an option, in particular when the type is coming from a library I don't control. With this proposal implemented and working with variance, I would be able to do this:

Dictionary<ContravariantType<never>, ValueType>

This still wouldn't work with invariant types, though - that would need something like Java's ? wildcard, which would actually be preferred for both of these cases too.

tannergooding commented 6 years ago

I think, as mentioned in a similar proposal here: https://github.com/dotnet/csharplang/issues/739, the primary use case for such a feature would be around definite assignment (which is possibly more relevant today with the scoping rules for pattern matching).

CyrusNajmabadi commented 6 years ago

Dictionary<ContravariantType, ValueType>

This sounds like it would need runtime support.

gulshan commented 6 years ago

I think making literal null having a type null, which is a bottom type to all reference types will be useful for "nullable/non-nullable references" feature.

Richiban commented 6 years ago

One of the easiest-to-understand examples of where you would use a bottom type is with the Option discriminated union. Let's take the F# definition:

type Option<'t> = None | Some of 't

let a = None

What is the type of a?

If we have a bottom type called Undefined (that's what Haskell calls it and I think you'll agree later that it reads quite well), then a has type Option<Undefined>. That means that, thanks to covariance, it is assignable to any other type of option, which is correct. There's nothing wrong with the following:

let a = None

let b : Option<string> = a
let c : Option<int> = a

In a similar way, the empty list can be thought of as a singleton that has type List<Undefined>. If I go back to C# now (because I don't need to use discriminated unions for this demo):

var list = List.Empty; // This has type List<Undefined>

var first = list.First(); // The compiler can prove that lines after this are unreachable, because 'first' has type 'Undefined'
LYP951018 commented 6 years ago

What's the status of this? NeverReturnsAttribute should be handy.