fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features
344 stars 21 forks source link

Add nullable reference types [RFC FS-1060] #577

Open 0x53A opened 7 years ago

0x53A commented 7 years ago

RFC: https://github.com/fsharp/fslang-design/blob/main/RFCs/FS-1060-nullable-reference-types.md

[Updated description by @dsyme]

C# 8.0 is likely to ship with a feature where new C# projects opt-in by default to making reference types "without-null" by default), with explicit annotations for reference types "with-null". The corresponding metadata annotations produced by this feature are likely to start to be used by .NET Framework libraries.

F#-defined types are "without-null" by default, however .NET-defined types like string and array and any other types defined in .NET libraries are "with-null" by default. The suggestion is to make new F# 5.0 projects opt-in to having .NET reference types be "without-null" by default when mentioned in F# code, and to interoperate with .NET metadata produced by C# 8.0 indicating when types are "with-null".

Terminology and working assumptions

These terms are used interchangeably:

Likewise

For the purposes of discussion we will use string | null as the syntax for types explicitly annotated to be "with null". You will also see string? in some samples

We will assume this feature is for F# 5.0 and is activated by a "/langlevel:5.0" switch that is on by default for new projects.

Proposed Design Principles

We are at the stage of trying to clarify the set of design principles to guide this feature:

  1. We should aim that F# should remain "almost" as simple to use as it is today. Indeed, the aim should be that the experience of using F# as a whole is simpler, because the possibility of nulls flowing in from .NET code for types like string is reduced and better tracked.

  2. The value for F# here is primarily in flowing non-nullable annotations into F# code from .NET libraries, and vice-versa, and in allowing the F# programmer to be explicit about the non-nullability of .NET types.

  3. Adding with-null/without-null annotations should not be part of routine F# programming

  4. There is a known risk of "extra information overload" in some tooling, e.g. tooltips. Nullability annotations/information may need to be suppressed and simplified in some types shown in output in routine F# programming. There is discussion about how this would be tuned in practice

  5. F# users should primarily only experience/see this feature when interoperating with .NET libraries (the latter should be rarely needed)

  6. The feature should produce warnings only, not hard errors

  7. The feature is backwards compatible, but only in the sense all existing F# projects compile without warning by default. Placing F# 4.x code into F# 5.0 projects may give warnings.

  8. F# non-nullness is reasonably "strong and trustworthy" today for F#-defined types and routine F# coding. This includes the explicit use of option types to represent the absence of information. The feature should not lead to a weakening of this trust nor a change in F# methodology that leads to lower levels of safety.

Notes from original posting

Note that there are a few related proposals (see below), but I couldn't find an exact match in this repo.

One reason I am adding this proposal is that I have lately been working in a mixed C# / F# solution, where I DO need to work with null more often than I like.

The other reason is that we should be aware of the parallel proposal for C# (https://github.com/dotnet/csharplang/issues/36).

My main question is: Is there a minimal implementation for F#, that eases working with C# types NOW, without blocking adoption of future C# evolutions?

The existing way of approaching this problem in F# is ...

Types declared in F# are non-nullable by default. You can either make them nullable with AllowNullLiteralAttribute, or wrap them in an Option<'T>.

Types declared either in C#, or in a third-party F# source with AllowNullLiteralAttribute are always nullable, so in theory you would need to deal with null for every instance. In practice, this is often ignored and may or (often even worse) may not fail with a null-reference exception.

I propose we ...

I propose we add a type modifier to declare that this instance shall never be null. This will be most useful in a function parameter.

Because I do not want to discuss the actual form of that modifier (attribute, keyword, etc), I will use 'T foo as meaning non-nullable 'T, similar to 'T option.

Example:

let stringLength (s:string foo) = s.Length

Calling this method with a vanilla string instance would produce a hard error.

How can a nullable type be converted to a non-nullable?

There should be at least a limited form of flow analysis.

Two examples would be if-branches and pattern matching:


let s = ... // vanilla (nullable) string

(* 1 - pattern matching *)

match s with
| null -> 0
| s -> stringLength s

// Note that this does not change the original variable s - the new s shadows the old s. I can also use a different variable name:

match s with
| null -> 0
| s2 ->
    // this fails:
    // stringLength s
    // this works:
    stringLength s2

(* 2 - explicit null-checking *)

if s = null then
    ()
else
    // now the compiler knows s is non-nullable
    stringLength s

There probably also needs to be a shorthand to force a cast, similar to Option.Value, which will throw at runtime if the option is null.

Runtime behavior

The types are erased to attributes. You can't overload between nullable and non-nullable (may in combination with inline?) When a non-null type modification is used in a function parameter, the compiler should insert a null-check at the beginning of the method. The compiler should also add a null-check for all hard casts.

Pros and Cons

The advantages of making this adjustment to F# are ...

stronger typing.

The disadvantages of making this adjustment to F# are ...

yet another erased type modifier, like UOM, aliases.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): L

Related suggestions: (put links to related suggestions here)

https://github.com/fsharp/fslang-suggestions/issues/552: Flow based null check analysis for [<AllowNullLiteralAttribute>] types and alike

Affidavit (must be submitted)

Please tick this by placing a cross in the box:

Please tick all that apply:

dsyme commented 6 years ago

Adjusted title to reflect that this will need to cover interop with proposed C# 8 non-nullable reference types https://blogs.msdn.microsoft.com/dotnet/2017/11/15/nullable-reference-types-in-csharp/

cartermp commented 6 years ago

@dsyme given that this is going to be a headlining feature for C# 8 (and all of .NET that compiles with it), I suggest we add proposed-priority as a label.

cartermp commented 6 years ago

Additional notes after speaking with @TIHan about how we might want to do this:

Syntax

Use ? in the type annotation, just like C#:

let doot (str: string?) = // Nullable
    ...

let hoot (str: string) = // Non-nullable
    ...

When the compiler detects the attibute emitted by nullable types, it will infer the reference type to be nullable. The syntax should be reflected in tooling as well (QuickInfo, SignatureHelp, etc.).

Behavior

From a "referentially transparent" point of view, the behavior of this should be the same as C#. Under the covers, C# will be doing vastly different things because they did not start with declared reference types being non-null by default like F#. We'd work with the C# team to make sure every behavioral edge case is the same, but the underlying implementation for F# is likely to be far simpler.

AllowNullLiteral

This is a strictly better solution than decorating types with [<AllowNullLiteral>]. It's very rare that someone wishes to assign null to a reference type all over the place. Rather, we think that it's ad-hoc nullability that people want. This feature accomplishes that.

Given this, a goal of this feature should be to allow F# developers to phase out their usage of [<AllowNullLiteral>] and eventually allow us to deprecate the feature in a future future version of F# (i.e., much further in the future).

Additionally, we might want to consider types that are nullable equivalent to types that are decorated with [<AllowNullLiteral>]. Nullable types that are also decorated with [<AllowNullLiteral>] would be redundant, so we could consider having a warning for that. It's hard to say how much additional work could be done here (e.g., code fix to remove [<AllowNullLiteral>] under some circumstances).

dsyme commented 6 years ago

Use ? in the type annotation, just like C#:

let doot (str: string?) = // Nullable

let hoot (str: string) = // Non-nullable

@cartermp As a starting desire that is reasonable. However there are significant problems.

One is that in F# code today string is definitely nullable, because it is from a .NET DLL. This leaves an awkward compat problem. Additionally there is a risk that ? annotations will become pervasive and awkward in F# code, because they "flow in" whenever interoperating with .NET. This will create conceptual baggage for the F# programmer.

For example, if .NET defines an interface today:

type IA = 
     abstract M : System.String -> System.String

Are those non-nullable strings or not? And can I implement that using this without getting a nullability warning?

{ new IA with 
      member x.M(s:string) = s
}

In principle the answer is "no" as we must take into account nullability:

{ new IA with 
      member x.M(s:string?) = s
}

But this breaks code. Emitting a warning for this kind of code is plausible but it would require very careful tuning. We could plausibly make it work, but nullability warnings would surely need to be off-by-default. And non-nullability information would not be emitted at interop boundaries.

As mentioned by the OP, another approach would to progressively opt-in to explicit non-nullability for .NET-defined types. So

Use ? in the type annotation, just like C#:

let doot (str: string?) = // explicitly nullable, warnings if used as non-nullable

let hoot (str: string) = // nullable but not explicitly so, no warnings associated with this

let woot (str: string!) = // non-nullable, warnings if matched against null

let hoot2 (str: FSharpThing) = // non-nullable but not explicitly so

Stepping back a bit, I think we should first try to formulate some guiding principles. Here is a starting set:

  1. F# should remain as simple to use as it is today
  2. The value for F# here is primarily in flowing non-nullable annotations into F# code from .NET libraries, and vice-versa, and in allowing the F# programmer to be explicit about the non-nullability of .NET types.
  3. Adding nullability/non-nullability annotations should not be part of routine F# programming (see principle 1)
  4. Nullability annotations/information should not appear in types shown in output in routine F# programming (see principle 1)
  5. F# users should primarily only experience/see this feature when interoperating with .NET libraries (the latter should be rarely needed)
  6. The feature is backwards compatible, all existing F# projects compile without warning by default
  7. The feature should produce warnings only, not hard-errors

I'm not saying those are the exact principles we want. But if we draw some bounding principles like these then design discussions may become tractable. In particular I think clearly articulating (2) is really important - what is the actual exact value this feature is bringing to the F# programmer? These annotations are valuable because they might help the F# programmer catch errors, e.g. where they pass a null to a C# library. Likewise C# people will be more often forced to interact with F# code using non-nullable semantics.

At the technical level, and as mentioned by @0x53A, it will also be crucial to consider how you get from string? to string, i.e. how you prove something is non-null. In F# by far the most natural place for this is pattern matching,

    match s with 
    | null -> ...
    | s2 -> ...  // s2 suddenly has non-null annotation

but pressure will also come to hack in a range of other ways to prove non-nullness, starting with

    match s with 
    | null -> ...
    | _ -> ...  // s suddenly has non-null annotation even though it is not re-bound

and then this:

    if s = null then 
        ...
    else
        ...

and then this

    if String.IsNullOrEmpty(s) then 
         ...
    else
        ...  // s suddenly has non-null annotation

and then many other variations. The second and third are vaguely plausible (but have problems) but the last and its more elaborate forms are deeply foreign to F# and the entire typed language design tradition it embodies. It is also likely to cause havoc in the F# compiler to try to implement this sort of thing.

We should also be aware that allowing FSharpThing? will create tensions. For example, it will create tension with F# option types. Suddenly users are faced with a choice about the "right" way to express optionality in F# code. Many will naturally gravitate to the one that is the least number of characters, which means there is no longer "one right way". This might mean Nullable<FSharpThing> is preferable.

0x53A commented 6 years ago

I would really like, even if off-by-default and configurable through a compiler switch, a way to insert hard null-checks at all public boundaries.

That is

let f (s : string!) =
    // do something with s

should be translated to

let f ([<TheAttributeCSharpUsesToKeepTrackOfNullability>] s : string) =
    if s = null then raise (ArgumentNullException("s"))
    // do something with s

C# explicitly opted to not do this, probably for performance reasons. But not having this at least as an option would mean that you have a type system you can't really trust at runtime.

There is no semantic impact of the nullability annotations, other than the warnings. They don’t affect overload resolution or runtime behavior, and generate the same IL output code.

(https://blogs.msdn.microsoft.com/dotnet/2017/11/15/nullable-reference-types-in-csharp/)

smoothdeveloper commented 6 years ago

@dsyme I guess you are aware of it, but want to confirm in context of your answer, C# will not consider all non annotated to be non null, unless you enable a new compiler flag, which you should have on by default on new projects, and ideally should convert projects to have (kind of the same way you don't necessarily want all your vb projects with "strict off" and tend to set "strict on" whenever possible).

I've used a .net language with non nil reference semantics (cobra language) and I must say that I liked it a lot, used properly, it provides a zero cost abstraction compared to usage of option type.

Now, I'm not saying that F# should go that way, but it is good to consider the fact that it is coming to C# as maybe more than "just" an interop concern and potential positive aspects if F# were to also introduce that specific compiler flag in future.

As of now, I'm not clear if the shorthand syntax is a must have for this feature in terms of having things ready for most basic interop, I'd consider more important to establish clearer picture of aim of this support, and @dsyme's point 2 is spot on.

I'd add to that point 2 that having F# types flow to more general .NET the same nil/nonnil should also be a goal, I don't think it breaks the ethos of how C# is going to handle this on their end (no runtime behaviour, just sane warnings to find potential bugs and be more explicit about intent).

@0x53A in cobra you can do equivalent of this:

let f (nillable: string?) =
   let nonNillableString = nillableString to !
   // ...

that to ! would introduce the runtime check, and reads like a cast (done via to) in that language, I guess in F# we may consider :!> to do that along with a fsharp.core function akin to box / unbox.

Looking forward to ideas and proposals in this thread 😃

cartermp commented 6 years ago

@dsyme Regarding principles:

F# should remain as simple to use as it is today

There's only so far as this can go. The primary platform that F# runs on is moving in this direction, and it will only be a matter of time before reference types are considered non-null by default.

I would actually argue that this is easier to deal with conceptually, because the implicit null carried with any reference type whenever interacting with non-F# code is far too easy to forget and be bitten by later (see this issue where we had a crashing Annotation View in VS due to forgetting to account for null in otherwise normal F# code).

Unfortunately, it's still a bit more complicated that just this, because if you ignore all the warnings (we're going warning-level, not error-level, that's something I will pick my hill to die on) you can still crash. But at least it's pointing out that the world is a bit more complicated rather than hiding that from you only to make you deal with a NullReferenceException later.

Thus, I think this principle is fine, but we're shifting the goal posts a bit.

The value for F# here is primarily in flowing non-nullable annotations into F# code from .NET libraries, and vice-versa, and in allowing the F# programmer to be explicit about the non-nullability of .NET types.

Agreed.

Adding nullability/non-nullability annotations should not be part of routine F# programming (see principle 1)

I agree, but with this caveat. The concept of a reference type in .NET is changing. Yes, at the IL level everything is still basically the same, but this concept is going to flow directly into CoreFX and the BCL. It's a reality to deal with.

Now I will say that given the nature of non-null by default for F# constructs, there will be considerably less annotations in F# code than in C# code. But at the boundary layers of F# code that must interact with C# and .NET, I would expect annotations to come into play.

Nullability annotations/information should not appear in types shown in output in routine F# programming (see principle 1)

I would definitely expect to see nullability annotations in type signatures that appear in tools and in FSI. These should reflect the source code in the text editor. Otherwise, I think I agree, but I'm not sure what you mean.

Are you referring to this, or something else?

F# users should primarily only experience/see this feature when interoperating with .NET libraries (the latter should be rarely needed)

Yes. I think that this doesn't change the way people would write F# "kernel" code in a system today; that is, using DUs and Records to hold data.

The feature is backwards compatible, all existing F# projects compile without warning by default

Agreed. Wherever this lands (say F# 4.6), the default for new projects is to have this be on, and existing projects build with an older F# version will have it be off. I would assume that we adopt the same rules that C# does:

From The C# proposal: In summary, you need to be able to opt in/out of:

  • Nullable warnings
  • Non-null warnings
  • Warnings from annotations in other files

Anything less on our end would be untenable.

The feature should produce warnings only, not hard-errors

Yes. I would refuse to have this implemented if it were enforced at an error level.

Regarding your example on interfaces:

Are those non-nullable strings or not?

My understanding is that if the library is compiled with C# 8, then yes these are non-nullable strings. That is, a string and an assembly-level attribute is emitted by the compiler. Languages that respect this attribute would then be responsible for interpreting the string types as non-nullable. But languages that do not respect the attribute will just see nullable strings.

In the interface implementation examples you give, yes this would be a warning, and I think it should be on by default for new F# projects, just like it would for C#.

Regarding flow analysis:

C# will "give up" in some places. @jaredpar could comment on some examples.

I agree that there will be some pressure to try and add more as annoying little edge cases come up. But this is something we can do progressively.

Regarding the tension between this and F# options, and having one way to do things

I think the ship has already sailed on this front. We already have "typesafe null" with options and struct options, "implicit null" with reference types, null itself, and Nullable<'T> from the BCL. You could also argue that Choice and Result are somewhat similar in nature.

What this does is it changes reference types from "implicitly null" to "says if it can be null or not".

The right way to show this in syntax and how to rationalize its usage when compared with Option could be tricky, as there seems to be a longstanding idiom of the following:

let tryGetValue (s: string) =
    match s with
    | null -> None
    | _ -> Some s

I would argue that this is inferior to nullability annotations, because we're conflating null with "conceptually nothing", when null is a specialized consideration when programming for .NET. Making that leap as a new F# programmer isn't too difficult, but it's still a bit to learn, and I remember thinking "this sort of sucks" when learning. Now in F#-only code, options are wonderful, and when learning F# it became quite clear that I could evict null from my worries so long as I was behind the boundaries of my system.

To that end, I'm not convinced that there's too much tension here, certainly not if this is documented well.

cartermp commented 6 years ago

Another consideration, perhaps even a principle, is that this feature would also need to work well with Fable. I believe this is not particularly difficult, as TypeScript's nullable types offer a view into how that works for web programming, and nullable types as-proposed in C# are modeled after them a bit, especially the opt-in nature of them.

Speaking of how things are done in TypeScript, this is the syntax:

function f(s: string | null): string { ... }

Guarding against null with a union is fantastic syntax.

Note that they use ! at the end of an identifier to say, "ignore the fact that this may be null". I think a similar mechanism is a decent idea.

cartermp commented 6 years ago

Just to summarize how some other languages are approaching this:

Swift - Optional

Optionals in Swift are variables that can be nil. These are not the same as F# options.

There are two forms of their syntax:

func short(item: Int?) { ... }
func long(item: Optional<Int>) { ... }

To use the underlying value, you must use any of the following syntax:

Using control structures to conditionally bind the wrapped value:

let x: Int? = 42

// 'if let', but could by 'guard let' or 'switch'
if let item = x {
    print("Got it!")
} else {
    print("Didn't get it!")
}

Using ?. to safely dot into an item:

if myDictionary["key"]?.MyProperty == valueToCompare {
    ...
} else {
    ...
}

Using ?? to provide a default value if the item is nil:

let item = myDictionary["key"] ?? 42

Unconditional unwrapping (unsafe):

let number = Int("42")!
let item = myDictionary["key"]!MyProperty

Additionally, later versions of Swift/Obj-C added nullability annotations, because anything coming from Obj-C would automatically be annotated as identifier!; i.e., an implicitly unwrapped optional. Luckily, with C# 8, it doesn't appear that this problem is applicable.

Overall, this kludges together nullability and optional as we know it in F#.

Kotlin - Null Safety

In Kotlin, the type system distinguishes references that can be null and those that cannot. These are denoted with ?:

var a: String = "hello"
a = null // Compile error

var b: String? = "hello"
b = null // OK

Accessing a value from a type that is nullable is a compiler error:

val s: String? = "hello"
s.length // Error

Through flow analysis, you can access values from nullable types once you've checked that it's non-null:

val s: String? = "hello"
if (s != null && b.length > 0) { // Note that this is legal after the null check
    print("It's this long: ${b.length}
}

However, flow analysis does not account for any arbitrary function that could determine nullability:

fun isStrNonNull(str: String?): Boolean {
    return str != null
}

fun doStuff(str: String) {
    println(str)
}

fun main(args: Array<String>) {
    val str: String? = "hello"

    if (isStrNonNull(str)) {
        doStuff(str) // ERROR: Inferred type is String? but String was expected
    } else {
        println("womp womp")
    }
}

You can also use ?. to access values from a potentially null reference:

val l = str.?length // Type is Int?, returns 'null' if 'str' is null

You can also use the elvis operator ?: to conditionally assign another value if the reference is null and you don't want the resulting type to also be nullable:

val l = null?.length ?: -1 // Type is Int

You can also use !! to assert that a reference is non-null to access things in an unsafe manner:

val l = str!!.length

This will throw a NullPointerException if str is null.

You can also safely case a nulable reference to avoid a ClassCastException:

val x: Int? = a as? Int

Here, x is null if the cast is unsuccessful.

This is very close to how C# is approaching the problem, with the major difference being that Kotlin emits errors instead of warnings.

TypeScript - Nullable Types

In TypeScript, there are two sepcial types: null and undefined. By default, they are assignable to anything. However, when compiled with --strictNullChecks, TypeScript code will disallow assignment like that:

let s = "hello";
s = null; // ERROR, 'null' is not assignable to 'string'
let sn: string | null = "hello";
sn = null; // OK

sn = undefined; // ERROR, 'undefined' is not assignable to 'string | null'

let snu: string | null | undefined = "hello";
sn = undefined // OK

The idea that a type could be null is handled by a union type. When compiling with this flag, programmers must specify types like this.

To use the value, you need to guard against null:

function yeet(sn: string | null) {
    if (sn == null) {
        ... // handle it
    } else {
        ... // Use 'sn' as if it were a string here
    }
}

However, like in Kotlin, the compiler can't track all possible ways something could or could not be null.

To handle this, you can assert that something is non-null with the ! operator:

function doot(name: string | null) {
    return name!.charAt(0)
}

Because this is an assertion, the programmer is responsible for ensuring the value is non-null. If it is null, fun behavior will occur at runtime.

My opinion

TypeScript handles this the most elegantly, and with all other things held constant, I'd prefer it be done like this in F#. It doesn't feel like Yet Another Special Case to account for, but rather an extension of an already-existing feature, and feels a bit more "baked in" than how other languages handle it.

However, at the minimum, we'd need the ability to specify ad-hoc unions (#538), which means the scope of the feature would be larger. And that doesn't get into the other implications of a syntax like this.

cartermp commented 6 years ago

Another thought: having this be configurable to also be an error independently of making all warnings errors, along the lines of what @0x53A was saying. I could imagine that F# users would want this to be a hard error in their codebases, but not have all warnings also be that way.

alfonsogarciacaro commented 6 years ago

This seems it could be a very nice feature for Fable users too. Just for the record, I'll leave some comments about dealing with null when interacting with JS from F#/Fable:

Another reason to erase options in Fable was to use them to represent TypeScript optional properties. Against, I don't think this would be any redundancy, and we'll probably still use options for this, because missing properties eval to undefined.

TBH, I haven't been involved very much in latest ts2fable development so I'm not sure if it's doing something to represent TS nullable types in F#.

In summary, I'm assuming that anything you do to make interop with C# safer it'll be probably good for JS interop too. Fable interoperability is usually compared with Elm, which uses something called "channels" to make sure you won't get any null-reference error at runtime. Fable makes interop easier and more direct, but less safe in this sense. Having null-checks enforced by the compiler is likely to be very welcome by users who prefer to make the interop safer.

dsyme commented 6 years ago

A basic shift of the platform to non-null-reference-types-as-the-default can, I think, only be good for F#.

I guess I'll admit I'm still sceptical C# and the platform will shift that much. The more it shifts the better of course. However I have used C# lately to write the first version of a code generator and the pull towards using nulls is very, very strong in hundreds of little ways. For example, any use of Json.NET to de-serialize data immediately infects your code with nulls. While this is also true when using Json.NET with F#, there are other options available in the F# community for JSON deserialization which give you strong and much more trustworthy non-nullness. So I'm sceptical C#'s version of non-nullness is going to be "trustworthy", even in the sense that the typical C# programmers feels they can trust the feature in their own code, let alone other people's code. The whole C# language and ecosystem is geared towards making "null" very pervasive. I'm also a little sceptical we will see that many non-nullness annotations appearing in framework libraries, but let's see what they come up with...

Anyway, I suggest this principle:

We don't undermine the "strong" notion of non-nullness in the F# world by somehow overly embracing a "weak" notion of non-nullness flowing in from C#

BTW some examples of where the complexity added by this feature will be in-the-face of users to a worrying degree:

Agreed....the default for new projects is to have this be on, and existing projects build with an older F# version will have it be off.

Note that F# Interactive scripts will now give warnings (because they have no mechanism to opt-in to particular language features or language version level).

dsyme commented 6 years ago

@cartermp I'll make a catalog of proposed design principles in the issue description

0x53A commented 6 years ago

Note that F# Interactive scripts will now give warnings (because they have no mechanism to opt-in to particular language features or language version level).

something similar to #light "off"?

dsyme commented 6 years ago

Nullability annotations/information should not appear in types shown in output in routine F# programming (see principle 1)

I would definitely expect to see nullability annotations in type signatures that appear in tools and in FSI. These should reflect the source code in the text editor. Otherwise, I think I agree, but I'm not sure what you mean. Are you referring to this, or something else?

I think I need to get a feeling for how often string | null (or string? if that's the syntax) would appear in regular F# coding.

For example, the tooltips for LINQ methods are already massive and horrific. It is really, really problematic if we start showing

member Join : source2: (Expression<seq<string | null> | null> | null) * key1:(Expression<Func<(string | null), T | null> | null> | null)) * key2:(Expression<Func<(string | null), T | null> | null> | null)) 

instead of the already-challenging

member Join : source2: Expression<seq<string>> * key1:(Expression<Func<string, T>>)  * key2:(Expression<Func<string, T>>)

just because there are no non-nullness annotations in the C# code. We have to be careful not to blow things up.

We have quite a lot of flexibility available to us in when and how we choose to surface annotation information. For example, in tooltips we already show with operator (+) instead off the full gory details of static member constraints. This is much more usable and has definitely been the right choice.

Anyway we can talk through the details later, however looking at the above example I'm certain we will need to be suppressing nullness annotations in many situations flowing from .NET in some situations.

dsyme commented 6 years ago

something similar to #light "off"?

Yes, scripts should probably really be able to have all of this sort of thing:

#langlevel "5.0"
#define "DEBUG"
#options "/optimize-"
cartermp commented 6 years ago

@dsyme With respect to tooling, I'd rather we plan to show annotations (and build them out as such), and then find ways to reduce them down based on feedback. That is, I'd rather start from a place of showing all information and then reducing down to an understandable notation rather than start from a place of eliding information.

dsyme commented 6 years ago

That is, I'd rather start from a place of showing all information and then reducing down to an understandable notation rather than start from a place of eliding information.

The example above is easily enough to show that we need to elide (and there are much, much worse examples - honestly, under default settings some method signatures would run to a page long). There's absolutely no about that if we are remotely serious about design principle (1) "don't lose simplicity", and we've already walked this territory before. So we can take that as the first round of feedback I think, given I'm a user of the language as well :)

Really, I'm going to be very insistent about taking design principle (1) seriously. Simplicity is a feature, not something you tack on afterwards.

cartermp commented 6 years ago

In that example I would not expect that signature to change, since the default for all reference types in this new world is non-nullable. That is, any existing reference type will not have a changed signature unless we modify it to be as such. AFAIK, this is not the plan for the BCL (except in the small number of cases where it explicitly returns or accepts null), and it shouldn't be for FSharp.Core either.

I agree that signatures in any context could get out of control, and I could see something like this be possible in tooltips to at least help with that:

member Foo : source: (seq<'T>) * predicate: ('T -> 'U-> bool) * item: ('U)

    where 'T, 'U are nullable

Generic parameters:
'T is BlahdeeBlah

Not sure how things in signature files would be made better. That said, I also think there's a slight positive in having it be more tedious and difficult to represent nullable data to reinforce that this is not the default way anymore.

dsyme commented 6 years ago

In that example I would expect that signature to not change, since the default for all reference types in this new world is non-nullable. That is, any existing reference type will not have a changed signature unless we modify it to be as such. AFAIK, this is not the plan for the BCL, and it shouldn't be for FSharp.Core either.

How can that be the case? IQueryable.Join is defined in a .NET assembly. All we see is the .NET metadata, a IL type signature. There is no annotation in that .NET metadata if, for example, targeting .NET 4.7.2 or .NET Core 2.1 or .NET Standard 2.0, and we surely have to assume nullability for both inputs and outputs if there is no annotation - half the Framework accepts null as a valid (and in some cases necessary) value..

I assume some future version of .NET Standard will come with a much more constrained IQueryable.Join, with lots of useful non-nullable annotations, which is fine. Or equivalently there may be an upcoming version with a big fat annotation saying "hey, it's ok to assume non-nullable for this DLL unless we say otherwise!". But even then I suspect an awful lot of methods in framework functionality will list null as a valid input.

Basically for every single reference or variable type X flowing in from .NET assemblies we will surely have to assume "X | null" in the absence of other information. We can't assume "non-nullable" unless there's information saying that's a reasonable assumption. Whether we choose to turn that into warnings and/or display the information is another matter. It could be that for entirely unannotated .NET 4.x/.NET Standard 2.x/.NET Core/C# 7.0 assemblies we do more suppression than for annotated assemblies.

I could see something like this be possible in tooltips to at least help with that: member Foo : source: (seq<'T>) predicate: ('T -> 'U-> bool) item: ('U) where 'T, 'U are nullable

Yes, something like that. Other tricks are

Not sure how things in signature files would be made better.

These would have to be explicit

That said, I also think there's a slight positive in having it be more tedious and difficult to represent nullable data to reinforce that this is not the default way anymore.

Yes, and I'm happier with string | null over string? for this reason

smoothdeveloper commented 6 years ago

How can that be the case? IQueryable.Join is defined in a .NET assembly. All we see is the .NET metadata, a IL type signature.

Maybe the C# implementation will add an assembly level attribute describing if the assembly is compiled with non null ref enabled. If this is the case, presence of this assembly attribute would turn on the extended signature logic.

cartermp commented 6 years ago

Re: BCL

After talking with @terrajobst, the plan is to do something where types are annotated as nullable if the input type is valid or the return type is null. However, there are numerous places where they just throw an exception if an input is null, or they never return null explicitly. Thus, the current plan does not involve blanket annotations for everything.

I'm not sure of the vehicle, but because it's unlikely that we'll be updating older .NET Framework versions, I suspect that some sort of shim will be involved.

But I think the overall goal is to make this feature also "work" on the BCL as much as they can, since it's arguably the most valuable place to have it "enabled". And since you can target older BCL versions with C# 8.0, this would have to come in some way.

cartermp commented 6 years ago

@dsyme Also just to confirm, if we were to try to adopt a TypeScript-like syntax, would that mean that #538 is also a prerequisite to this feature, or would this just be a specialized thing that we expand later?

cartermp commented 6 years ago

More considerations (some from the C# proposal):

Examples:

// ! postfix to assert nullability
let len (s: string | null) =
    if not(String.IsNullOrWhiteSpace s) then
        s!.Length
    else -1

// !! postfix to assert nullability
let len (s: string | null) =
    if not(String.IsNullOrWhiteSpace s) then
        s!!.Length
    else -1

// "member" access, a la java and Scala
let len (s: string | null) =
    if not(String.IsNullOrWhiteSpace s) then
        s.get().Length
    else -1

// Cast required
let len (s: string | null) =
    if not(String.IsNullOrWhiteSpace s) then
        (s :> string).Length
    else -1

Living with [<AllowNullLiteral>] and null type constraint:

Do we want to "alias" the types/type signatures? E.g.

[<AllowNullLiteral>]
type C< ^T when ^T: null>() = class end

Is now this in tooltips:

type C<'T | null> =
    new: unit -> C<'T | null> | null

// As opposed to the current style
// Note that [<AllowNullLiteral>] doesn't surface here
type C<'T (requires 'T: null)> =
    new: unit -> C<'T>

And when instantiated:

let cNull = null
let c = C<string | null>()

Shows:

val cNull: 'a | null
val c : C<string | null>

// As opposed to the current style
// Note that the type constraint is not shown

val cNull : 'a (requires 'a: null)
val c : C<string>

Or find a way to make the nullable syntax be aliased by existing things today, i.e., do the reverse?

My current opinion

jaredpar commented 6 years ago

@smoothdeveloper

Maybe the C# implementation will add an assembly level attribute describing if the assembly is compiled with non null ref enabled.

There will be a way to detect via metadata that an assembly was compiled with C#'s nullable reference type feature enabled.

cartermp commented 6 years ago

Oooof, weirdness with the null constraint. So, given the following:

[<AllowNullLiteral>]
type C() = class end
type D<'T when 'T: null>() = class end

If C were not decorated, it would be a compile error to parameterize D with it. That would make you think that the null constraint on D could be made equivalent to 'T | null, but the compiled output says otherwise:

    [Serializable]
    [CompilationMapping(SourceConstructFlags.ObjectType)]
    public class D<T> where T : class
    {
        public D()
            : this()
        {
        }
    }

As you'd expect, you can parameterize D with any reference type today (unless it is declared in F# and is not decorated with [<AllowNullLiteral>]). That means that in a C# 8.0+ world, both nullable and non-nullable reference types could be used to parameterize D.

However, as per the C# proposal:

The class constraint is non-null. We can consider whether class? should be a valid nullable constraint denoting "nullable reference type".

Ooof - it's a bit mind-warping that in F# syntax we declare something as effectively nullable, but it emits as something that is non-null. If class? were done for C# 8.0, then we'd need to either change how the null constraint in F# compiles, or have it be a specialized F# thing that remains for back compat and have people move towards something like this:

type D<'T | null>() = class end

I don't really see a way forward with this unless we change existing F# behavior somehow. What I'd like to have happen is the following:

However, that loosens requirements around null when compared with F# code today, which doesn't feel right. Alternatively:

I'm already dying on the inside at thinking about how you can use a "null constraint" or a "nullable constraint", though 😢.

cartermp commented 6 years ago

Note that in C# 8.0, default(string) returns a string? today, so I would expect that we do a similar thing for F#, or emit a warning when Unchecked.defaultof<'T> is called period, since this is something that only ever will give a null for a reference type.

cartermp commented 6 years ago

@dsyme and other folks, open RFC is here:

https://github.com/cartermp/fslang-design/blob/nullable/RFCs/FS-1060-nullable-reference-types.md

cartermp commented 6 years ago

Some notes after a chat with @madstorgersen:

zpodlovics commented 6 years ago

Learn from other implementation mistakes (especially for functional code):

"The final irony is that by attempting to discourage nulls, the authors of this class have actually encouraged its use. I'm sure there are a few who will be tempted to simply return null from their functions in order to "avoid creating an unnecessary and expensive Optional reference", rather than using the correct types and combinators."

https://developer.atlassian.com/blog/2015/08/optional-broken/

cartermp commented 6 years ago

@zpodlovics I'm afraid I don't understand the relevance of Java optionals (an unrelated type that is not nullability) to the extensive language work that goes into nullable reference types.

zpodlovics commented 6 years ago

@cartermp Surry for the noise, just wanted to make sure no holes remain in the theory of operation. It has little (~epsilon) relevance - their implementation prevented null usage, and the result was broken for their functional code (List map) because null is a perfectly legitimate reference value there. My concern is that partial solution (earlier C# example: C# compiler sometimes just give up) or enforcement could result surpringly different program executions (similar to the earlier java example).

cartermp commented 6 years ago

Right, well this is also why an attribute to declare something as "handling null" is being proposed and will be respected. Otherwise there are numerous situations that are undecidable and would thus require the use of a non-nullable assertion in code. If you read the RFC you'll find that I've covered this concern and where we can do work in the compiler to ensure that these are an abnormal occurrence in code, not something you need to pepper throughout your source code.

dsyme commented 6 years ago

@dsyme Also just to confirm, if we were to try to adopt a TypeScript-like syntax, would that mean that #538 is also a prerequisite to this feature?

No, it's just syntax, T | null would be separate to #538 but there may be some reuse of syntax machinery.

Introduce ! to assert non-nullability, since C#/TypeScript/Swift already do this, so it's "familiar in that sense

As a predisposition, I'm extremely unwilling to pollute the core dot notation as part of this or any feature. See principle (1) above. Perhaps I'll soften over time, it's just a first "don't you dare do that to my language" reaction :)

Note that from the F# perspective we may not mind if eliminating a possibly-null is a little more syntactically costly, because people largely only need to do it in interop. So initially I would expect the primary form of eliminating a possibly-null condition would be a pattern match match s with null -> ... | s -> ..., then we'd gradually consider less verbose forms, e.g. assertion line nonnull s and so on. Going all the way to s!.P might be necessary but it makes my stomach churn to think of either C# or F# littered with that. And it really depends how many possibly-nulls flow into F# coding from C#-libraries-compiled-with-the-feature.

Also, the identifier ! is massively used in F# already (let! and !x) and my default preference would be not to use it at all as part of this feature (though I'll consider it - just giving predispositions here).

Note that in C# 8.0, default(string) returns a string?, so I would expect that we do a similar thing for F#

We're not copying the C# design here :) (though it's highly relevant). We think everything through from an F# perspective.

In this case we don't have default(..) in F# - we have Unchecked.defaultof<_> in the F# library, which is a different thing because it explicitly says Unchecked. At least some people will consider it weird if we start giving warnings on the use of something that already uses the magic word Unchecked.... Of course using null is different we would expect warnings on that.

dsyme commented 6 years ago

Will take a look at the RFC draft, likely later in the week, thanks!

isaacabraham commented 6 years ago

Perhaps a foolish question here, but is there going to be no way to implicitly go from this "nullable ref value" to "option value" e.g. given a string? in C# / BCL, could that not just map to a string option in F#?

I get that this would effectively be "wrapping" every time e.g. given an optional string in C#, "test", into Some "test" but in effect isn't that the whole point in F# i.e. to use explicit types to help us out?

I'm just super worried that we're going to end up with three ways to represent nullability:

From a consistency point of view, I would sooner just see this mapped into option for F# users - it's a well-understood paradigm, there's already documentation around it all etc. etc.. I don't know that's possible / realistic / desirable though.

jwosty commented 6 years ago

@isaacabraham Assuming I'm not misunderstanding your question, the problem with that is that everything can potentially be null and therefore would implicitly be an Option value. Meaning, this C# code, which just returns a normal .NET string (and not a non-nullable string):

public string GetName() => "Joe Smith"

Would now look like this from F#:

let x: string option = CSharpClass.GetName()

It would so much better if .NET had been designed without implicit nulls in the first place. But hey, it's the environment we're forced to deal with.

However if you're just talking about explicit Nullable values, there's always Option.fromNullable. Although it'd be interesting to see if the compiler could implicitly treat Nullable as Option.

isaacabraham commented 6 years ago

@jwosty yep, that's right. I was thinking more of having implicitly nullable values from "old" C#, but explicitly non-nullable OR option from "new" C#:

public string GetName() => "Joe Smith"

In a .NET assembly compiled with C#7.x, this would render as an implicitly nullable string in F#, the same as today. HOWEVER, if this was compiled using C#8, it would render as a non-nullable string. The F# compiler would prevent null comparisons, null assignments etc. just like with F# Records, DUs and Tuples.

public string? GetName() => "Joe Smith"

(or whatever C#8 optional syntax is).

In this case, this would render as a string option in F#.

cartermp commented 6 years ago

@isaacabraham Unfortunately, that would be a massively breaking change:

Breaking code where we handle null today

Consider some F# code in the VS tooling

Until the VSSDK is recompiled with C# 8.0 and all reference types it returns are non-nullable, then every reference type must still be assumed to be nullable, because the VSSDK absolutely does return null as a value for things.

I count at least 10 values that are reference types today that would convert into F# option types, thus 10+ more pattern matches are required, which would then probably necessitate a maybe CE just to keep things sane.

Additionally, some of the null checks would have to be removed, since they are now treated as if they are a 'T option, and F# options do not have null as a proper value.

These two changes would require the entire method to be rewritten rather than selectively modified.

Breaking code where we emit an option

Consider the following public API:

type C() =
    member __.M(doot) = if doot then Some "doot" else None

Today, this emits as:

[assembly: FSharpInterfaceDataVersion(2, 0, 0)]
[assembly: AssemblyVersion("0.0.0.0")]
[CompilationMapping(SourceConstructFlags.Module)]
public static class _
{
    [Serializable]
    [CompilationMapping(SourceConstructFlags.ObjectType)]
    public class C
    {
        public C()
        {
            ((object)this)..ctor();
        }

        public FSharpOption<string> M(bool doot)
        {
            if (doot)
            {
                return FSharpOption<string>.Some("doot");
            }
            return null;
        }
    }
}

C# code today must check for null, which is in line with this being equivalent to a nullable reference type. However, C# code today sees the result as an FSharpOption<string>, which is very much different from string. So to be consistent, the C# compiler would have to now interpret FSharpOption<string> as if it were a string (note: non-nullable string), and the type of M as a string?. This would be a breaking change for C# code that upgrades to C# 8.0 and turns on the nullability feature.

zpodlovics commented 6 years ago

@cartermp This is exactly what I am talked about here: https://github.com/fsharp/fslang-suggestions/issues/577#issuecomment-402226360

null is a prefectly legal habitat for nullable types and should be allowed as an optional type value, otherwise the same mistake will be made as java did...

zpodlovics commented 6 years ago

@isaacabraham Why a value type Nullable<'T> should be replaced with a heap allocated reference type in F#? Why would anybody want to entertain the garbage collector? It's far better to leave as-is and introduce the nullable concept in F# independently from optional (option and valueoption types) otherwise the interop story will be non-existant.

How the different combinations supposed to work? Eg.: optional bool, optional nullable bool, nullable optional bool, valueoptional bool, valueoptional nullable bool, nullable valueoptional bool?

isaacabraham commented 6 years ago

@zpodlovics I never suggested changing Nullable to Option - please read what I said again (besides, there is already the possibility to create a struct version of Option which would alleviate the concern you have (and it'll be included in FSharp.Core at some point)). I simply commented that adding a third type of "nullable" data makes things more complicated, and suggested that there might be a way to avoid that and to simply stick with Nullable and Option.

Regarding null though - no, we don't want to keep null in as some optional value type. It's a mistake to have ever had it implicitly allowed into all reference types in .NET, and the sooner we rid ourselves of it the better off we'll be.

@cartermp In my fantasy world of F#5 (or whatever) - C#7-based code would render as today i.e. a nullable string. At some point in the future, that code that F# references would be upgraded, and either the data would render as string (in which case a 1:1 mapping with what you have now except the value can't be null), or string option because the data is nullable (and therefore should be matched over). In some ways this is not so different from how Kotlin handles things (I believe):

Going the other way might be more problematic though!

At the least, we should have an easy way to go from whatever we call these types into Options (and back out again) - the same way Nullables can be treated as Options - so we can have a unified model for dealing with nullability.

jwosty commented 6 years ago

@isaacabraham In a perfect world, yes, that would be awesome. .NET ideally never should have had implicit nulls, but it can never be undone without majorly breaking changes, even from the language level. Old F# code that compiled would suddenly not compile anymore if referencing C# code compiled with the old compiler. E.g., take this combination which currently compiles:

// C# 7
public class CSharpClass {
    public static string GetString() => "foobar";
}
// F# 4.1
printfn "%s" (CSharpClass.GetString())

If, say, F# 5 interpreted types that aren't marked CLR non-nullable as Option (or Nullable), upgrading just the F# snippet would break this code. You'd have to recompile the C# with the new compiler. I don't think it's possible to do this in a non-breaking way unfortunately.

isaacabraham commented 6 years ago

@jwosty taken from above:

Data which we don't know about (C#7 and below -> maps to a nullable string, exactly what we have today)

In other words, in the absence of any "nullable" marker / attribute that would be emitted by the C# compiler, we would treat things exactly as today. There would be no change when working with code that was compiled with C#7.

Only when that C# assembly was rebuilt with C#8 would things change from the F#5 perspective:

  1. Either the value was marked as non-nullable, in which case F# would prevent us checking against / assigning to null.
  2. Or the value was marked as nullable, in which case F# would surface it as an option.
zpodlovics commented 6 years ago

@isaacabraham I may misunderstood what you said - whenever I saw Option type I always associate to the F# option type and not both F# option and F# valueoption/voption type (which is not yet in the core).

Let assume I have a array of nullable int32. What type the List.tryFind should return? Let assume I have a list of nullable string. What type the List.tryFind should return? null is a prefectly legal value for for option value here. What will be the difference between having (finding) a null value in the array and not founding with tryFind? How the functional operation eg.: map supposed work if not even possible to represent every type habitant? Operation that supposed to have the same number of elements suddenly returns different number of elements?

However a list of non-nullable int32 and a list of non-nullable string does not have null as a habitant, so non-nullable type option type should not allowed to have null as a value. By definition option type must able to represent every type habitant as value.

Merging different concepts because it looks somewhat familiar, and it's easy to do will make thing even more complicated.

zpodlovics commented 6 years ago

Update: It looks like I missed the memo about the nullable value types implementation. However I still feel a bit surprised with the implemented behaviour.

The following examples was tried with 5.12.0.301-0xamarin8+ubuntu1604b1 and 4.1.33-0xamarin7+ubuntu1604b1:

type [<Struct>] S<'T> =
  val Value: 'T
  new(v) = { Value = v}

let a = Array.zeroCreate<S<System.Nullable<System.Int32>>> 10
let found1 = Array.tryFind (fun i -> i = S<_>(System.Nullable<System.Int32> 10) ) a
let found2 = Array.tryFind (fun i -> i = Unchecked.defaultof<S<System.Nullable<System.Int32>>>) a

What will be the difference between not founding with tryFind and having (finding) a null value in the array?

val found1 : S<System.Nullable<System.Int32>> option = None
val found2 : S<System.Nullable<System.Int32>> option = None

This is probably the best comment I have seen in the review:

"Design a fully sound (apart from runtime casts/reflection, akin to F# units of measure) and fully checked non-nullness system for F# (which gives errors rather than warnings), then interoperate with C# (giving warnings around the edges and for interop)"

cartermp commented 6 years ago

@zpodlovics I'm not sure I follow.

zpodlovics commented 6 years ago

@cartermp Thanks for the update. It looks like I missed the memo about the nullable value types implementation. The default behaviour was too surprising to me so I assumed its not yet or not fully implemented. Later I found the custom operators and the test cases...

matthid commented 6 years ago

I just read through the suggestion and the discussion here and I feel like nobody is really too happy with any solution. Therefore let me just try to suggest a bit of craziness (probably too hard to implement but maybe some food for the discussion).

What if we don't have a "real" option type but we make the feature "look" like it is similar to an option:

type Optionish<'a> =
   | null
   | Ref of a

Note that this Optionish type will not exist anywhere and there is no way to define it yourself.

I'm not sure about the name (we probably would choose another one), but I guess it has to be different from Option. It would behave basically very similar to what @cartermp specified as 'a | null, so you can directly use 'a members on variables and use it in places where 'a is expected (with the new compiler warning), but you can use it in pattern matching and you can use it in a way that feels more natural (and without warnings):

match v with
| null -> ...
| Ref i -> ...

This basically should "feel" like using Option.

Without thinking this through too much, but we might be able to somehow overload Some and None in pattern matching on Optionish types, but I guess that will make type inference a bit more difficult.

It basically is quite similar to string | null but it hopefully feels a lot less "hacked" in.

Could this work out or do I miss something here?

Edit: It's basically a "unsafe" version of "option" where you can do everything you want and only get warnings instead of errors and the feature would disappear on compilation (besides the attributes).

cartermp commented 6 years ago

@matthid one of the biggest reasons for this feature is to warn on existing code that isn't null safe today.

For example:

let len (s: string) = s.Length

...

let value = getStringValueSomewhere() // Could by null

printfn "%d" (len value)

As you're well aware (and may have been bitten by in the past!), this code could easily throw an NRE.

With the feature as-proposed in the RFC, len value would emit a warning because value is possibly null, which I believe we can agree is a useful thing.

For this to also happen with Optionish<'T> (love the name 😄), would this be implicitly convertible between reference types? Would it be possible to realize this with the following null-safe code?

let xn = SomeNullableReferenceType()

match xn with
| null -> // handle null
| x -> x.Something()

i.e., can I use the same syntax throughout, implicitly convert, or something else?

FWIW I would much rather find a way to make things be based on Option or option-ish things, but I haven't thought of a way for that to work yet.