golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.72k stars 17.5k forks source link

proposal: Go 2: nillability annotations #49202

Closed YairHalberstadt closed 2 years ago

YairHalberstadt commented 2 years ago

Background

By far the most common cause for panics (for me at least) is nil pointer dereferences - the billion dollar mistake.

This problem isn't unique to golang, but a number of languages are designed from the ground up to avoid it - FP languages for example, as well as rust. They use discriminated unions to mark which objects can be nil, and which can't. However these solutions are difficult to retrofit on a language that's more than a decade old.

C# took a different approach - using annotations to mark which references could be null, and warning if you ever dereferenced a nullable reference. Flow analysis could promote a nullable reference to a non-nullable one. See here for more details.

This approach was an enormous amount of work for the C# team, but extraordinarily successful. Whilst there are a long tail of cases where the flow analysis isn't quite sophisticated enough to realize a reference can't be null, or flow analysis holes where the compiler thinks a null reference is non-null, overall NullReferenceExceptions have basically become practically non-existent.

It's also had another useful advantage - developers can now safely use null as a marker type, without having to document first that the method can return null, and hoping that the developer actually reads the docs.

I think it would be valuable for go to consider a similar approach.

Example

I would suggest a ? prefix before a type to indicate that the type allows nil as a valid value.

var str1 ?*string
_ = str1[0] //warning - nillable pointer

if someCondition {
    str1 = &"hello world"
    _ = str1[0] // no warning - str1 is not nil here
}

var str2 *string = str1 // warning - cannot assign nillable pointer to not nil pointer

if str1 != nil {
    str2 = str1 // no warning - str1 is not nil here
}

Syntax

Type      = TypeName | TypeLit | "(" Type ")" | "?" Type.
TypeName  = identifier | QualifiedIdent .
TypeLit   = ArrayType | StructType | PointerType | FunctionType | InterfaceType |
        SliceType | MapType | ChannelType .

It would be illegal for a type to have multiple "?" annotations, but the subparts might all be annotated separately. E.g. ?*?map[?interface{}]?func(). This would be a nillable pointer to a nillable map from nillable interfaces to nillable funcs.

Semantics

Flow analysis is used to determine at all points whether a storage location is nillable or not. Nillability is updated when a location is assigned to, or when a location is checked to see if it's nil.

A warning will occur when a storage location which is currently nillable is

  1. dereferenced
  2. assigned to a variable marked as not nil
  3. passed as a parameter to a func if the parameter is marked as not nil
  4. appended to a slice where the slice element type is not nil
  5. used as key/value in a map where the key type/value type is marked as not nil
  6. used as a receiver of a func if the reciever is marked as not nil etc.

It will be possible to suppress such warnings via some syntax, possibly by appending ! to the expression like C#.

When initializing a struct it would be required to initialize all non-nillable fields. For example:

type MyStruct struct {
    *string Field1
    ?*string Field2
    string Field3
}

_ = MyStruct{} // warning: Field1 is nil
str := "Hello World"
_ = MyStruct{Field1: &str1} // no warning

Slices

Slices made with make appear to have a type hole:

slice := make([]*string, 5)
_ = *slice[0] // panic: nil pointer dereference

As a result it should be illegal to use make to create a slice with non-0 length of a non-nillable type. This should be perfectly acceptable as there is a trivial workaround - set the capacity to the desired total size, and then append the elements one by one.

Error Handling

When a method returns a non-nil error, any other return values which are usually non-nillable would not be required to be nil.

As a result, accessing such a value before checking if the err is nil, will lead to a warning.

E.g.

func getString(succeed bool) (*string, error) {
    if succeed {
        str := "Hello World"
        return &str, nil
    }
    return nil, errors.New("Failed") // no warning
}

str, err := getString()
_ = *str // warning: nil dereference
if err != nil {
    return err
}
_ = *str // no warning

Breaking changes

This proposal requires adding warnings in places where there previously weren't any. On the other hand it should not change semantics of any existing code, just add extra diagnostics.

C#s solution to this was to gate the feature behind a compiler switch, which could be enabled at the level of a line, a file, a project, etc. In C# 9.0 this was off by default, and in C# 10 it was enabled by default, giving some time to migrate. By now most maintained projects have this feature enabled, at least for some code.

Go could go for a similar solution here, but I would fully understand if this is too unpalatable.

Template Questions

Would you consider yourself a novice, intermediate, or experienced Go programmer? intermediate What other languages do you have experience with? Significant work experience with C#, Scala, Python. Passing familiarity with a number of others. Would this change make Go easier or harder to learn, and why? It would add an extra concept to learn, but reduce the number of panics encountered by beginners, which will make the learning experience smoother and less frustrating. Overall a bit of a wash. Has this idea, or one like it, been proposed before? A brief search couldn't find anything similar. Who does this proposal help, and why? This should help all users by reducing a common kind of bug. What is the cost of this proposal? This has a very high cost in compiler work. For C# this was the second most expensive feature ever added, after generics. Go is a simpler language, so it should be less expensive, but still a lot of work. It also requires updating all existing code to get the full benefits of this proposal. What is the compile time cost? This requires a flow analysis pass which is the main cost. It's not insignificant, but is a very well understood problem, with lots of algorithms which can do this efficiently as the analysis can be conservative. What is the run time cost? None Can you describe a possible implementation? C#/roslyn would be the best example too look at for possible implementations Is the goal of this change a performance improvement? no Does this affect error handling? Yes, as described above. If so, how does this differ from previous error handling proposals? No connection Is this about generics? No

Other

I have done significant work on nullable reference types in the C# compiler (roslyn) and am friendly with most of the team. If you are interested in looking at roslyn as prior art, I can help or put you in touch with members of the team.

seankhliao commented 2 years ago

Please fill out https://github.com/golang/proposal/blob/master/go2-language-changes.md when proposing language changes

YairHalberstadt commented 2 years ago

Thanks!

I was looking for something like that but couldn't find it. Perhaps needs some SEO :-)

YairHalberstadt commented 2 years ago

Updated the proposal with the Template Questions

aarzilli commented 2 years ago

Non-nilable types in go have been discussed many times. They have always been rejected because there are too many places in the language where zero values are used. For example:

pcostanza commented 2 years ago

In Go, nil is a lot less problematic than in other languages, because it doesn't affect method dispatch:

func (v *someType) foo() {
   if v == nil {
      // do something safe
   }
   ...
}
AndrewHarrisSPU commented 2 years ago

If, hypothetically (1) generics set expectations for a native nullable construct (just like a Nullable[T] sort of thing - this is fairly straightforward w/ generics), and (2) setting aside cultural questions - I'd be curious to know about performance potential, how this played out in C#.

YairHalberstadt commented 2 years ago

I'd be curious to know about performance potential, how this played out in C#.

I'm not sure what you're asking here?

AndrewHarrisSPU commented 2 years ago

I think on two tracks - first, compile-time performance is really important to Go, how expensive is flow analysis? Second, there are places where the semantics of nullable types can theoretically enable some optimizations that are quite a bit smarter than writing the equivalent with e.g. checking for null pointers, or checking if a map key is present. I'm really impressed with Roslyn as far as I've used it, but I don't have a lot of depth - is this something that happens with C# code very naturally?

I'm just in the peanut gallery here, purely just asking out of curiosity.

YairHalberstadt commented 2 years ago

@aarzilli

Non-nilable types in go have been discussed many times.

I assumed there must have been, but I couldn't find any. I hope this proposal at least added something new, and I apologize if it didn't.

They have always been rejected because there are too many places in the language where zero values are used.

In C# there are also numerous type holes, and are a large part of the reason that the feature was punted off time and time again. However one of the great learning points has been that mostly that doesn't matter. 98% of the time, the feature saves you on common bugs, and 2% of the time you have to continue being extra careful, just like you always have been. Solving 98% of the billion dollar mistake is still a whole lotta cash.

What does this program do https://play.golang.org/p/PnAXe5Js9ay

I think this is a perfect example of what I mean. Yes it's technically a type hole, but it's not something people will do in normal code. Don't let perfect be the enemy of good.

How do maps to non-nilable types work https://play.golang.org/p/wEPGPmnBnuo

Much like error types, it will be non-nillable if the user explicitly checks ok, nillable otherwise.

We could decide that in most cases where people index into a map and only extract the value not the ok, that's because they know the key is in there and we can assume it's non-nillable, but I think that's probably not correct.

How do channels to non-nilable types work

I'm not sure specifically of what example you're asking about here.

How do type assertions to non-nilable types work

Any time you make a type assertion you're telling the compiler that you know better than it what the type is. So if you tell the compiler that this is non-nillable, it will believe you. Whether the compiler inserts a check and panics if it is nil is a seperate question - C# decided not to go down this route, on the basis that moving to NullReferenceTypes was enough work as it is, that at the very least they could ensure it made no difference at all to codegen, meaning you don't have to test your code still works, only fix all compiler warnings.

What happens if the first return value of a call to a func()(*string, error) is assigned to an interface before the error check? What is the type of the interface and is it different from the type it would get after the error check?

I assume you're asking about something like this:

func MyFunc() (*MyType, error){}

myType, err := MyFunc()
var myInt MyInterface = myType //warning if the implementation of MyInterface for MyType takes a non nillable reciever

Any warnings you would get you've already got when you assigned myType to myInt, so what you do after that wouldn't matter.

However we can mend the question to something like this.

func MyFunc() (*string, error){}

str, err := MyFunc()
strCopy := str
_ = *strCopy // warning dereference of possible nil dereference
if err != nil {
    return err
}
_ = *str // no warning
_ = *strCopy // do we warn

The flow analysis is conservative and is only so sophisticated. Either it's able to spot that or it isn't. In C# it isn't, but all that means is that if you don't want a compiler warning, and don't want to supress the warning, then you need to refactor your code slightly (which you probably should do anyway!).

Overall my point is:

Sure there's areas where this imperfect. But nothing you've shown me here is worse than the situation in C# where this doesn't have any warning:

var strings = new string[10];
_ = string[0]; // throws NullReferenceException, no warning

or this

struct MyStruct {
    public string Field;
}
_ = new MyStruct().Field.Length; // throws NullReferenceException, no warning

and yet NullableReferenceTypes solve enough problems they've still been tremendously useful and successful!

ianlancetaylor commented 2 years ago

At first glance this looks a lot like #30177.

YairHalberstadt commented 2 years ago

@ianlancetaylor I think the main difference is #30177 tries to make guarantees about the status of non-nillable types, changing code Gen in the process. This is not practical for a more than 10 year old language.

This proposal is purely about adding diagnostics to try and warn when a value is nil. It doesn't try to be 100% accurate. It only needs language changes in order to allow the non-nil annotation - in theory the diagnostics could be implemented purely externally to the compiler.

It also draws on the experience of C# and Roslyn to hello answer a lot of the thorny questions in this domain.

YairHalberstadt commented 2 years ago

@pcostanza

In Go, nil is a lot less problematic than in other languages, because it doesn't affect method dispatch:

func (v *someType) foo() {
   if v == nil {
      // do something safe
   }
}

In any language you could always check for nil before you did something illegal with a value. in practice people don't. Now you're right that method dispatch in particular can have a nil receiver some of the time (i.e. if the receiver is a pointer), but what tends to happen is if you have a value which you assume isn't nil and it is it will eventually blow up. If anything this particular issue makes the situation worse, because things will blow up later, further away from the original source of the issue.

DeedleFake commented 2 years ago

@pcostanza

That goes a long way towards helping with the problem, and it's definitely less problematic in Go than it is in, say, Java, but there are still similar cases where it's an issue:

Most other usages of nil have sane defaults:

@YairHalberstadt

The Go compiler never produces warnings. It either produces an error and fails or it works. There are currently no exceptions to that. Everything that falls too squarely into the realm of warnings gets put in go vet, but there's no way that language changes will be made just for the purpose of helping a heuristic in a linter. I think your best bet is a combination of a third-party tool and magic comments, not something in the main Go toolchain itself.

All of that being said, I think that the compatibility problem could be alleviated somewhat if the changes are inverted so that things are marked as non-nillable, rather than the other way around, thus leaving existing code as is.

A bit of a tangent here, but I'd also like to beg to differ about Rust having avoided the null problem. It's true that Option<T> makes things a bit more explicit, but I've seen a lot of code that just spams unwrap() everywhere and you're right back to square one. Just because it isn't called null or nil or something else doesn't mean that it isn't. A lot of languages that I've seen have similar problems, such as Kotlin with !!. I do think that the existence of ?. and ?: help a lot, though. I tend to cut Kotlin some slack on its oddities because of the position that it's in with Java, too.

urandom commented 2 years ago

In Go, nil is a lot less problematic than in other languages, because it doesn't affect method dispatch:

func (v *someType) foo() {
   if v == nil {
      // do something safe
   }
   ...
}

That's nice and all, but the do something safe path is almost never available, which is why most code does not do this, including the stdlib:

https://play.golang.org/p/Q_wU4fMOp7G

beoran commented 2 years ago

Go already has non nullable types. Just don't use pointers, but prefer value types in stead.

https://play.golang.org/p/JRE-bsK1GMt

The real problem is not the fact that there are null pointers, but that pointers are used too often. Unlike many languages, Go language makes value semantics feasible.

urandom commented 2 years ago

@beoran there are legitimate reasons to use pointers, while still having nil values be undesirable. waiving that away by saying "use values" isn't really helpful.

ianlancetaylor commented 2 years ago

As far as we can tell this proposal is not backward compatible. Today many types are nillable. Adding a notation to mark which types are nillable doesn't seem possible today. That said, perhaps we could instead add an explicit marker for types that are not nillable.

As mentioned above, there are issues with zero values. Go relies on zero values in many ordinary cases, such as declaring variables, calling make with a slice element type that is or contains non-nillable types, append on such a slice type. Banning all uses of these ordinary operations for non-nillable types would be a major change to the language.

It may be troublesome to add flow control specifications to the language specification. And it would have to be added to the spec; we don't want different compilers to give different nilability errors, as that would be mean that Go programs would build with some compilers but not others.

ianlancetaylor commented 2 years ago

Based on the discussion above, this is a likely decline. Leaving open for four weeks for final comments.

ianlancetaylor commented 2 years ago

No further comments.