Handling of non-nullable fields for classes with "Init" method

dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.

https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/

MIT License

18.93k stars 4.02k forks source link

Handling of non-nullable fields for classes with "Init" method #39090

Open jacekbe opened 4 years ago

jacekbe commented 4 years ago

I have a bunch of related classes that are constructed in 2 steps: 1) Object is instantiated, constructor is called 2) Init() method is called

The problem is that some fields are initialized in Init() method and are guaranteed to be non-null afterwards. However, the compiler complains that these fields are not initialized so I'm forced to either mark them as nullable (which I don't like) or do some tricks like explicitly initializing them with null! in the constructor to silence the compiler. Have you considered allowing to annotate certain methods as "init methods"? The effect would be that non-null fields/properties would have to be initialized either in the constructor or in init method(s). In cases when Init is in fact part of the object construction no other method should be called in between (maybe apart from setting some properties). This means it's usually safe to leave such fields/properties as not-null and once Init method is called it's safe to access them.

ronnygunawan commented 4 years ago

For silencing the compiler part, you can do this in 8.0:

class Foo
{
    [NotNull]
    public string? Text { get; set; }

    public Foo()
    {
    }

    public Init()
    {
        Text = "";
    }
}

CyrusNajmabadi commented 4 years ago

and once Init method is called it's safe to access them.

How does the compiler know that Init has been called? Imagine passing around some random Foo type. How would it know if that instance had had init called on it?

This is different from constructors. With constructors you know it must have been called. Because that's the only way to get a non-null instance in the first place.

jacekbe commented 4 years ago

How does the compiler know that Init has been called? Imagine passing around some random Foo type. How would it know if that instance had had init called on it?

I do not expect that compiler will guarantee this. If we have a class where calling Init after object instantiation is part of the contract then developer is responsible for guaranteeing that no other method is called before object is fully initialized. Calling some other method before Init would be considered misuse of the API. So in this regard noting changes - there would be no language-level measures that prevent people from such misuse. What I'm trying to achieve is that when API is used correctly then after full initialization of the object non-null fields are guaranteed to be non-null with no special actions required apart from annotating Init method.

Joe4evr commented 4 years ago

Related: dotnet/csharplang#2328 (by the way, this is something that's more suited to be at the language repo than the compiler repo 😉)

A solution for both paradigms can theoretically be reached. Given a class:

public class Foo
{
    [LateInitialize] // exact name TBD
    public string Text { get; set; }
}

the linked proposal is to have the compiler prevent escaping an instance of new Foo() if it doesn't see the Text property being assigned a value (note that the compiler already does this analysis with structs). This also addresses @CyrusNajmabadi's concern since the place of creation is the exact scope this should be done. This particular approach can be extended to this proposal in a similar fashion:

public class Foo
{
    public string Text { get; set; }

    [LateInitialize]
    public void Init()
    {
        //....
    }
}

The compiler could detect this pattern and just like the property case, not allow an instance of new Foo() to escape before the annotated method is called.

jacekbe commented 4 years ago

Yes, you are right about wrong repo - it should belong to the csharplang. Long time ago these were combined and I wasn't aware/forgot about the split.

Thanks for the link. I will read it tomorrow (now it's late where I live) to see if it applies to my case. Unfortunately my pattern is a bit complicated. It's more like: 1) Create an object prototype 2) Depending on program input - clone and set properties (might be many combinations and therefore many clones of such object) 3) Init 4) Do proper processing

The thing is that Initis called in a different place and the caller provides additional information to initialize remaining fields - that's why it's not really possible to have all initialization logic in constructor or even in method when object is constructed.

jnm2 commented 4 years ago

This is coming up frequently enough to be annoying across every codebase (open-source and proprietary) that I've used NRTs with so far. I've been doing = null!.

Flow control could in theory prove that the constructor was calling a method which unconditionally initializes the remaining fields. No new constructs like [LateInitialize] should be required.