dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.01k stars 4.03k forks source link

Roslyn can't decide whether variable is nullable or not in switch guard clauses #37410

Open TessenR opened 5 years ago

TessenR commented 5 years ago

Steps to Reproduce:

#nullable enable
public class C {
    void M(object obj)
    {
        string? s = "";

        switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x)
                        && x.ToString() != "" // CS8602: Dereference of a possibly null reference.
                        && (x = null) != "":  // CS8600: Converting null literal or possible null value to non-nullable type.
            case ID _:  // <-- change this pattern to IC to get another set of warnings
                x = null; // CS8600: Converting null literal or possible null value to non-nullable type.

                break;
        }
    }

    bool MA(out string? s) { s = ""; return false; }
    bool MB<T>(T t, out T o) { o = t; return false; }
}

interface IA { }
interface IB { }
interface IC { }
interface ID : IA { }

interface I<T> { }

Expected Behavior: x is either of type string? or string but not both.

Actual Behavior: Roslyn behaves as if the variable is both string? and string in the guard clause disallowing both its dereferences and its assignments with nulls.

Notes You can also get interesting combinations of warnings with generics:

#nullable enable
public class C {
    void M(object obj)
    {
        string? s = "";
        switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x)
                && (x = new D<string>()) != null   // CS8619: Nullability of reference types in value of type 'D<string>' doesn't match target type 'I<string?>'.
                && (x = new D<string?>()) != null: // CS8619: Nullability of reference types in value of type 'D<string?>' doesn't match target type 'I<string>'.
            case IC _:
                x = new D<string>();
                x = new D<string?>(); // CS8619: Nullability of reference types in value of type 'D<string?>' doesn't match target type 'I<string>'.
                break;

        }
    }

    bool MA(out string? s) { s = ""; return false; }
    bool MB<T>(T t, out I<T> o) { o = new D<T>(); return false; }
}

interface IA { }
interface IB { }
interface IC : IA { }

interface I<T> { }
class D<T> : I<T> { }

The root cause for such behavior seems to be that these guard clauses are processed multiple times as parts of the decision directed acyclic graphs of the switch statements and variables s has different states over multiple passes. According to this the analysis behavior actually makes sense since the guard clause might be visited with s being non-null and x inferred to non-nullable string in one pass and with s being null and x inferred to string? in another pass.

However NullableWalker doesn't just produce warnings but also uses NullabilityRewriter to update types of every expression with inferred nullability. Currently it updates usages of x in the guard clauses to some types with some nullability that was observed during the last pass. This behavior seems unreliable if Roslyn will ever request the types of these expressions for any purpose. Is this intended? What should be the inferred type of variables x in the examples above?

/cc @jcouv

Version Used:

Branch master (21 Jun 2019)
Latest commit 898bed by Heejae Chang:
added NFW to get some data on incremental parsing bug where source si� (#36620)

* added NFW to get some data on incremental parsing bug where source size and tree size is different

* more comments
jcouv commented 5 years ago

Tagging @gafter @cston Chuck reminded me that when clauses only apply to the specific branch. So I would expect var x to always be inferred to string!.

gafter commented 5 years ago

@jcouv @cston s is assigned null in the when clause of the previous switch block, which is executed first. So it is actually and definitely null in the case where x is declared.

TessenR commented 5 years ago

@jcouv @gafter IA and IB are not related so the previous guard clause might or might not be executed before the guard clause of IB. This is actually reflected in the switch statement's decision dag which contains multiple nodes which represent the guard clause of IB. In this case the dag contains two guard clauses of IB, one of which has guard clause of IA executed before it (State 3 in the dag dump below) and one that doesn't (State 8). So it isn't definitely null but it might be null.

Since NullableWalker visits guard expressions in its LearnFromDecisionDag method it visits the guard expression twice analyzing it with both nullability options, hence technically correct double warnings.

However NullableWalker also stores analyzedNullabilityMapOpt which contains inferred nullability for every expression in the containing method. Since it visits the guard clause multiple times it just overwrites nullability of expressions in guard clauses with the nullability it observed during the last pass. This is the reason why I created the issue - it seems dangerous to just update expressions' types with some random nullability. Maybe NullableWalker should store and just join all possible entry contexts before visiting guard clauses?

Here is the decision dag for the original example:

State 0
  Test: ?DagTypeTest t0 is C.IA
  WhenTrue: 1
  WhenFalse: 7
State 1
  WhenClause: MA(out s)
  WhenTrue: 6
  WhenFalse: 2
State 2
  Test: ?DagTypeTest t0 is C.IB
  WhenTrue: 3
  WhenFalse: 4
State 3
  WhenClause: MB(s, out var x)
                        && x.ToString() != " // CS8602: Dereference of a possibly null reference.
                        && (x = null) != ":  // CS8600: Converting null literal or possible null value to non-nullable type.
  WhenTrue: 10
  WhenFalse: 4
State 4
  Test: ?DagTypeTest t0 is C.ID
  WhenTrue: 5
  WhenFalse: 9
State 5
  Case: case ID _:
State 6
  Case: case IA _ when MA(out s):
State 7
  Test: ?DagTypeTest t0 is C.IB
  WhenTrue: 8
  WhenFalse: 9
State 8
  WhenClause: MB(s, out var x)
                        && x.ToString() != " // CS8602: Dereference of a possibly null reference.
                        && (x = null) != ":  // CS8600: Converting null literal or possible null value to non-nullable type.
  WhenTrue: 10
  WhenFalse: 9
State 9
  Case: switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x)
                        && x.ToString() != " // CS8602: Dereference of a possibly null reference.
                        && (x = null) != ":  // CS8600: Converting null literal or possible null value to non-nullable type.
            case ID _:  // <-- change this pattern to IC to get another set of warnings
                x = null; // CS8600: Converting null literal or possible null value to non-nullable type.

                break;
        }
State 10
  Case: case IB _ when MB(s, out var x)
                        && x.ToString() != " // CS8602: Dereference of a possibly null reference.
                        && (x = null) != ":  // CS8600: Converting null literal or possible null value to non-nullable type.
RikkiGibson commented 3 years ago

It wasn't clear to me whether this issue is still present. However the CS8600 warnings are no longer produced on SharpLab.

TessenR commented 3 years ago

@RikkiGibson It's because var is now always nullable so it doesn't report warnings about null assignments. Roslyn still thinks the variable is of different types in this expression.

You can check this sightly modified repro:

#nullable enable
public class C {
    void M(object obj)
    {
        string? s = "";

        switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x)
                        && x.Prop.ToString() != "" // CS8602: Dereference of a possibly null reference.
                        && (x.Prop = null) != "":  // CS8625: Converting null literal or possible null value to non-nullable type.
            case ID _:  // <-- change this pattern to IC to get another set of warnings
                x.Prop = null; // CS8600: Converting null literal or possible null value to non-nullable type.

                break;
        }
    }

    bool MA(out string? s) { s = ""; return false; }
    bool MB<T>(T t, out I<T> o) { o = new C<T>(t); return false; }
}

interface IA { }
interface IB { }
interface IC { }
interface ID : IA { }

interface I<T> { T Prop { get; set; } }
class C<T> : I<T> { public T Prop { get; set; } public C(T t) => Prop = t; }
TessenR commented 3 years ago

Or you can check that the variable x in this example is both C<string> and C<string?> in the guard clause according to warning messages.

Note that after the gurd clause it's C<string> which seems to be choosen by chance.

#nullable enable
public class C {
    void M(object obj)
    {
        string? s = "";

        C<string> cOfNotNullable = new C<string>("");
        C<string?> cOfNullable = new C<string?>(null);
        switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x)
                        && (x = cOfNotNullable) != null // CS8619: Nullability of reference types in value of type 'C<string>' doesn't match target type 'I<string?>
                        && (x = cOfNullable) != null :  // CS8619: Nullability of reference types in value of type 'C<string?>' doesn't match target type 'I<string>'.
            case ID _:  // <-- change this pattern to IC to get another set of warnings
                x = cOfNotNullable;
                x = cOfNullable; // CS8619: Nullability of reference types in value of type 'C<string?>' doesn't match target type 'I<string>'.

                break;
        }
    }

    bool MA(out string? s) { s = ""; return false; }
    bool MB<T>(T t, out I<T> o) { o = new C<T>(t); return false; }
}

interface IA { }
interface IB { }
interface IC { }
interface ID : IA { }

interface I<T> { T Prop { get; set; } }
class C<T> : I<T> { public T Prop { get; set; } public C(T t) => Prop = t; }
TessenR commented 3 years ago

@RikkiGibson Note that this choice to keep non-nullable version after the guard clause can lead to missing errors e.g. the following code has no warning but crashes at runtime with a NullReferenceException

Compile and run the following code:

#nullable enable
public class C {
    static void Main()
    {
      M(new Test());
    }

    static void M(object obj)
    {
        string? s = "";

        C<string> cOfNotNullable = new C<string>("");
        C<string?> cOfNullable = new C<string?>(null);
        switch (obj)
        {
            case IA _ when MA(out s):
                break;

            case IB _ when MB(s, out var x) && Local():
            case ID _:

                break;

                bool Local()
                {
                  x.Prop.ToString();
                  return false;
                }
        }
    }

    static bool MA(out string? s) { s = null; return false; }
    static bool MB<T>(T t, out I<T> o) { o = new C<T>(t); return true; }
}

interface IA { }
interface IB { }
interface IC { }
interface ID : IA { }

interface I<T> { T Prop { get; set; } }
class C<T> : I<T> { public T Prop { get; set; } public C(T t) => Prop = t; }

class Test : IA, IB { }
slavanap commented 1 year ago

@RikkiGibson I have similar but simpler scenario that produces the same unnecessary warning CS8600: Converting null literal or possible null value to non-nullable type.

Instead of creating a separate issue I'll just post it here.

#nullable enable
static bool f() => true;

Dictionary<string, string> d = new();
string s;
//                                  v--- warning CS8600: Converting null literal or possible null value to non-nullable type.
if (f() && d.TryGetValue("key", out s))
    Console.WriteLine(s);
RikkiGibson commented 1 year ago

I think a warning is always expected on out s in d.TryGetValue("key", out s). TryGetValue assigns a maybe-null string, which isn't allowed to go into a non-nullable string variable. Fix is to change the declaration to string? s, or to use out var s in the call.

slavanap commented 1 year ago

@RikkiGibson yes it assigns, but the return bool value is checked, so it should be not null at the time of Console.WriteLine, because Dictionary value declaration doesn't allow it, doesn't it?

gafter commented 1 year ago

The problem isn't the Console.WriteLine. The problem is possibly placing a null value in s, which was declared not to accept a null value.

slavanap commented 1 year ago

Ok, I understand. It was just inconvenient to receive it because s is not used out of if true branch. I definitely can live with out var.