dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.91k stars 4.01k forks source link

extract method fails to pass by reference #41289

Open vsfeedback opened 4 years ago

vsfeedback commented 4 years ago

This issue has been moved from a ticket on Developer Community.


Extract a method as indicated in the following code:

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class TestClass1
{
    [TestMethod]
    public void TestMethod1()
    {
        var x = 0;
        Task.Run(() =>
        {
            Thread.Sleep(TimeSpan.FromSeconds(1));
            x++;
        });

{
            // Extract Method start
            Thread.Sleep(TimeSpan.FromSeconds(2));
            Assert.AreEqual(1, x);
            // Extract Method end
        }
    }
}

As it is, the test passes. If you extract the indicated block, then the test fails. Extract Method should add a ref to the parameter on the new method.


Original Comments

Visual Studio Feedback System on 9/17/2019, 04:06 AM:

We have directed your feedback to the appropriate engineering team for further evaluation. The team will review the feedback and notify you about the next steps.

Visual Studio Feedback System on 9/19/2019, 03:14 PM:

This issue is currently being investigated. Our team will get back to you if either more information is needed, a workaround is available, or the issue is resolved.


Original Solutions

(no solutions)

JayBazuzi commented 4 years ago

I see this is tagged as "Feature Request" but I see it as a defect: this "refactoring" will change the behavior of the code, which refactorings must not do.

jmarolf commented 4 years ago

which refactorings must not do.

This is not true. If refactorings were not allowed to change any code behavior then we could never offer them. Instead refactorings are allowed to change the semantics of the thing you asked us to do.

Extracting a method may change the semantics of calling that method. It should not change the observability of private members or anything else unrelated to what the refactoring is trying to do.

jmarolf commented 4 years ago

Refactorings are a means to get things done faster than manual code editing but only offering them in cases where we've run a 3SAT solver and proven that no change in behavior is ever observable is a non-goal. Instead refactorings should only change the behavior of the thing that you asked us to do.

ryzngard commented 4 years ago

After some discussion, I'm moving this to a bug and marking as help wanted. Here's some similar examples with behavior we expect to be classified as fixed.

In short, if a lifted variable is modified in the same scope as the extraction point, extract method should use the in keyword to make sure that the value is at least passed by read only reference.

If the value is also modified within the extracted method, it should use ref.

Value Type Example

Before

private static void M()
{
    var x = 0;
    Action a = () => {
        x++;
    };
    // Extract Method start
    a();
    if (1 == x)
    {
        Console.WriteLine(x);
    }
    // Extract Method end
}

After

private static void M()
{
    var x = 0;
    Action a = () => {
        x++;
    };
    // Extract Method start
    NewMethod(a, x);
    // Extract Method end
}
private static void NewMethod(Action a, in int x)
{
    a();
    if (1 == x)
    {
        Console.WriteLine(x);
    }
}

Reference Type Example

Before

private static void M()
{
    var x = new object();
    var y = x;

    Action a = () =>
    {
        x = new object();
    };
    // Extract Method start
    a();
    if (y != x)
    {
        Console.WriteLine("Test");
    }
    // Extract Method end
}

After

private static void M()
{
    var x = new object();
    var y = x;

    Action a = () =>
    {
        x = new object();
    };
    // Extract Method start
    NewMethod(x, y, a);
    // Extract Method end
}

private static void NewMethod(in object x, object y, Action a)
{
    a();
    if (y != x)
    {
        Console.WriteLine("Test");
    }
}
CyrusNajmabadi commented 4 years ago

@JayBazuzi hello old friend!

Right now this is currently in the 'by design' bucket in-as-far as the 'correct' behavior for extract method was deemed too unpleasant for the common case. i.e. to be 'safe/correct', it would have to use 'ref' in far too many areas where people would not want it.

In short, if a lifted variable is modified in the same scope as the extraction point, extract method should use the in keyword to make sure that the value is at least passed by read only reference.

I'm very wary of this. The threading case is an extreme corner case. Whereas extracting a value that is written elsewhere in a function would be very common.

--

Note: we have a flag to control this somewhat with:

image

If we were to take the above described fix, i would want it gated behind a flag as well to not impact the common case where people would expect that passing a primitive should not need to happen by-ref

ryzngard commented 4 years ago

If we were to take the above described fix, i would want it gated behind a flag as well to not impact the common case where people would expect that passing a primitive should not need to happen by-ref

Potentially a flag would be good. Keep in mind that this proposal is only applicable for cases where local variables are lifted and modified. Variables written elsewhere in a function is not the same as lifting a variable to be modified.

ryzngard commented 4 years ago

Here's a good test case that shouldn't change with the proposed fix https://github.com/dotnet/roslyn/blob/4d19a77445b392f53d4af55ddaf48208b5205df4/src/EditorFeatures/CSharpTest/ExtractMethod/ExtractMethodTests.cs#L903

            var code = @"using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
    void Test()
    {
        int i;
        [|int b = 10;
        if (b < 10)
        {
            i = 5;
        }|]
        i = 6;
        Console.WriteLine(i);
    }
}";
CyrusNajmabadi commented 4 years ago

Variables written elsewhere in a function is not the same as lifting a variable to be modified.

Understood. But 'lifitng' (i assume you mean 'captured') is not really sufficient for me. In most cases, it's not an issue. The real strange case here is the specific use of threading constructs. Those constructs break the normal expectation of locals in a method, namely that only your thread could ever modify them. Threading breaks that (primarily because the locals become fields and the fields are touched by different threads). This is def a corner of a corner case.

For example, i would def not want to suddenly pass-by-ref just because i had a lambda/query that happened to use the local as well.

sharwell commented 4 years ago

@CyrusNajmabadi capturing the local is not a sufficient condition. It needs to be both captured and have a value written to it in a lambda or local function. Further restrictions may be possible, e.g. special handling for cases where local functions are only called directly and not converted to delegates.

JayBazuzi commented 4 years ago

an extreme corner case

Perhaps. But keep in mind that this issue was discovered in the real world, not academically. A developer extracted a method, checked it in, and deployed to production. Later they got a bug report and traced it back to this refactoring. It's real, and it's a subtle product bug that was created by a refactoring tool.

We count on automated refactorings to catch the things that humans can't catch.

CyrusNajmabadi commented 4 years ago

Note:

  1. The original code is buggy itself. The language and runtime are free to operate in a manner where it might never work. Specifically, absent actual memory fencing primitives, there is no guarantee that one thread will see the mutation of a value made by another thread. The compilers here were free to optimize things such that the code would not work.
  2. As Jon mentioned, the refactoring bar is not that it doesn't change meaning. As it happens, most refactorings we have (outside of trivial syntactic ones) do have some cases where meaning could change. This approach is one that was settled on because being stricter about no change in meaning ended up causing us to produce code that was felt to be suboptimal for users much of the time.

That said, in this case, I think it might be ok to limit a fix to just checking for mutations to the variable in another closure and potentially passing by ref then.