dotnet / roslyn


Proposal: Destructible Types #161

Closed: stephentoub closed this issue 7 years ago

stephentoub commented 9 years ago

Background

C# is a managed language. One of the primary things that's “managed” is memory, a key resource that programs require. Programs are able to instantiate objects, requesting memory from the system, and at some point later when they're done with the memory, that memory can be reclaimed automatically by the system's garbage collector (GC). This reclaiming of memory happens non-deterministically, meaning that even though some memory is now unused and can be reclaimed, exactly when it will be is up to the system rather than being left to the programmer to determine. Other languages, in particular those that don't use garbage collection, are more deterministic in when memory will be reclaimed. C++, for example, requires that developers explicitly free their memory; there is typically no GC to manage this for the developer, but that also means the developer gets complete control over when resources are reclaimed, as they're handling it themselves.

Memory is just one example of a resource. Another might be a handle to a file or to a network connection. As with any resource, a developer using C++ needs to be explicit about when such resources are freed; often this is done using a “smart pointer,” a type that looks like a pointer but that provides additional functionality on top of it, such as keeping track of any outstanding references to the pointer and freeing the underlying resource when the last reference is released.

C# provides multiple ways of working with such “unmanaged” resources, resources that, unlike memory, are not implicitly managed by the system. One way is by linking such a resource to a piece of memory; since the system does know how to track objects and to release the associated memory after that object is no longer being referenced, the system allows developers to piggyback on this and to associate an additional piece of logic that should be run when the object is collected. This logic, known as a “finalizer,” allows a developer to create an object that wraps an unmanaged resource, and then to release that resource when the associated object is collected. This can be a significant simplification from a usability perspective, as it allows the developer to treat any resource just as it does memory, allowing the system to automatically clean up after the developer.

However, there are multiple downsides to this approach, and some of the biggest reliability problems in production systems have resulted from an over-reliance on finalization. One issue is that the system is managing memory, not unmanaged resources. It has heuristics that help it to determine the appropriate time to clean up memory based on the system's understanding of the memory being used throughout the system, but such a view of memory doesn't provide an accurate picture about any pressures that might exist on the associated unmanaged resources. For example, if the developer has allocated but then stopped using a lot of file-related objects, unless the developer has allocated enough memory to trigger the garbage collector to run, the system will not know that it should run the garbage collector because it doesn't know how to monitor the “pressure” on the file system. Over the years, a variety of techniques have been developed to help the system with this, but none of them have addressed the problem completely. There is also a performance impact to abusing the GC in this manner, in that allocating lots of finalizable objects can add a significant amount of overhead to the system.

The biggest issue with relying on finalizers is the non-determinism that results. As mentioned, the developer doesn't have control over when exactly the resources will be reclaimed, and this can lead to a wide variety of problems. Consider an object that's used to represent a file: the object is created when the file is opened, and when the object is finalized, the file is closed. A developer opens the file, manipulates it, and then releases the object associated with it; at this point, the file is still open, and it won't be closed until some non-deterministic point in the future when the system decides to run the garbage collector and finalize any unreachable objects. In the meantime, other code in the system might try to access the file, and be denied, even though no one is actively still using it.

To address this, the .NET Framework has provided a means for doing more deterministic resource management: IDisposable. IDisposable is a deceptively simple interface that exposes a single Dispose method. This method is meant to be implemented by an object that wraps an unmanaged resource, either directly (a field of the object points to the resource) or indirectly (a field of the object points to another disposable object), which the Dispose method frees. C# then provides the 'using' construct to make it easier to create resources used for a particular scope and then freed at the end of that scope:

using (var writer = new StreamWriter("file.txt")) { // writer created
    writer.WriteLine("hello, file");
}                                                   // writer disposed

Problem

While helpful in doing more deterministic resource management, the IDisposable mechanism does suffer from problems. For one, there's no guarantee made that it will be used to deterministically free resources. You're able to, but not required to, use a 'using' to manage an IDisposable instance.

This is complicated further by cases where an IDisposable instance is embedded in another object. Over the years, FxCop rules have been developed to help developers track cases where an IDisposable goes undisposed, but the rules have often yielded non-trivial numbers of both false positives and false negatives, resulting in the rules often being disabled.

Additionally, the IDisposable pattern is notoriously difficult to implement correctly, compounded by the fact that because objects may not be deterministically disposed of via IDisposable, IDisposable objects also frequently implement finalizers, making the pattern that much more challenging to get right. Helper classes (like SafeHandle) have been introduced over the years to assist with this, but the problem still remains for a large number of developers.
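For readers who have not written it by hand, the pattern being described usually looks something like the sketch below. `NativeBuffer` is a hypothetical type invented for illustration (it is not part of the proposal); it wraps a block of unmanaged memory and shows the moving parts that make the pattern easy to get wrong: a public `Dispose`, a finalizer as a safety net, a shared `Dispose(bool)` method, and a guard against double disposal.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around unmanaged memory, used only to illustrate
// the conventional Dispose(bool) pattern.
public class NativeBuffer : IDisposable
{
    private IntPtr _memory;   // the unmanaged resource
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    // Deterministic path: called explicitly or at the end of a 'using' block.
    public void Dispose()
    {
        Dispose(disposing: true);
        GC.SuppressFinalize(this); // the finalizer has nothing left to do
    }

    // Non-deterministic safety net: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(disposing: false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // Dispose must tolerate being called more than once
        }

        if (disposing)
        {
            // Managed IDisposable fields would be disposed here.
            // This sketch has none.
        }

        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
        _disposed = true;
    }
}
```

Nothing in the language forces a caller to invoke `Dispose`, which is why the finalizer is still needed as a fallback.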

Solution: Destructible Types

To address this, we could add the notion of "destructible types" to C#, which would enable the compiler to ensure that resources are deterministically freed. The syntax for creating a destructible type, which could be either a struct or a class, would be straightforward: annotate the type as 'destructible' and then use the '~' (the same character used to name finalizers) to name the destructor.

public destructible struct OutputMessageOnDestruction(string message)
{
    string m_message = message;

    ~OutputMessageOnDestruction() // destructor
    {
        if (m_message != null)
            Console.WriteLine(m_message);
    }
}

An instance of this type may then be constructed, and the compiler guarantees that the resource will be destructed when the instance goes out of scope:

public void Example()
{
    var omod = new OutputMessageOnDestruction("Destructed!");
    SomeMethod();
} // 'omod' destructed here

No matter what happens in SomeMethod, regardless of whether it returns successfully or throws an exception, the destructor of 'omod' will be invoked as soon as the 'omod' variable goes out of scope at the end of the method, guaranteeing that “Destructed!” will be written to the console.

Note that it's possible for a destructible value type to be initialized to a default value, and as such the destructor could run when none of the fields have been initialized. Destructors of destructible value types need to be coded to handle this, as was done in the 'OutputMessageOnDestruction' type previously by checking whether the message was non-null before attempting to output it.

public void Example()
{
    OutputMessageOnDestruction omod = default(OutputMessageOnDestruction);
    SomeMethod();
} // default 'omod' destructed here

Now, back to the original example, consider what would happen if 'omod' were stored into another variable. We'd then end up with two variables effectively wrapping the same resource, and if both variables were then destructed, our resource would effectively be destructed twice (in our example resulting in “Destructed!” being written twice), which is definitely not what we want. Fortunately, the compiler would ensure this can't happen. The following code would fail to compile:

OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
OutputMessageOnDestruction omod2 = omod1; // Error: can't copy destructible type

The compiler would prevent such situations from occurring by guaranteeing that there will only ever be one variable that effectively owns the underlying resource. If you want to assign to another variable, you can do that, but you need to use the 'move' keyword (#160) to transfer the ownership from one to the other; this effectively performs the copy and then zeroes out the previous value so that it's no longer usable. In compiler speak, a destructible type would be a "linear type," guaranteeing that destructible values are never inappropriately “aliased”.

OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
OutputMessageOnDestruction omod2 = move omod1; // Ok, 'omod1' now uninitialized; won't be destructed
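The "copy, then zero out the source" behavior can be roughly approximated in today's C# with a helper method. The sketch below is only an analogy: `Ownership.Move` and `MoveDemo` are hypothetical names, and unlike the proposed 'move' keyword, nothing here stops the compiler from accepting a plain copy or a later read of the source variable.

```csharp
using System;

public static class Ownership
{
    // Hypothetical helper approximating 'move': hands back the current value
    // and resets the source location to default(T).
    public static T Move<T>(ref T source)
    {
        T value = source;
        source = default(T);
        return value;
    }
}

public static class MoveDemo
{
    // Returns true when the source is cleared and the target holds the value.
    public static bool SourceIsClearedAfterMove()
    {
        string first = "Destructed!";
        string second = Ownership.Move(ref first);
        return first == null && second == "Destructed!";
    }
}
```

The gap between this helper and the proposal is the compile-time enforcement: the proposal relies on the compiler rejecting every copy that does not go through 'move'.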

This applies to passing destructible values into method calls as well. In order to pass a destructible value into a method, it must be 'move'd, and when the method's parameter goes out of scope when the method returns, the value will be destructed:

void SomeMethod(OutputMessageOnDestruction omod2)
{
    ...
} // 'omod2' destructed here
...
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(move omod1); // Ok, 'omod1' now uninitialized; won't be destructed

In this case, the value needs to be moved into SomeMethod so that SomeMethod can take ownership of the destruction. If you want to be able to write a helper method that works with a destructible value but that doesn't assume ownership for the destruction, the value can be passed by reference:

void SomeMethod(ref OutputMessageOnDestruction omod2)
{
   ...
} // 'omod2' not destructed here
…
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(ref omod1); // Ok, 'omod1' still valid

In addition to being able to destructively read a destructible instance using 'move' and being able to pass a destructible instance by reference to a method, you can also access fields of or call instance methods on destructible instances. You can also store destructible instances in fields of other types, but those other types must also be destructible types, and the compiler guarantees that these fields will get destructed when the containing type is destructed.

destructible struct WrapperData(SensitiveData data)
{
    SensitiveData m_data = move data; // 'm_data' will be destructed when 'this' is destructed
    …
}
destructible struct SensitiveData { … }

There would be a well-defined order in which destruction happens when destructible types contain other destructible types. Destructible fields would be destructed in the reverse order from which the fields are declared on the containing type. The fields of a derived type are destructed before the fields of a base type. And user-defined code runs in a destructor before the type's fields are destructed.

Similarly, there'd be a well-defined order for how destruction happens with locals. Destructible locals are destructed at the end of the scope in which they are created, in reverse declaration order. Further, destructible temporaries (destructible values produced as the result of an expression and not immediately stored into a storage location) would behave exactly like destructible locals declared at the same position, except that the scope of a destructible temporary is the full expression in which it is created.
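One way to picture the reverse-declaration-order rule for locals is as nested try/finally blocks. This is only an assumed lowering for illustration, since the proposal does not say how the compiler would emit the code. `ScopeLoweringDemo` is a hypothetical name, and the log entries stand in for constructor and destructor calls of two destructible locals, 'a' and 'b'.

```csharp
using System;
using System.Collections.Generic;

public static class ScopeLoweringDemo
{
    // Simulates a scope that declares destructible local 'a', then 'b',
    // and records the order in which construction and destruction occur.
    public static List<string> Run()
    {
        var log = new List<string>();

        log.Add("construct a");
        try
        {
            log.Add("construct b");
            try
            {
                log.Add("body");
            }
            finally
            {
                log.Add("destruct b"); // declared last, destructed first
            }
        }
        finally
        {
            log.Add("destruct a"); // declared first, destructed last
        }

        return log;
    }
}
```

Because each finally block runs whether the body returns or throws, the same ordering holds on the exceptional path.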

Destructible locals may also be captured into lambdas. Doing so results in the closure instance itself being destructible (since it contains destructible fields resulting from capturing destructible locals), which in turn means that the delegate to which the lambda is bound must also be destructible. Just capturing a local by reference into a closure would be problematic, as it would result in a destructible value being accessible both to the containing method and to the lambda. To deal with this, closures may capture destructible values, but only if an explicit capture list (#117) is used to 'move' the destructible value into the lambda (such support would also require destructible delegate types):

OutputMessageOnDestruction omod = new OutputMessageOnDestruction("Destructed!");
DestructibleAction action = [var localOmod = move omod]() => {
    Console.WriteLine("Action!");
};

The destructible types feature would enable a developer to express some intention around how something should behave, enabling the compiler to then do a lot of heavy lifting for the developer in making sure that the program is as correct-by-construction as possible. Developers familiar with C++ should feel right at home using destructible types, as it provides a solid Resource Acquisition Is Initialization (RAII) approach to ensuring that resources are properly destructed and that resource leaks are avoided.

scalablecory commented 9 years ago

I would prefer a way that allows me to apply this not just to new types but to existing code.

One issue I see is this essentially mimics std::unique_ptr, but without the ability to get() a reference that can easily exist in multiple places.

These two concerns are showstoppers for me and so I would not vote to include this in its current form.

Another issue as-is with this proposal is that when reading code, there's no obvious way to determine that a variable will have side effects when it goes out of scope. If there's not a move right next to it, there'd be no way to know.

I believe something closer to this would get about 90% of the way there and be a lot more usable: say a way to mark an IDisposable as unique, and keywords to move() and get() it:

var !con = new SqlConnection(...);
var !cmd = con.CreateCommand();
var !reader = cmd.ExecuteReader();

DbDataReader errorReader = reader; // error: did not move or get.
DbDataReader !movedReader = move reader; // nulls out reader
DbDataReader normalReader = get reader; // preserves movedReader

This would give scope-bound deterministic disposal, single ownership safety, obvious code readability, be immediately useful when using existing code, and I believe could be implemented without changing the VM similar to Nullable.

Additional safety regarding single ownership could be had by adding a weak reference designator that forces explicit ownership to exist elsewhere:

DbDataReader ^weakReader = reader; // implicit.

But, this may be of limited usefulness if "get reader" allows getting a raw instance. (Which, I think is very important to allow)

ryancerium commented 9 years ago

What if you used C++ stack allocation construction syntax to visually separate garbage collected objects from destructible objects? Would that allow any object to be destructible then? I think you'd need CLR support at that point though.

OutputMessageOnDestruction omod("Hello world!");

RichiCoder1 commented 9 years ago

:+1: @scalablecory Correct me if I'm wrong, but how much call is there for something equivalent to get()? What scenarios would it allow outside just ref passing into methods? Having it be very strict seems a plus in C# specifically.

svick commented 9 years ago

If you want to be able to write a helper method that works with a destructible value but that doesn't assume ownership for the destruction, the value can be passed by reference

Would this work together with readonly parameters (#115), so that I can create a method that works with the destructible value, but can't "steal" it, by making the parameter readonly ref?

You can also store destructible instances in fields of other types, but those other types must also be destructible types

Does that mean that array or List<T> of destructible type wouldn't be allowed? And that you would need something like DestructibleSinglyLinkedList<T> to have a collection of them?

MrJul commented 9 years ago

That's definitely something I want to see, and reminiscent of Rust with its compile-time checking. Generalized enough, one can imagine a core runtime that almost doesn't need the GC at all.

What about returning a destructible type? Would it need to be moved explicitly, clearly expressing giving the ownership to the caller:

destructible class Destructible { }

Destructible CreateDestructible()
{
    var value = new Destructible();
    return move value;
}

Or would it be implicit?

Destructible CreateDestructible() {
    var value = new Destructible();
    return value; // ownership automatically transferred to caller
}

If the caller doesn't use the returned value, is it destructed immediately, or only at the end of the current scope?

void SomeLongMethod() {
    CreateDestructible();
    // is the return value destructed already?
    Thread.Sleep(60000);
} // or is it only now, a minute later?

scalablecory commented 9 years ago

@RichiCoder1 from C++ experience, it is not uncommon to have one "owner" object and several other objects that still need to use the instance somehow. I can see the same need here.

Here's an exercise. Say FileStream was made destructible. I want to use it like such:

FileStream fs = new FileStream(...);

await WriteA(fs);
await WriteB(fs);

I still want to allow these other methods (and any objects they allocate in their use of it) to use it while maintaining strict lifetime control at a top level. A ref param won't work here if WriteA() is an async method that needs to put the FileStream into its state machine object.

jaredpar commented 9 years ago

@ryancerium

Would that allow any object to be destructible then?

No. One of the goals of this proposal is to have destruction be deterministic and getting that requires the implementation of the type obey certain restrictions. Hence it wouldn't be possible to make any type destructible simply by changing its construction syntax.

Mr-Byte commented 9 years ago

:+1:

This is one I've thought about several times, as having deterministic management of resources would be extremely useful in a lot of areas, such as scientific programming and game development.

Could this potentially be used to allow for deterministic management of heap allocated memory? Effectively implementing the C++ std::unique_ptr in C# for more deterministic memory management.

Can these types be stored in collections? I noticed the proposal requires that an object containing fields that are destructible must itself be destructible? How would that work with arrays and collections?

ryancerium commented 9 years ago

@jaredpar The new construction syntax idea was simply so that you could tell how objects are allocated at the point of allocation.

A a = new A(); // Destructible?
B b(); // Destructible!

C# has done a very good job of making things explicit at the call site and not just at the declaration site; ref and out parameters in particular come to mind. (That has long been a problem in C++ to the point that the recommendation is that out parameters be raw pointers and in parameters be references. Raw pointers! What is this, 1986?)

What are the other "certain restrictions"? Why couldn't the CLR initialize any given object type on stack memory instead of heap allocated memory and have the finalizer run when it goes out of scope if it hasn't been moved? There's lambda capture to worry about I suppose, and concurrent modification, neither of which is anything remotely resembling easy.

jaredpar commented 9 years ago

@ryancerium

I agree that being able to distinguish between the two types here is important. In the past we did consider taking a page out of the F# book here and doing the following:

use var A a = new A();  // destructible

The overall feeling when we did this was mixed. Sure it made the construction more obvious but there was also concern it was adding too much verbosity to the code. Eventually we ripped it out in favor of finding a better solution later on.

Why couldn't the CLR initialize any given object type on stack memory instead of heap allocated memory and have the finalizer run when it goes out of scope if it hasn't been moved?

It's not just moving that is a problem but even simple aliasing. If the implementation of a method should put this into the heap somewhere then it would be possible to have a destructed object floating around in what appeared to be a non-destructed state.

ryancerium commented 9 years ago

@jaredpar

it would be possible to have a destructed object floating around in what appeared to be a non-destructed state.

Good point, I take it for granted sometimes that C++ lets people do stupid things if they really feel like it. I assume you meant the implementation of a destructible object's method, so if you put this on the heap, wouldn't that invoke move semantics? @scalablecory has an excellent use case for using references to a destructible object, so it's going to be tricky no matter what.

I really like your sample syntax also, with a minor tweak. If that caused mixed feelings, I don't know what to tell you :-)

use a = new A(); // destructible shorthand, implicit var
use A a = new A(); // destructible longhand, explicit type

RichiCoder1 commented 9 years ago

Maybe allow destructible objects to be boxed and passed around like normal somehow? Though that seems like it would have its own flaws. And lists and arrays bring up their own problems, especially if you wanted something like a destructible array.

jaredpar commented 9 years ago

@RichiCoder1

Boxing is definitely an option we explored and feel is necessary for completeness. We even called it simply Box<T>. The basic summary of it was:

RichiCoder1 commented 9 years ago

@jaredpar Sounds great and exactly what I was thinking about. I think I read a previous article about internal discussion around destructible types. Maybe piggyback on other ideas in this thread and follow Nullable<T>, and have OutputMessageOnDestruction! be shorthand for Box<OutputMessageOnDestruction>. Something like:

File! file = File.GetFile("....."); // <--- Implicitly is boxed up.
files.Add(file);

Random thought; Possibly include generic constraint T : destructible so that you could do something like:

public destructible class DestructibleList<T> : IList<T> where T : destructible 
{
    // .. implementation
}
public void ProcessFiles(string folderName)
{
    DirectoryInfo di = new DirectoryInfo(folderName);
    use DestructibleList<File> files = di.GetFiles("*");
    foreach(File! file in files)
    {
         // ... do stuff
    }
    // ...files go away here.
}

Though that raises the next question: how would a list of destructibles work with something like IEnumerable?

Addendum: How would destructible types play with async?

tomasr commented 9 years ago

Overall, I like the proposal, but a couple of things are not entirely clear to me.... would someone mind expanding on this:

ufcpp commented 9 years ago

Scope-based, single-owner destruction is a good idea in certain situations, and I like it, but there might still be many situations that the proposal cannot solve. I need IDisposable for unsubscribing from events more often than for managing sensitive resources. This kind of IDisposable could not be destructible, because typically it is neither in a method scope nor owned by a destructible object.

sharwell commented 9 years ago

I was originally concerned that destructible types could require VM changes which would have a negative impact on the performance of the garbage collector. I no longer believe that is the case. My current understanding of the proposal is the following (with the exception of lambda considerations):

The above rules could be expanded to support a destructible class:

The above is surely incomplete but may serve as a starting point for defining the semantics of destructible types.

sharwell commented 9 years ago

Regarding Box<T> - I could see a reference type Box<T> existing for the purpose of using a destructible type in a context where destructible types are generally not allowed, e.g. as a field in a non-destructible type.

destructible struct D { }

class T : IDisposable
{
  D _value1; // error
  Box<D> _value2; // allowed
  Box<T> _value3; // error - T is not destructible
}

When a destructible value is placed into a "box", ownership is assigned to that box. Box<T> implements IDisposable, but more importantly it has a user-defined finalizer. Use of Box<T> would be generally discouraged, but use of destructible types in general would be hindered if Box<T> were not provided, because all currently existing code uses non-destructible types and therefore could not hold instances of these new types.
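A rough sketch of what such a box might look like in current C# follows. It assumes the model described in this comment, in which the wrapped struct exposes its cleanup through IDisposable. The `CountingResource` struct and the `ref` constructor that clears the caller's copy are illustrative assumptions, not part of the proposal.

```csharp
using System;

// Hypothetical stand-in for a destructible struct: cleanup is exposed via Dispose.
public struct CountingResource : IDisposable
{
    public static int DisposeCount;
    private bool _live;

    public CountingResource(bool live) { _live = live; }

    public void Dispose()
    {
        if (_live) // a default(CountingResource) instance has nothing to release
        {
            _live = false;
            DisposeCount++;
        }
    }
}

// Sketch of a reference-type box that takes ownership of a struct value.
public sealed class Box<T> : IDisposable where T : struct, IDisposable
{
    private T _value;
    private bool _disposed;

    // Taking the value by ref and clearing it emulates 'move':
    // the caller's copy becomes default(T) and the box is the sole owner.
    public Box(ref T value)
    {
        _value = value;
        value = default(T);
    }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Fallback for boxes that are never disposed explicitly.
    ~Box()
    {
        Release();
    }

    private void Release()
    {
        if (_disposed) return;
        _disposed = true;
        _value.Dispose();
    }
}
```

The finalizer restores a non-deterministic fallback, which is consistent with the remark that use of Box<T> would be generally discouraged.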

MrJul commented 9 years ago

@sharwell I assumed by reading the proposal that a destructible type doesn't implement IDisposable. The nice thing about destructible types is that only the runtime can destruct an instance (when nothing is using it anymore), and it guarantees that it happens only once. IDisposable doesn't have those guarantees.

svick commented 9 years ago

@sharwell

A method can have a parameter ref D obj, but it cannot have D obj.

Why not? @stephentoub's proposal contains this:

var omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(move omod1); // Ok, 'omod1' now uninitialized; won't be destructed

I think this is reasonable, i.e. if the parameter is not ref, it requires explicit transfer of ownership.

D cannot be used as a generic type argument (like a raw pointer in this respect).

So, collections of destructible types wouldn't be allowed (at least not without Box)? I think having those would be very useful.

RichiCoder1 commented 9 years ago

@sharwell what @MrJul said. The implementation looks like something completely separate from IDisposable. This would be more than just syntactic sugar; it would be a completely new behavior.

I also agree with @svick in his assessment of your points.

sharwell commented 9 years ago

Use of destructible types in an array poses an interesting, but not insurmountable, challenge. Consider the following:

destructible struct D { }

...

D[] values = new D[10];

In this scenario, values is (semantically) treated as a destructible class (see previous post) containing 10 sequential instances of D. The implementation could actually emit this in metadata using the type D[], provided instances of this type are only used in contexts where destructible types are allowed. Perhaps a value type destructible struct DestructibleArray<T> where T : destructible could exist in System.Runtime.CompilerServices to assist in the use of these types with the using construct. :thought_balloon:

sharwell commented 9 years ago

@MrJul and @RichiCoder1: The implementation of IDisposable does not need to be exposed to the user. It is an implementation detail allowing destructible types in C# to be "lowered" to a set of features provided by the underlying VM without requiring new runtime support. While IDisposable could be replaced by another equivalent type, simply using IDisposable allows the feature to be easily defined in terms of the existing using statement. Remember that if you can't box an instance of a destructible type, then you also cannot cast an instance of a destructible type to IDisposable.

sharwell commented 9 years ago

I think this is reasonable, i.e. if the parameter is not ref, it requires explicit transfer of ownership.

I agree, I updated my post above.

sharwell commented 9 years ago

So, collections of destructible types wouldn't be allowed (at least not without Box<T>)? I think having those would be very useful.

Right now I don't see a way to provide destructible guarantees for existing generic data types. Perhaps if we introduce a destructible generic type constraint, then a destructible type could be used for this parameter (or rather, must be used). Then you could require that use of the destructible type follows the semantic requirements for destructible types elsewhere.

jaredpar commented 9 years ago

@tomasr

The generated code for a destructible type would not have an IDisposable implementation or a Finalizer. These features exist to support non-deterministic destruction and add extra overhead to the runtime (in particular the finalizer). A destructible type would be deterministically destructed and hence these are not needed.

RichiCoder1 commented 9 years ago

@sharwell The point would be to match C++ behavior here, if I'm understanding correctly, where lifetime is deterministic, rather than using using. Essentially unique_ptr<T> but built into the language, with Box<T> essentially being shared_ptr<T>.

And in regards to collections, I mentioned above adding a destructible generic type constraint, and destructible collections.

Edit: what @jaredpar said :)

sharwell commented 9 years ago

The generated code for a destructible type would not have an IDisposable implementation...

Why would it not? Implementing IDisposable would make it easier to implement the Box<T> type (which would appear in metadata with the where T : IDisposable constraint). It would also make it easier to lower generic types with the destructible generic type constraint, because the constraint could be easily represented as IDisposable in metadata.

jaredpar commented 9 years ago

@sharwell

A destructible type is a struct which implicitly implements IDisposable. For the points that follow say this is struct D.

No. IDisposable exists to support non-deterministic destruction. A destructible type is always deterministically destructed and hence has no need of this interface.

D cannot be used as a generic type argument (like a raw pointer in this respect).

Correct. But Box<D> can always be used as a generic type argument.

default(D).Dispose() is a NOP.

No, this must run the destructor.

When we first implemented this feature we tried to add this exact behavior. Unfortunately it's just not really possible without some prohibitive changes to the runtime. The core problem is that it's impossible to tell the difference between:

(new D())
default(D)

If D has a side effect in the destructor the developer would surely expect it to run in the first case. Because we can't distinguish between the two, especially when you get to structs in other places besides locals, we ended up defaulting to always running the dtor.

A destructible class is syntactic sugar for creating a destructible struct which wraps an instance of a reference type which implements IDisposable.

No IDisposable on classes either.

sharwell commented 9 years ago

The point would be to match C++ behavior here, if I'm understanding correctly. Where block scope determines the lifetime, rather than using.

ECMA-335 does not provide for block-scoped destruction of data types. Lowering the language concept to the existing runtime requires defining the feature in terms of functionality it does provide. This means it could either be defined using try { ... } finally { ... }, or you could simply reuse the existing using construct as an intermediate step because it's already defined this way.

The using statement would not actually appear in code, but it is a handy way of explaining the semantics of destructible types in terms that C# developers already understand.

sharwell commented 9 years ago

No. IDisposable exists to support non-deterministic destruction.

IDisposable supports non-deterministic destruction, but that is not its sole purpose. Otherwise the using statement would not have been written in terms of IDisposable. Also, IDisposable is used by the C++/CLI compiler in support of deterministic destruction.

object.Finalize(), on the other hand, exists solely for the purpose of non-deterministic cleanup.

tomasr commented 9 years ago

@jaredpar Sounds reasonable.

I'm not a big fan of reusing the destructor syntax to define the cleanup code, however, given it already has other, pretty well misunderstood behavior (and by that I mean developers implementing finalizers when they are not needed).

It ends up making the process more complex, and as someone who spends a lot of time reading other people's code, I hate having to look at one piece of a class, and then possibly having to scroll tens or hundreds of lines to look for a single word somewhere else that modifies the behavior of the code I'm seeing.

Just my $0.0000002 :)

sharwell commented 9 years ago

default(D).Dispose() is a NOP.

No, this must run the destructor.

Interesting use case. How would you define the move operator for destructible value types, if even default instances of the value type must be destructed? Perhaps by always using them as Nullable<D> instead of just D?

jaredpar commented 9 years ago

@sharwell

We ended up writing the language rules such that it could optimize away calling the destructor in cases like return move. Essentially the rule said:

The compiler can elide destructor calls for locations which were the target of a move and not assigned to afterwards.

sharwell commented 9 years ago

@tomasr I agree with you here. One of the first things I look for in a new project is finalizers, because they are easy to search for and almost always incorrectly used. I suppose I could create a new analyzer to find them (and distinguish them from the ones in destructible types), but still...

jaredpar commented 9 years ago

@sharwell Sure, IDisposable can work both ways. Destructible types, though, do not; they are only ever deterministically destructed. Why implement an interface which suggests it can be used two ways when it can only ever be correctly used one way?

RichiCoder1 commented 9 years ago

It ends up making the process more complex, and as someone who spends a lot of time reading other people's code, I hate having to look at one piece of a class, and then possibly having to scroll tens or hundreds of lines to look for a single word somewhere else that modifies the behavior of the code I'm seeing.

This would be a situation where you'd want to have some sort of StyleCop rule ensuring that the destructor is only being used as a destructor, or that it's being used paired with IDisposable. Maybe add a compiler warning against using a finalizer without the destructible keyword or IDisposable.

sharwell commented 9 years ago

@jaredpar If we're talking deterministic cleanup of uniquely owned resources, then to the maximum extent supported by the runtime environment I would expect to see exactly one call to the finalizer for each created instance of the type. It seems the language quoted there would not meet my expectations for the following code:

destructible struct D { }
destructible struct D2
{
  D _value;
  static void Method(D d) { }
  void Things()
  {
    Method(move _value);
  }
}

...

{
  // this block creates one instance of D which is destructed twice
  D2 instance;
  instance.Things();
}

jaredpar commented 9 years ago

@sharwell

// this block creates one instance of D which is destructed twice

The language doesn't agree with this comment. The move command definitely assigns a new value to the location after reading the original one. This value is a valid instance of D and follows all other language rules.

I agree the nature of default instances of struct values being destructed is unnerving at first. This is an issue we discussed at length and what we came to is the following:

tomasr commented 9 years ago

@RichiCoder1 Having analyzers that do that is great, but probably doesn't fit my needs.

I am often in the position that I am reviewing customer code where I often won't even have a source code that builds (or may be incomplete, or worse things). In many cases like that, tools don't work. And even if the tools work, I am still going to spend hours looking at the code, and often that code is going to be very, very ugly, long, and having to scroll up and down sucks :)

RichiCoder1 commented 9 years ago

@tomasr Fair enough. The biggest issue might simply be that many of the mistakes in using the finalizer come from trying to use it like destructors in normal languages. It'd break back compat, and therefore probably never happen, but I'd be much more a fan of changing the finalizer to a different syntax than forcing destructors to use something besides ~MyClass().

sharwell commented 9 years ago

The language doesn't agree with this comment. The move command definitely assigns a new value to the location after reading the original one. This value is a valid instance of D and follows all other language rules.

I could buy this. However, it suddenly becomes painfully obvious that C# doesn't have constructor initializer lists. How would you initialize the value of a destructible field (of a destructible type) without invoking the destructor of a default instance?

AlgorithmsAreCool commented 9 years ago

@sharwell That is an interesting case. Looking at your previous code example, after move is called on _value, it is definitely assigned a new default value. This new default value would then be destructed with the rest of D2 at some point later. By this logic, if an object has a struct field that is destructible, it must be destructed whether it was used or not. This introduces some small but unavoidable overhead, doesn't it?

We can avoid this overhead for locals through liveness analysis but I don't think it's avoidable for fields.

As a small side question, calling move on a struct's field would mutate the struct, right?
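The read-then-reset behavior described for move can be approximated in existing C# with a helper that takes the location by ref. This is only a sketch: `Move`, `Token`, and `Owner` are hypothetical names, and the real proposal would do this in the compiler rather than through a method call.

```csharp
using System;

// Hypothetical stand-in for a destructible struct D.
public struct Token
{
    public int Id;
}

// Hypothetical stand-in for D2, which owns a Token field.
public struct Owner
{
    public Token Value;

    // Emulates 'Method(move _value)': hands the field's value to the caller.
    public Token GiveAway()
    {
        return MoveSketch.Move(ref Value);
    }
}

public static class MoveSketch
{
    // Reads the location, then overwrites it with the default value.
    // The location stays a valid (default) instance afterwards.
    public static T Move<T>(ref T location) where T : struct
    {
        T taken = location;
        location = default(T);
        return taken;
    }
}
```

Under this emulation, calling `GiveAway()` on an `Owner` whose `Value.Id` is 7 returns a `Token` with `Id` 7 and leaves `Value.Id` at 0, so the containing struct is indeed mutated and its field still holds a default instance that would later be destructed.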

MgSam commented 9 years ago

I think deterministically managing memory (and other resources) would be a very useful addition to C#/.NET as it is by far the biggest knock against the language and runtime. However, I feel like this solution isn't widely applicable enough (as it would require extensive re-writing of existing code) and the move syntax also doesn't feel like C#.

Apple managed to shoehorn ARC into Objective-C without requiring changes in the fundamental syntax of assignment. I'd like to see the C# language team work with the CLR team to come up with a similarly broad proposal that doesn't require such fundamental shifts in the way C# is written.

It's definitely time for an alternative to garbage collection but I'm not sure this is it.

jaredpar commented 9 years ago

@MgSam I don't think ARC is applicable here because it is solving a very different problem:

True there are intersections between the features but they also have different uses and different trade offs.

Mr-Byte commented 9 years ago

@jaredpar Would it ever be possible to use destructibles to implement ARC? Or would destructibles be limited to managing other resources?

MgSam commented 9 years ago

@jaredpar The Background section frames the problem broadly as the non-deterministic nature of memory reclamation in C#. I'm not saying ARC is the right answer, just that whatever the right answer is, it should have a few characteristics that I think are lacking in the current proposal:

If the idea is that this proposal is only intended as a nicer alternative to the using statement, I think it's missing the forest for the trees. The garbage collector is the biggest problem with .NET, period. If you look at many of the current feature proposals for C# 7.0, a lot of them are related to improving performance by directly or indirectly reducing allocations and thus collection pressure. I'd much rather see garbage collection become optional or deprecated rather than mutate the language in all these other ways in an attempt to avoid allocations.

I can't believe that this is too difficult a problem or not worth the cost; Apple did it with a 25-year-old language and they had developers singing their praises for doing so. Both the CLR and C# compiler are open source now and you guys are supposed to be One Microsoft; goals should be able to align to come up with a better solution here.

jaredpar commented 9 years ago

@Mr-Byte I definitely think there is a ref counting solution to be built on top of destructible types but I don't think it should be the primary mechanism.

@MgSam While I share many of your concerns about reducing the impact of the garbage collector, I don't see how it's relevant to this issue. This is a language feature about deterministic reclamation of unmanaged resources. It involves no CLR work and is purely a language implementation.

Anything like ARC, especially if the goal is to remove the garbage collector, is likely to involve CLR work and is a very different feature. Could it be used to implement destructible-like types? Perhaps, but it is a much larger effort and one that I think has less of a chance of making it in. It's a topic I'm more than happy to discuss, though; I have a lot of interest in it.

casperOne commented 9 years ago

In this case, the value needs to be moved into SomeMethod so that SomeMethod can take ownership of the destruction. If you want to be able to write a helper method that works with a destructible value but that doesn't assume ownership for the destruction, the value can be passed by reference:

void SomeMethod(ref OutputMessageOnDestruction omod2)
{
   ...
} // 'omod2' not destructed here
…
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(ref omod1); // Ok, 'omod1' still valid

It looks like there's some behavior that's not accounted for when using the ref keyword to avoid passing ownership of the destructible. Consider this:

void SomeMethod(ref OutputMessageOnDestruction omod2)
{
   omod2 = new OutputMessageOnDestruction("Destructed!");
}
…
OutputMessageOnDestruction omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(ref omod1); // Ok, 'omod1' still valid

What happens to the reference pointed to by omod1? I can imagine the compiler throwing an error when this happens, but then there's the strange semantics of ref (which implies mutability) not being able to be applied (and even worse, bisecting the meaning depending on something outside of the method which isn't immediately apparent).

AlgorithmsAreCool commented 9 years ago

@casperOne That is a great point about the overloading of the ref keyword creating strange semantics. Assigning to omod2 would under normal circumstances clobber the original value out of scope, but with destructible types ref indicates a borrow operation, forbidding the recipient scope from obtaining true ownership and destructing the object. So if that assignment is allowed, where would the original value of omod1 be destroyed? And what of the new value? Does it belong to the inner scope or the outer scope?

If the compiler is going to produce an error on assignment to omod2 as @casperOne implies, then what if a destructible is passed into a method like this:

void Foo<T>(ref T stuff) where T : new()
{
    stuff = new T();
}

what does ref mean here?
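A small sketch in today's C# shows what is at stake. `Tracked` and its `Destruct` method are hypothetical stand-ins for a destructible type, and `DestructThenOverwrite` is just one rule a compiler could adopt for assignment through ref; it is an assumption for illustration, not something the proposal states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a destructible struct; records each cleanup.
public struct Tracked
{
    public static readonly List<string> Destructed = new List<string>();
    public string Name;

    // Plays the role of the proposed destructor.
    public void Destruct()
    {
        Destructed.Add(Name ?? "<default>");
    }
}

public static class RefOverwriteSketch
{
    // Plain overwrite through ref, as C# behaves today:
    // the caller's old value is replaced and never cleaned up.
    public static void Overwrite(ref Tracked target)
    {
        target = new Tracked { Name = "new" };
    }

    // One possible rule (an assumption): destruct the old value first,
    // then store the new one, which the caller's scope later destructs.
    public static void DestructThenOverwrite(ref Tracked target)
    {
        target.Destruct();
        target = new Tracked { Name = "new" };
    }
}
```

With `Overwrite`, the value named "old" disappears without `Destruct` ever running for it. With `DestructThenOverwrite`, followed by the caller destructing its variable at scope exit, the recorded order is "old" then "new", so each instance is cleaned up exactly once. A generic method such as Foo<T> above cannot apply the second rule unless it knows T is destructible, which is the gap the question points at.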