API proposal: ReferenceCountedDisposable<T>

sharwell commented 6 years ago

Summary

This proposal simplifies the use of disposable resources that are shared across code where a single owner of the resource is either unclear or complicates maintenance of the code. This situation is increasingly common when asynchronous code needs to operate on a threading-agnostic API where the Dispose operation is explicitly called out as not safe for concurrent use.

The semantics of a safe, shared wrapper for IDisposable objects are challenging, especially when considerations are made for handling missing and/or multiple calls to Dispose and for weakly-held references. Providing a holder for managing lifetimes of these objects allows developers to focus on the semantics of the underlying shared object, which is always a challenge in itself.

Proposed API

namespace System.Memory
{
  public sealed class ReferenceCountedDisposable<T> : IDisposable
    where T : class, IDisposable
  {
    public ReferenceCountedDisposable(T instance);

    public T Target { get; }

    public ReferenceCountedDisposable<T> TryAddReference();
    public void Dispose();

    public struct WeakReference
    {
      public WeakReference(ReferenceCountedDisposable<T> reference);

      public ReferenceCountedDisposable<T> TryAddReference();
    }
  }
}

Semantics

A reference-counting wrapper which allows a single disposable object to be used from multiple places in code. The object is deterministically released (by calling IDisposable.Dispose) when the last reference is disposed.

Each instance of ReferenceCountedDisposable<T> represents a counted reference (also referred to as a reference in the following documentation) to a target object. Each of these references has a lifetime, starting when it is constructed and continuing through its release. During this time, the reference is considered alive. Each reference which is alive owns exactly one reference to the target object, ensuring that it will not be disposed while still in use. A reference is released through either of the following actions:

The reference is explicitly released by a call to Dispose.
The reference is no longer in use by managed code and gets reclaimed by the garbage collector.

While each instance of ReferenceCountedDisposable<T> should be explicitly disposed when the object is no longer needed by the code owning the reference, this implementation will not leak resources in the event one or more callers fail to do so. When all references to an object are explicitly released (i.e. by calling Dispose), the target object will itself be deterministically released by a call to IDisposable.Dispose when the last reference to it is released. However, in the event one or more references are not explicitly released, the underlying object will still become eligible for non-deterministic release (i.e. finalization) as soon as each reference to it is released by one of the two actions described previously.

When using ReferenceCountedDisposable<T>, certain steps must be taken to ensure the target object is not disposed early.

Use ReferenceCountedDisposable<T> consistently. In other words, do not mix code using reference-counted wrappers with code that references the target directly.
Only use the ReferenceCountedDisposable<T>(T instance) constructor one time per target object. Additional references to the same target object must only be obtained by calling TryAddReference.
Do not call IDisposable.Dispose on the target object directly. It will be called automatically at the appropriate time, as described above.

All public methods on this type adhere to their pre- and post-conditions and will not invalidate state even in concurrent execution.

`ReferenceCountedDisposable<T>.TryAddReference`

Increments the reference count for the disposable object, and returns a new disposable reference to it. The returned object is an independent reference to the same underlying object. Disposing of the returned value multiple times will only cause the reference count to be decreased once.

Return value: a new ReferenceCountedDisposable<T> pointing to the same underlying object, if it has not yet been disposed; otherwise, null if this reference to the underlying object has already been disposed.

`ReferenceCountedDisposable<T>.WeakReference`

Represents a weak reference to a ReferenceCountedDisposable<T> which is capable of obtaining a new counted reference up until the point when the object is no longer accessible.

Differences between `TryAddReference` operations

The semantics of ReferenceCountedDisposable<T>.TryAddReference and ReferenceCountedDisposable<T>.WeakReference.TryAddReference are slightly different:

ReferenceCountedDisposable<T>.TryAddReference: This method returns null after this reference is disposed. It can therefore return null even while other references to the target object are still held in code.
ReferenceCountedDisposable<T>.WeakReference.TryAddReference: This method returns null after the last reference to the target object is disposed.
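The difference can be seen in a minimal, lock-based sketch of the proposed type. This is an illustration under assumptions rather than the actual implementation: the `Counter` helper, `TryAddReferenceCore`, the `DemoResource` class, and the choice to throw `ObjectDisposedException` from `Target` are all hypothetical, and finalizer-based cleanup of leaked references is not modeled.

```csharp
using System;

// Illustrative sketch only; not the proposal's implementation.
public sealed class ReferenceCountedDisposable<T> : IDisposable
    where T : class, IDisposable
{
    // Shared by every counted reference to one target; also used as the lock.
    private sealed class Counter { public int Value = 1; }

    private T _instance;               // null once this reference is disposed
    private readonly Counter _counter;

    public ReferenceCountedDisposable(T instance)
        : this(instance ?? throw new ArgumentNullException(nameof(instance)), new Counter()) { }

    private ReferenceCountedDisposable(T instance, Counter counter)
    {
        _instance = instance;
        _counter = counter;
    }

    // Assumption: accessing Target on a disposed reference throws.
    public T Target => _instance ?? throw new ObjectDisposedException(nameof(Target));

    public ReferenceCountedDisposable<T> TryAddReference()
        => TryAddReferenceCore(_instance, _counter);

    private static ReferenceCountedDisposable<T> TryAddReferenceCore(T target, Counter counter)
    {
        lock (counter)
        {
            // Null target: the source reference was disposed. Zero count: the target was disposed.
            if (target == null || counter.Value == 0) return null;
            counter.Value++;
            return new ReferenceCountedDisposable<T>(target, counter);
        }
    }

    public void Dispose()
    {
        T toDispose = null;
        lock (_counter)
        {
            if (_instance == null) return;   // repeated Dispose is a no-op
            if (--_counter.Value == 0) toDispose = _instance;
            _instance = null;
        }
        toDispose?.Dispose();                // dispose the target outside the lock
    }

    public struct WeakReference
    {
        private readonly WeakReference<T> _weakTarget;
        private readonly Counter _counter;

        public WeakReference(ReferenceCountedDisposable<T> reference)
        {
            // Assumption: a null or already-disposed reference yields an empty weak reference.
            T target = reference?._instance;
            _weakTarget = target == null ? null : new WeakReference<T>(target);
            _counter = target == null ? null : reference._counter;
        }

        public ReferenceCountedDisposable<T> TryAddReference()
        {
            if (_weakTarget == null || !_weakTarget.TryGetTarget(out T target)) return null;
            return TryAddReferenceCore(target, _counter);
        }
    }
}

public sealed class DemoResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class Program
{
    public static void Main()
    {
        var resource = new DemoResource();
        var first = new ReferenceCountedDisposable<DemoResource>(resource);
        var weak = new ReferenceCountedDisposable<DemoResource>.WeakReference(first);
        var second = first.TryAddReference();

        first.Dispose();
        Console.WriteLine(first.TryAddReference() == null); // True: this reference is disposed
        var third = weak.TryAddReference();
        Console.WriteLine(third != null);                   // True: second keeps the target alive

        second.Dispose();
        third.Dispose();
        Console.WriteLine(resource.IsDisposed);             // True: last reference released
        Console.WriteLine(weak.TryAddReference() == null);  // True: target is disposed
    }
}
```

In this sketch the strong `TryAddReference` fails as soon as its own reference is disposed, while the weak variant keeps succeeding until the shared count reaches zero.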

sharwell commented 6 years ago

Example: Sharing memory mapped files

Roslyn uses memory mapped files as a temporary data store to move infrequently-used data outside of the process. We found that closing memory mapped files is not instant, and holding a few thousand small instances had a noticeable impact on application shutdown performance. To improve overall performance, we moved from using individual files for units of data to using larger files capable of holding many pieces of data.

During the transition, we moved from each unit of data referencing its own memory mapped file to many units of data referencing shared files. However, from the perspective of each individual data unit, it is beneficial to reason about the handle as still being owned by that unit alone.

Implementing this solution resulted in the following:

The shared service that manages memory mapped files holds a weak reference (ReferenceCountedDisposable<T>.WeakReference) to the most recently used memory mapped file which contains free storage space.
Data units each hold a ReferenceCountedDisposable<T> to the memory mapped file.
As long as one or more data units is still alive and using this file for storage, the service can use the weak reference to access the file (which is guaranteed to still be open) and allocate additional storage within it.
When the last data unit is disposed, the memory mapped file automatically and deterministically closes. However, each data unit continues to manage its own lifetime.

The change resulted in a reduction from several tens of thousands of memory mapped files to at most a few hundred. Local reasoning in the implementation was largely unchanged (a good thing), and the application as a whole used fewer resources (dotnet/roslyn#20439) and substantially improved shutdown performance (dotnet/roslyn#19493).

jnm2 commented 6 years ago

If someone passes me a ReferenceCountedDisposable<Foo>, how do I know when to access Target directly versus calling TryAddReference first? Do I do the latter only if I'm letting it escape the stack?

sharwell commented 6 years ago

@jnm2 In our usage patterns, if you get an instance, you own that instance.

jnm2 commented 6 years ago

@sharwell Cool. Found:

TryAddReference usages

WeakReference TryAddReference usages

StephenCleary commented 2 years ago

> Only use the ReferenceCountedDisposable(T reference) constructor one time per target object.

If the implementation uses ConditionalWeakTable, this restriction could be removed. Not sure if that would be worth the overhead, though.
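A rough sketch of how that might work, written for illustration and not taken from any existing implementation: a static `ConditionalWeakTable` maps each target to its shared count, so constructing a second wrapper for the same target joins the existing count instead of starting a new one. The `SharedHandle<T>`, `Counter`, and `DemoResource` names are hypothetical, and the sketch assumes that wrapping an already-disposed target should throw.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical illustration of the ConditionalWeakTable idea.
public sealed class SharedHandle<T> : IDisposable where T : class, IDisposable
{
    // Value < 0 marks a target that has already been disposed.
    private sealed class Counter { public int Value; }

    // Associates a count with each target without keeping the target alive.
    private static readonly ConditionalWeakTable<T, Counter> s_counters =
        new ConditionalWeakTable<T, Counter>();

    private T _target;                 // null once this handle is disposed
    private readonly Counter _counter;

    public SharedHandle(T target)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        Counter counter = s_counters.GetValue(target, _ => new Counter());
        lock (counter)
        {
            // Assumption: a target whose count already reached zero cannot be wrapped again.
            if (counter.Value < 0) throw new ObjectDisposedException(nameof(target));
            counter.Value++;
        }
        _target = target;
        _counter = counter;
    }

    public T Target => _target ?? throw new ObjectDisposedException(nameof(Target));

    public void Dispose()
    {
        T toDispose = null;
        lock (_counter)
        {
            if (_target == null) return;   // repeated Dispose is a no-op
            if (--_counter.Value == 0)
            {
                _counter.Value = -1;       // remember that the target is gone
                toDispose = _target;
            }
            _target = null;
        }
        toDispose?.Dispose();
    }
}

public sealed class DemoResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class Program
{
    public static void Main()
    {
        var resource = new DemoResource();
        var a = new SharedHandle<DemoResource>(resource);
        var b = new SharedHandle<DemoResource>(resource); // second constructor call, same target

        a.Dispose();
        Console.WriteLine(resource.IsDisposed); // False: b still holds a reference
        b.Dispose();
        Console.WriteLine(resource.IsDisposed); // True
    }
}
```

The table lookup on every construction is the overhead in question; whether it is acceptable would depend on how often wrappers are created.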

sharwell commented 2 years ago

@StephenCleary that's an interesting point. In the time since this was filed we've seen a few other ways this pattern could be optimized as well. The main one would be allowing a type to own its own reference count field, which would remove the need to store that value in a separate location. Or, if the pattern is supported directly by the runtime, it could be allocated as a hidden field at the end of the object's location in memory (similar to how std::make_shared works).

StephenCleary commented 2 years ago

@sharwell I'm planning to add a ref counted disposable to Nito.Disposables shortly; I'd love to hear any ideas or lessons learned. This one includes a weak ref that I haven't thought of before. Am I right in assuming it's not actually a weak reference, just a reference without a count? I.e., the target may be disposed but can't be GCed as long as a weak ref exists?

sharwell commented 2 years ago

> Am I right in assuming it's not actually a weak reference, just a reference without a count? I.e., the target may be disposed but can't be GCed as long as a weak ref exists?

The weak reference allows a new reference-counted strong reference to be obtained up until the point the object is disposed. If you call TryAddReference on the weak reference and it returns a non-null reference, you know the target has not been disposed and will not be disposed prior to releasing the added reference.

I do sometimes wish there was a Lease() method that added a reference without allocating for use in using statements. This gives up some of the correctness assurances of creating new class instances, but for specific use in a using statement it doesn't matter. Note that it can't be a ref struct because use in asynchronous methods is a primary use case.
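To make the trade-off concrete, here is one way such a non-allocating lease might look. Everything in it is a hypothetical sketch: `CountedResource<T>`, `TryLease`, `Release`, `Lease`, and `DemoResource` are invented for illustration and are not part of the proposed API.

```csharp
using System;
using System.Threading;

// Hypothetical illustration; not part of the proposal.
public sealed class CountedResource<T> where T : class, IDisposable
{
    private readonly T _target;
    private int _count = 1; // starts with the creator's reference

    public CountedResource(T target)
        => _target = target ?? throw new ArgumentNullException(nameof(target));

    // Adds a reference without allocating: the lease is a struct, not a new wrapper object.
    public bool TryLease(out Lease lease)
    {
        while (true)
        {
            int current = Volatile.Read(ref _count);
            if (current == 0)
            {
                lease = default;   // target already disposed
                return false;
            }
            if (Interlocked.CompareExchange(ref _count, current + 1, current) == current)
            {
                lease = new Lease(this);
                return true;
            }
        }
    }

    // Drops one reference; the target is disposed when the count reaches zero.
    public void Release()
    {
        if (Interlocked.Decrement(ref _count) == 0)
            _target.Dispose();
    }

    // A plain struct rather than a ref struct, so it can be held across an await.
    public readonly struct Lease : IDisposable
    {
        private readonly CountedResource<T> _owner;
        internal Lease(CountedResource<T> owner) => _owner = owner;
        public T Target => _owner._target;

        // Caveat: copies of this struct share no "already disposed" flag, so
        // disposing two copies would release twice. A class instance can guard
        // against that; this is the correctness assurance being given up.
        public void Dispose() => _owner?.Release();
    }
}

public sealed class DemoResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class Program
{
    public static void Main()
    {
        var resource = new DemoResource();
        var shared = new CountedResource<DemoResource>(resource);

        if (shared.TryLease(out var lease))
        {
            using (lease)
            {
                shared.Release(); // the creator drops its reference while the lease is active
                Console.WriteLine(lease.Target.IsDisposed); // False
            }
        }

        Console.WriteLine(resource.IsDisposed);     // True
        Console.WriteLine(shared.TryLease(out _));  // False
    }
}
```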

rickbrew commented 2 years ago

I have a similar system in Paint.NET. It goes much farther than what is described here, and is primarily used for COM interop. It basically reifies each "ref" (AddRef() / QueryInterface()) as a proxy object, which can then be disposed or GC'd. I call it "reference tracking," and it combines the strengths of garbage collection and reference counting, while eliminating the issues with circular references (albeit, at the cost of additional objects, of course). It's also useful for pure managed classes, and I do have a class, SharedRef<T>, which implements what's described here as a wrapper for IDisposable objects.

What you're describing here is basically std::shared_ptr<T>, and it's very useful. I often need to have buffers that are shared between multiple areas of the app, and it's important that the underlying (native) memory is not freed while still in use. IDisposable does not permit this, as the first Dispose() call will free the memory, and there's no way to annotate an object as "don't dispose this, its ownership is shared" except by removing IDisposable entirely, which is undesirable.

My canonical example here for Paint.NET is when you have an image open, and you add a layer, draw something onto the layer, and then delete the layer. There's a background thread that is rendering a thumbnail for the Layers window, and while it does its best to cancel out early, there's still a generous window of time where it needs that buffer to be valid (after the Layer was deleted by the user and the underlying bitmap was Dispose()d). Reference counting / cooperative disposal is the solution here, and permits deterministic, eager freeing of the buffer when possible, without having to rely on the garbage collector.
