Open DaZombieKiller opened 1 year ago
Tagging subscribers to this area: @dotnet/area-system-memory See info in area-owners.md if you want to be subscribed.
Author: | DaZombieKiller |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Memory`, `untriaged` |
Milestone: | - |
Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.
Author: | DaZombieKiller |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Runtime`, `untriaged` |
Milestone: | - |
CC. @dotnet/gc since the proposal is for System.GC
"GC.RegisterObjectAssociatedMemory would allow the SafeBuffer to be associated with the memory it owns, so that a ref to that region of memory will keep it alive similarly to the behavior of a ref pointing to managed memory. It may also allow some of the safety issues associated with MemoryManager
-backed Memory instances to be solved gracefully."
That's something that I personally find extremely interesting about this proposal. We've discussed at length before (in the #allow-unsafe-blocks channel in the C# Discord) how Memory<T>
is fundamentally an unsafe-equivalent type which due to its unfortunate API design makes it so that it's deceivingly safe for the consumer, as the second you get a value backed by a MemoryManager<T>
with a finalizer, things are not great. So much so that the docs outright say not to have a finalizable memory manager, with the guidance being "better to leak memory than to AV". Which, like, maybe, but neither is great.
With an API like this, we could finally make the following pattern safe:
void ConsumeMemory(Memory<byte> memory)
{
Span<byte> span = memory.Span;
ConsumeSpan(span);
}
This is currently not safe in the case of a finalizable memory manager, because the GC might collect the memory and the memory manager after the user has retrieved a span, which would cause the native memory the span is working on to be swept from under its feet, and cause an AV. You could instead have a finalizable MemoryManager<T>
use RegisterObjectAssociatedMemory
so that the GC can keep it alive even if the code is only using a span that's pointing to that native buffer being returned from it. This would essentially allow us to potentially remove the "don't have a finalizable memory manager" restriction entirely.
Seems pretty good on paper? 😄
Disclaimer: I have no idea how feasible and/or complicated this would be to implement for the GC.
Related to the above, this could also be used by WinRT wrappers for types such as IBuffer
to provide safe APIs to get a Span<byte>
to their underlying storage which would also ensure the RCW is kept alive as long as they're used 👀
That is, you could have a safe IBuffer.AsSpan()
extension (eg. in CsWinRT) that after getting the IBufferByteAccess
interface called RegisterObjectAssociatedMemory
on the RCW before returning the span, making that much safer than today 🙂
GC.RegisterObjectAssociatedMemory would allow the SafeBuffer to be associated with the memory it owns, so that a ref to that region of memory will keep it alive similarly to the behavior of a ref pointing to managed memory
SafeBuffer
has ref-counted lifetime. What is going to prevent some other thread calling Release
on the SafeBuffer and releasing the memory while the current thread is operating on the buffer?
It seems that the proposed pattern would be safe only if the lifetime of the buffer is exclusively managed by finalizers (ie the only place that can release the buffer is the object finalizer, the buffer can never be released explicitly e.g. by Dispose pattern). Managing lifetimes by depending on finalizers exclusively is very high-overhead pattern. It is only acceptable for expensive entities, for example it is used to manage lifetimes of DynamicMethod.
For buffers specifically, allocating managed regular arrays and depending on GC collecting it is cheaper than allocating unmanaged memory and depending on finalizers to release it.
SafeBuffer has ref-counted lifetime. What is going to prevent some other thread calling Release on the SafeBuffer and releasing the memory while the current thread is operating on the buffer?
This is a good point, with no way to prematurely unregister the memory, you could end up in a scenario where you have a ref
to freed memory that is still keeping the object alive (which might now be unrelated, if that memory has been reclaimed by another allocation).
If RegisterObjectAssociatedMemory
returned some kind of weak GC handle-like type, that would mitigate the issue:
internal sealed unsafe class SafeNativeMemoryBuffer : SafeBuffer
{
+ private ObjectAssociatedMemoryHandle _memoryHandle;
public SafeNativeMemoryBuffer(int length)
: base(ownsHandle: true)
{
Initialize((uint)length);
SetHandle((nint)NativeMemory.Alloc((uint)length));
- GC.RegisterObjectAssociatedMemory(this, (void*)handle, (uint)length);
+ _memoryHandle = GC.RegisterObjectAssociatedMemory(this, (void*)handle, (uint)length);
}
public Span<byte> AsSpan()
{
var span = new Span<byte>((void*)handle, (int)(uint)ByteLength);
// Ensure we don't get collected before the span is created
GC.KeepAlive(this);
return span;
}
protected override bool ReleaseHandle()
{
+ _memoryHandle.Dispose();
NativeMemory.Free((void*)handle);
return true;
}
}
Wouldn't that add even more complexity to an API that would still leave your type to not completely be "safe" still? 🥲
Wouldn't that add even more complexity to an API that would still leave your type to not completely be "safe" still? 🥲
If the consumer is able to manually free the memory I'm not sure if it would be possible to make it completely safe (since you could always have a dangling reference), though it would still prevent premature collection (and therefore also prevent premature frees). It'd be an overall improvement in safety, though depending on the complexity of the feature's implementation it may not be worth it.
Right. Yeah I can see how that'd be a "net improvement in safety" here though not perfect. To me it seems like the API would be nice to have mostly for cases like Jan mentioned where you already have an object whose lifetime is 100% determined by the finalizer (eg. in my example, an IBuffer
RCW), though I guess that's also a bit of a niche scenario and probably wouldn't justify the cost given in those cases you can still manually just add GC.KeepAlive
.
The API would be "nice to have" but I get the feeling it'd be way too complicated and expensive to have than it's worth 😄
RegisterObjectAssociatedMemory returned some kind of weak GC handle-like type, that would mitigate the issue:
I do not see how this would help. What would __memoryHandle.Dispose
do when there is active Span on some other thread? It is not really possible for the thread that is calling __memoryHandle.Dispose
to efficiently synchronize with the thread that has active Span.
FWIW, the only known way to solve the unmanaged (non-GC collected) memory lifetime issues is teaching the language about lifetimes, ala Rust borrow checker.
_memoryHandle.Dispose
would only remove the association between the memory and the object, so it would only prevent the object from being kept alive after the memory has been freed. It wouldn't prevent the span from being invalidated (solving that would require something akin to lifetimes as you mentioned). This proposal is moreso about preventing a finalizer from prematurely freeing the memory rather than accounting for a manual Dispose
.
Background and motivation
When working with buffers today, it can be difficult to abstract away unmanaged memory due to the inherent complexity in managing its lifetime.
For example, consider the simplified API surface of the following type:
When
length
is greater thanArray.MaxLength
andIsReferenceOrContainsReferences<T>
isfalse
, the constructor will allocate some native memory using aSafeBuffer
. Otherwise, aT[]
will be allocated.There is currently no way for
ref T this[nuint index]
orAsSpan
to be implemented safely for theSafeBuffer
case, as theref
s will not keep the buffer alive. Without a manualGC.KeepAlive
on theLargeArray<T>
object, the memory can be freed while it is still in use.GC.RegisterObjectAssociatedMemory
would allow theSafeBuffer
to be associated with the memory it owns, so that aref
to that region of memory will keep it alive similarly to the behavior of aref
pointing to managed memory. It may also allow some of the safety issues associated withMemoryManager<T>
-backedMemory<T>
instances to be solved gracefully.API Proposal
API Usage
Alternative Designs
It may make more sense to expose the method on
RuntimeHelpers
rather thanGC
:Risks
Depending on the work required to implement this feature, it could represent a significant complication in the GC's reference tracking implementation that outweighs the benefits offered by it.