dotnet / runtimelab

This repo is for experimentation and exploring new ideas that may or may not make it into the main dotnet/runtime repo.
MIT License
1.36k stars 188 forks source link

Review UnsafeRawPointer projections in C# for CryptoKit dev templates #2529

Closed kotlarmilos closed 3 months ago

kotlarmilos commented 3 months ago

Description

This issue is intended to discuss projections of UnsafeRawPointers in C# and their marshalling, with a focus on evaluating correctness and scalability. The goal is to review and confirm the design of the projections and use them to create CryptoKit dev templates.

Projections of Swift unsafe raw pointers in C#:

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Runtime.InteropServices;

namespace Swift.Runtime {
    /// <summary>
    /// Represents a Swift value type.
    /// </summary>
    public unsafe interface ISwiftValueType {
        byte* SwiftData { get; set; }
    }

    /// <summary>
    /// Represents a Swift enum.
    /// </summary>
    public interface ISwiftEnum : ISwiftValueType {
    }

    /// <summary>
    /// Represents a Swift struct.
    /// </summary>
    public interface ISwiftStruct : ISwiftValueType {
    }

    // <summary>
    // Represents Swift Unsafe[Mutable]RawPointer in C#.
    // </summary>
    public unsafe struct Unsafe[Mutable]RawPointer : ISwiftStruct
    {
        public byte* SwiftData { get; set; }
        public Unsafe[Mutable]RawPointer(void* p)
        {
            SwiftData = (byte*)p;
        }
    }

    // <summary>
    // Represents Swift Unsafe[Mutable]RawBufferPointer in C#.
    // </summary>
    public unsafe struct Unsafe[Mutable]RawBufferPointer : ISwiftStruct
    {
        public byte* SwiftData { get; set; }
        public Unsafe[Mutable]RawBufferPointer(void* start, int count)
        {
            SwiftData = (byte*)_Initialize(start, count);
        }

        ~Unsafe[Mutable]RawBufferPointer()
        {
            // Release memory
         }

        // <summary>
        // Represents init(start: UnsafeRawPointer?, count: Int)
        // https://developer.apple.com/documentation/swift/unsaferawbufferpointer/init(start:count:)
        // </summary>
        [DllImport("libSwiftCore.dylib", EntryPoint = "$sSR5start5countSRyxGSPyxGSg_Sitcfc")]
        internal static extern void* _Initialize (void* start, int count);
    }
}

Generated C# bindings:

[DllImport("libHelloLibrary.dylib", EntryPoint = "$s12HelloLibrary41AppleCryptoNative_ChaCha20Poly1305Encrypt6keyPtr0J6Length05nonceK00mL009plaintextK00nL016ciphertextBuffer0opL003tagP00qpL003aadK00rL0s5Int32VSv_APSvAPSvAPSvAPSvAPSvAPtF")]
internal static extern Int32 PIfunc_AppleCryptoNative_ChaCha20Poly1305Encrypt(void* keyPtr, Int32 keyLength, void* noncePtr, Int32 nonceLength, void* plaintextPtr, Int32 plaintextLength, void* ciphertextBuffer, Int32 ciphertextBufferLength, void* tagBuffer, Int32 tagBufferLength, void* aadPtr, Int32 aadLength);

public static unsafe Int32 AppleCryptoNative_ChaCha20Poly1305Encrypt(
    UnsafeRawPointer keyPtr, Int32 keyLength,
    UnsafeMutableRawPointer noncePtr, Int32 nonceLength,
    UnsafeMutableRawPointer plaintextPtr, Int32 plaintextLength,
    UnsafeMutableRawPointer ciphertextPtr, Int32 ciphertexLength,
    UnsafeMutableRawPointer tagPtr, Int32 tagLength,
    UnsafeMutableRawPointer aadPtr, Int32 aadLength)
{
    int result;

    fixed (void* pKeyPtr = keyPtr.SwiftData)
    fixed (void* pNoncePtr = noncePtr.SwiftData)
    fixed (void* pPlaintextPtr = plaintextPtr.SwiftData)
    fixed (void* pCiphertextBuffer = ciphertextPtr.SwiftData)
    fixed (void* pTagBuffer = tagPtr.SwiftData)
    fixed (void* pAadPtr = aadPtr.SwiftData)
    {
        result = PIfunc_AppleCryptoNative_ChaCha20Poly1305Encrypt(
            pKeyPtr, keyLength,
            pNoncePtr, nonceLength,
            pPlaintextPtr, plaintextLength,
            pCiphertextBuffer, ciphertextBufferLength,
            pTagBuffer, tagBufferLength,
            pAadPtr, aadLength);
    }

    return result;
}

User's C# code:

var keyBuffer = new UnsafeMutableRawPointer(keyPtr);
var nonceBuffer = new UnsafeMutableRawPointer(noncePtr);
var plaintextBuffer = new UnsafeMutableRawPointer(plaintextPtr);
var ciphertextBuffer = new UnsafeMutableRawPointer(keyPtr);
var tagBuffer = new UnsafeMutableRawPointer(tagPtr);
var aadBuffer = new UnsafeMutableRawPointer(aadPtr);

int result = HelloLibrary.AppleCryptoNative_ChaCha20Poly1305Encrypt(
                    keyBuffer, key.Length,
                    nonceBuffer, nonce.Length,
                    plaintextBuffer, plaintext.Length,
                    ciphertextBuffer, ciphertext.Length,
                    tagBuffer, tag.Length,
                    aadBuffer, aad.Length);

/cc: @stephen-hawley @rolfbjarne @AaronRobinsonMSFT @jkoritzinsky @vitek-karas

jkotas commented 3 months ago

Is this supposed to wrap https://developer.apple.com/documentation/cryptokit/chachapoly ? I do not see the mapping between this cryptokit API and what you have above.

kotlarmilos commented 3 months ago

Yes, but in the next iterations. This should review and implement the existing bindings in https://github.com/dotnet/runtime/blob/main/src/native/libs/System.Security.Cryptography.Native.Apple/pal_swiftbindings.swift.

jkotas commented 3 months ago

I think UnsafeMutableRawPointer on Swift side should map to void* on the C# side. I do not see the point in trying to wrap it in a struct.

struct Unsafe[Mutable]RawBufferPointer
{
    ~Unsafe[Mutable]RawBufferPointer()
    {
        // Release memory
    }
}

This will fail to compile with CS0575: Only class types can contain destructors.

Also, according to https://developer.apple.com/documentation/swift/unsaferawbufferpointer, UnsafeRawBufferPointer does not own the memory that it references, so the C# binding should not be trying to free any memory.

rolfbjarne commented 3 months ago

I think UnsafeMutableRawPointer on Swift side should map to void* on the C# side. I do not see the point in trying to wrap it in a struct.

What if a Swift method has two overloads, one with UnsafeMutableRawPointer, and one with UnsafeRawPointer? Providing two differently named methods in .NET works, except for constructors. Which I guess could be solved with factory methods, but ugh...

More generally, this can happen if we map any two or more Swift types into the same .NET type.

We had this problem in our Objective-C bindings since we bound the Objective-C type id to IntPtr, and NSInteger to nint, which happens to be an alias for IntPtr - and there are constructors that are overloaded in id and NSInteger. In the end we solved this by mapping id to a new .NET struct NativeHandle.

        public byte* SwiftData { get; set; }
        public Unsafe[Mutable]RawPointer(void* p)
        {
            SwiftData = (byte*)p;
        }

why is the ctor taking a void*, but the property type is byte*? Seems like these should be the same type.

Maybe we should also add explicit operators to/from void* for Unsafe[Mutable]RawBufferPointer?

        // <summary>
        // Represents init(start: UnsafeRawPointer?, count: Int)
        // https://developer.apple.com/documentation/swift/unsaferawbufferpointer/init(start:count:)
        // </summary>
        [DllImport("libSwiftCore.dylib", EntryPoint = "$sSR5start5countSRyxGSPyxGSg_Sitcfc")]
        internal static extern void* _Initialize (void* start, int count);

The Swift Int type is 64-bit on 64-bit platforms, so it should be bound as nint count.

kotlarmilos commented 3 months ago

I think UnsafeMutableRawPointer on Swift side should map to void* on the C# side. I do not see the point in trying to wrap it in a struct.

What if a Swift method has two overloads, one with UnsafeMutableRawPointer, and one with UnsafeRawPointer? Providing two differently named methods in .NET works, except for constructors. Which I guess could be solved with factory methods, but ugh...

More generally, this can happen if we map any two or more Swift types into the same .NET type.

We had this problem in our Objective-C bindings since we bound the Objective-C type id to IntPtr, and NSInteger to nint, which happens to be an alias for IntPtr - and there are constructors that are overloaded in id and NSInteger. In the end we solved this by mapping id to a new .NET struct NativeHandle.

I agree that we should have a specific reason for mapping UnsafeMutableRawPointer into a struct. If two or more Swift types are mapped to the same .NET type, they should have the same semantics in .NET, and it shouldn't pose interop issues. The projection tooling needs to implement a strategy for handling such cases, such as different method naming for the same signatures.

This will fail to compile with CS0575: Only class types can contain destructors.

Also, according to https://developer.apple.com/documentation/swift/unsaferawbufferpointer, UnsafeRawBufferPointer does not own the memory that it references, so the C# binding should not be trying to free any memory.

Thanks, you are right! Here are details:

An UnsafeMutableRawBufferPointer instance is a view into memory and does not own the memory that it references.

The Unsafe[Mutable]RawBufferPointer structs require initialization from the Swift side. Here is the updated proposal:

// <summary>
// Represents Swift Unsafe[Mutable]RawBufferPointer in C#.
// </summary>
public unsafe struct Unsafe[Mutable]RawBufferPointer : ISwiftStruct
{
    public void* SwiftData { get; set; }
    public Unsafe[Mutable]RawBufferPointer(void* start, nint count)
    {
        SwiftData = init(start, count);
    }

    // Explicit conversion from Unsafe[Mutable]RawBufferPointer to void*
    public static explicit operator void*(Unsafe[Mutable]RawBufferPointer buffer)
    {
        return buffer.SwiftData;
    }

    // <summary>
    // Represents init(start: UnsafeRawPointer?, count: Int)
    // https://developer.apple.com/documentation/swift/unsaferawbufferpointer/init(start:count:)
    // </summary>
    [DllImport("libSwiftCore.dylib", EntryPoint = "$sSR5start5countSRyxGSPyxGSg_Sitcfc")]
    internal static extern void* init(void* start, nint count);
}
jkotas commented 3 months ago

What if a Swift method has two overloads, one with UnsafeMutableRawPointer, and one with UnsafeRawPointer?

Ok, we can have a thin struct wrapper with default conversion to/from void* to disambiguate overloads like this.

SwiftData = init(start, count);

Is init going to allocate some memory? How is that memory going to be freed? I think Unsafe[Mutable]RawBufferPointer either needs to be a class on a regular Swift projection plan that comes with all overheads (IDisposable/finalizer, etc.); or it needs to be a struct with two fields that is composed/decomposed in the projection layer.

kotlarmilos commented 3 months ago

Is init going to allocate some memory? How is that memory going to be freed?

The init function does not allocate any new memory itself; it creates a buffer and provides a view into the memory referenced by start param. In this scenario, memory allocation is handled by .NET, and GC will handle it. Another way to implement Unsafe[Mutable]RawBufferPointer is to utilize allocate and deallocate where ownership will be handled from the Swift side.

I think Unsafe[Mutable]RawBufferPointer either needs to be a class on a regular Swift projection plan that comes with all overheads (IDisposable/finalizer, etc.)

I agree. If it utilizes init function where .NET allocates memory, the class can implement IDisposable but there's no need for explicit disposal since the memory will be allocated in the outer scope.

jkotas commented 3 months ago

The init function does not allocate any new memory itself

I am not following. This function takes two pointer-sized pieces of information and creates one pointer-sized piece of information from those. How can it do that without allocating new memory?

kotlarmilos commented 3 months ago

I may be mistaken, but here is my understanding of the flow: .NET allocates memory and passes its pointer and count to the Swift Unsafe${Mutable}RawBufferPointer.init. The Unsafe${Mutable}RawBufferPointer is a struct with _position and _end properties, which the constructor initializes as:

_position = start
_end = start.map { $0 + _assumeNonNegative(count) }

https://github.com/apple/swift/blob/ce3f4881586298c32c6a302331cb4e6c2a1cc6cb/stdlib/public/core/UnsafeRawBufferPointer.swift.gyb#L97-L121

The constructor should return an instance of Unsafe${Mutable}RawBufferPointer with _position and _end pointing to the existing memory. Does this make sense? I will try to check it by looking at generated code.

BTW I added support for UnsafeRawPointer according to feedback and corresponding CryptoKit template in https://github.com/dotnet/runtimelab/pull/2525/commits/aa55c63534c8b396068c5ad01dddda6dc0d36504.

jkotas commented 3 months ago

The constructor should return an instance of Unsafe${Mutable}RawBufferPointer with _position and _end pointing to the existing memory. Does this make sense?

Yes, that makes sense.

kotlarmilos commented 3 months ago

Thank you for navigating the discussion. It appears that projecting buffer pointers might be simpler than initially prototyped. Consider the following Swift example:

public func testBuffer(buffer: UnsafeRawBufferPointer) {
    print("testBuffer.baseAddress: \(buffer.baseAddress)");
    print("testBuffer.count: \(buffer.count)");
}

And the corresponding LLVM-IR:

define protected swiftcc void @"$s6output10testBuffer6bufferySW_tF"(i64 %0, i64 %1) #0 !dbg !37 {
entry:
  %buffer.debug = alloca %TSW, align 8
  %buffer.debug._position = getelementptr inbounds %TSW, ptr %buffer.debug, i32 0, i32 0, !dbg !49
  store i64 %0, ptr %buffer.debug._position, align 8, !dbg !49
  %buffer.debug._end = getelementptr inbounds %TSW, ptr %buffer.debug, i32 0, i32 1, !dbg !49
  store i64 %1, ptr %buffer.debug._end, align 8, !dbg !49

The function accepts two i64 arguments: the first argument represents buffer.debug._position, and the second argument represents buffer.debug._end. This indicates that we can implement Unsafe${Mutable}RawBufferPointer as structs and depend on runtime struct lowering for their handling:

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Runtime.InteropServices;

namespace Swift.Runtime
{
    // <summary>
    // Represents Swift Unsafe[Mutable]RawBufferPointer in C#.
    // </summary>
    public unsafe struct UnsafeRawBufferPointer
    {
        public void* Position { get; set; }
        public void* End { get; set; }
        public UnsafeRawBufferPointer(void* start, nint count)
        {
            Position = start;
            End = (byte*)start + count;
        }

        // Explicit conversion from Unsafe[Mutable]RawBufferPointer to void*
        public static explicit operator void*(UnsafeRawBufferPointer buffer)
        {
            return buffer.Position;
        }
    }
}

I managed to confirm it through manual lowering:

unsafe {
    byte[] key = RandomNumberGenerator.GetBytes(32);
    fixed (void* pKey = key){
        var testBuffer = new UnsafeRawBufferPointer(pKey, key.Length);
        HelloLibrary.EncryptBuffer(testBuffer.Position, testBuffer.End);
    }
}

[DllImport("libHelloLibrary.dylib", EntryPoint = "$s12HelloLibrary13EncryptBuffer04testD05countySW_s5Int32VtF")]
internal static extern void PIfunc_EncryptBuffer(void* position, void* end);
public static void EncryptBuffer(void* position, void* end)
{
    PIfunc_EncryptBuffer(position, end);
}

// Swift output
// testBuffer.baseAddress: Optional(0x000000013800d500)
// testBuffer.count: 32
jkotas commented 3 months ago

Explicit conversion from Unsafe[Mutable]RawBufferPointer to void*

I do not think we want to have explicit conversion operator for this. People should just call the Pointer property to get the buffer pointer.

jkotas commented 3 months ago
    public void* Position { get; set; }
    public void* End { get; set; }

Are types like UnsafeRawBufferPointer mutable in Swift? I would expect them to be immutable that would map to readonly struct in C#.

jkotas commented 3 months ago

public void* Position { get; set; }

This should not be called Position. Swift calls it baseAddress. Should we match that name?

kotlarmilos commented 3 months ago

I recommend using the same private fields as those in the Swift, and surface them as BaseAddress and Count: https://github.com/apple/swift/blob/27443e03abe6aa919f6a697cd44665a339bd75d8/stdlib/public/core/UnsafeBufferPointer.swift.gyb#L120-L124. Please note that "typed" buffer pointers are defined with _Position and Count, while raw buffer pointers are defined with _Position and _End.

Based on the current design and usage expectations, I don't think it is necessary to implement iterator support for "typed" buffers (UnsafeBufferPointer<T>) at this stage. These types are not intended to be used as first-class citizens by users; they are primarily designed for operations at the surface layer, where data is likely to be prepared before.

Feel free to adjust the code snippet if needed.

// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Runtime.InteropServices;

namespace Swift.Runtime
{
    // <summary>
    // Represents Swift UnsafeRawBufferPointer in C#.
    // </summary>
    public unsafe readonly struct UnsafeRawBufferPointer
    {
        private readonly void* _Position;
        private readonly void* _End;
        public UnsafeRawBufferPointer(void* start, nint count)
        {
            _Position = start;
            _End = (byte*)start + count;
        }

        public void* BaseAddress => _Position;
        public nint Count => (nint)((byte*)_End - (byte*)_Position);
    }

    // <summary>
    // Represents Swift UnsafeMutableRawBufferPointer in C#.
    // </summary>
    public unsafe readonly struct UnsafeMutableRawBufferPointer
    {
        private readonly void* _Position;
        private readonly void* _End;
        public UnsafeMutableRawBufferPointer(void* start, nint count)
        {
            _Position = start;
            _End = (byte*)start + count;
        }

        public void* BaseAddress => _Position;
        public nint Count => (nint)((byte*)_End - (byte*)_Position);
    }

    // <summary>
    // Represents Swift UnsafeBufferPointer in C#.
    // </summary>
    public unsafe readonly struct UnsafeBufferPointer
    {
        private readonly void* _Position;
        public readonly nint Count;
        public UnsafeBufferPointer(void* start, nint count)
        {
            _Position = start;
            Count = count;
        }

        public void* BaseAddress => _Position;
    }

    // <summary>
    // Represents Swift UnsafeMutableBufferPointer in C#.
    // </summary>
    public unsafe readonly struct UnsafeMutableBufferPointer
    {
        private readonly void* _Position;
        public readonly nint Count;
        public UnsafeMutableBufferPointer(void* start, nint count)
        {
            _Position = start;
            Count = count;
        }

        public void* BaseAddress => _Position;
    }
}
jkoritzinsky commented 3 months ago

@kotlarmilos can we make all of the buffer types define Count as a property instead of some of them as fields? Other than that, this is looking good!

kotlarmilos commented 3 months ago

Thanks! Count shouldn't be a property of raw buffer pointers because it is not intended to be lowered by the runtime.

jkotas commented 3 months ago

public unsafe readonly struct UnsafeRawBufferPointer public unsafe struct UnsafeMutableRawBufferPointer

I think all these structs should be readonly struct. readonly applies only to the struct itself, not to the memory that it points to. The difference between UnsafeBufferPointer and UnsafeMutableBufferPointer is whether the memory that it points to is mutable, not whether the pointer itself is mutable.

UnsafeBufferPointer UnsafeMutableBufferPointer

These are generic over TElement in Swift. (https://developer.apple.com/documentation/swift/unsafebufferpointer/). How are we going to deal with the genericness of this type?

kotlarmilos commented 3 months ago

public unsafe readonly struct UnsafeRawBufferPointer public unsafe struct UnsafeMutableRawBufferPointer

I think all these structs should be readonly struct. readonly applies only to the struct itself, not to the memory that it points to. The difference between UnsafeBufferPointer and UnsafeMutableBufferPointer is whether the memory that it points to is mutable, not whether the pointer itself is mutable.

Makes sense, updated.

UnsafeBufferPointer UnsafeMutableBufferPointer

These are generic over TElement in Swift. (https://developer.apple.com/documentation/swift/unsafebufferpointer/). How are we going to deal with the genericness of this type?

What are the implications here? Do we need to implement them as generics? Maybe we can reduce the scope and consider adding <TElement> at some later stage (if needed). Given that these types don't own memory, we could delegate the responsibility of ensuring compatibility between .NET and Swift TElement types to the users.

jkotas commented 3 months ago

What are the implications here?

If you map multiple Swift types to a single .NET type, you will run into the potential problems with overloading that were mentioned earlier. Consider a UnsafeBufferPointer<Int8> and UnsafeBufferPointer<Int16> overloads of the same method.

kotlarmilos commented 3 months ago

Thanks, we can implement thin wrappers to handle overload cases.

kotlarmilos commented 3 months ago

Support for UnsafeBufferPointer types with corresponding CryptoKit template has been added in https://github.com/dotnet/runtimelab/pull/2525/commits/9264f83de4b0b76e00b8667022e0e5035dc749d6.

Surfacing https://developer.apple.com/documentation/cryptokit/chachapoly enum and https://developer.apple.com/documentation/cryptokit/symmetrickey struct will be discussed in a subsequent issue. Thank you all for valuable feedback!