[API Proposal]: `UnsafeAccessorTypeAttribute` for static or private type access

AaronRobinsonMSFT commented 1 year ago

Background and motivation

The UnsafeAccessorAttribute mechanism was designed to provide access to a non-visible static or instance member (that is, method or field). There are however limitations with this design when involving static types or accessing members on non-visible types. This attribute helps bridge that gap.

Consider the following two scenarios involving methods. Note that fields suffer from the same issue.

Scenario 1 - Private type

// Assembly A
private class C
{
    private static int Method(int a) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod(??? c, int a); // One cannot write type C due to visibility.

Scenario 2 - Static type

// Assembly A
public static class C
{
    private static int Method(int a) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod(??? c, int a); // A static class cannot be used as a parameter in a signature.

Scenario 3 - Private class parameters

// Assembly A
public class C
{
    private class D { }
    private static int Method(D d) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod(??? d); // Unable express D type as a parameter.

API Proposal

The following attribute would accept a fully qualified or partially qualified type name to use for member look-up. The string supplied via the typeName would be in a format that is accepted by Type.GetType(string typeName). This API will be specifcally limited by what that API is capable of. There may be additional restrictions for the initial implementation of this. For example, it only works on reference types for now.

namespace System.Runtime.CompilerServices;

[AttributeUsage(AttributeTargets.Parameter | AttributeTargets.ReturnValue, AllowMultiple = false, Inherited = false)]
public sealed class UnsafeAccessorTypeAttribute : Attribute
{
    /// <summary>
    /// Instantiates an <see cref="UnsafeAccessorTypeAttribute"/> providing access to a type supplied by <paramref name="typeName"/>.
    /// </summary>
    /// <param name="typeName">A fully qualified or partially qualified type name.</param>
    public UnsafeAccessorTypeAttribute(string typeName)
    {
        TypeName = typeName;
    }

    /// <summary>
    /// Fully qualified or partially qualified type name to target.
    /// </summary>
    public string TypeName { get; }
}

API Usage

Scenario 1 - Private type

// Assembly A
private class C
{
    private static int Method(int a) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod([UnsafeAccessorType("C, A, Version=1.0.0.0, Culture=neutral")] object c, int a);

Scenario 2 - Static type

// Assembly A
public static class C
{
    private static int Method(int a) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod([UnsafeAccessorType("C, A")] object? c, int a);

Scenario 3 - Private class parameters

// Assembly A
public class C
{
    private class D { }
    private static int Method(D d) { ... }
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method")]
static extern int CallMethod([UnsafeAccessorType("C, A")] object? c, [UnsafeAccessorType("C+D, A")] object d); // Use attribute to look up type

Alternative Designs

Expand the UnsafeAccessorAttribute attribute to have an optional type target field/property. Note This API option wouldn't address scenario (3).

namespace System.Runtime.CompilerServices;

[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
public sealed class UnsafeAccessorAttribute : Attribute
{
+    /// <summary>
+    /// Fully qualified or partially qualified type name to target.
+    /// </summary>
+    public string TargetTypeName { get; set; }
}

Risks

No response

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-runtime-compilerservices See info in area-owners.md if you want to be subscribed.

Issue Details

### Background and motivation The [`UnsafeAccessorAttribute`](https://github.com/dotnet/runtime/issues/81741) mechanism was designed to provide access to a non-visible static or instance member (that is, method or field). There are however limitations with this design when involving `static` types or accessing members on non-visible types. This attribute helps bridge that gap. Consider the following two scenarios involving methods. Note that fields suffer from the same issue. ## Scenario 1 - Private type ```csharp // Assembly A private class C { private static int Method(int a) { ... } } ``` ```csharp // Assembly B [UnsafeAccessor("Method")] static extern int CallMethod(??? c, int a); // One cannot write type C due to visibility. ``` ## Scenario 2 - Static type ```csharp // Assembly A public static class C { private static int Method(int a) { ... } } ``` ```csharp // Assembly B [UnsafeAccessor("Method")] static extern int CallMethod(??? c, int a); // A static class cannot be used as a parameter in a signature. ``` ### API Proposal The following attribute would accept a [fully qualified](https://learn.microsoft.com/dotnet/framework/reflection-and-codedom/specifying-fully-qualified-type-names) or partially qualified type name to use for member look-up. This attribute would only be referenced if the `UnsafeAccessorAttribute` kind type was `UnsafeAccessorKind.StaticField` or `UnsafeAccessorKind.StaticMethod`. For any other value of `UnsafeAccessorKind`, this attribute would be ignored. ```csharp namespace System.Runtime.CompilerServices; [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)] public sealed class UnsafeAccessorTypeAttribute : Attribute { ///

/// Instantiates an providing access to a type supplied by . ///

/// A fully qualified or partially qualified type name. public UnsafeAccessorTypeAttribute(string typeName) { TypeName = typeName; } ///

/// Fully qualified or partially qualified type name to target. ///

public string TypeName { get; init; } } ``` ### API Usage ## Scenario 1 - Private type ```csharp // Assembly A private class C { private static int Method(int a) { ... } } ``` ```csharp // Assembly B [UnsafeAccessor("Method")] [UnsafeAccessorType("C, A, Version=1.0.0.0, Culture=neutral")] // Look up type here as opposed to signature static extern int CallMethod(int a); ``` ## Scenario 2 - Static type ```csharp // Assembly A public static class C { private static int Method(int a) { ... } } ``` ```csharp // Assembly B [UnsafeAccessor("Method")] [UnsafeAccessorType("C, A")] // Look up type here as opposed to signature static extern int CallMethod(int a); ``` ### Alternative Designs Expand the `UnsafeAccessorAttribute` attribute to have an optional type target field/property. It could be possible to permit `UnsafeAccessorKind.Constructor` when the target type is a sub-class of a visible type. Is there a scenario where this is practical? ### Risks _No response_

Author:	AaronRobinsonMSFT
Assignees:	-
Labels:	`api-suggestion`, `area-System.Runtime.CompilerServices`, `untriaged`
Milestone:	-

stephentoub commented 1 year ago

What's the downside to:

Expand the UnsafeAccessorAttribute attribute to have an optional type target field/property.

? That seems cleaner and easier to reason about than needing a second attribute.

AaronRobinsonMSFT commented 1 year ago

/cc @fanyang-mono @lambdageek

jkotas commented 1 year ago

[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)] public sealed class UnsafeAccessorTypeAttribute : Attribute

The idea in https://github.com/dotnet/runtime/issues/81741#issuecomment-1421937923 was that this attribute is applied to parameter and return types. We need to be able to assign the private type for each parameter and return type to be able to call arbitrary methods with non-visible types in the signature.

internal class A
{
  private class B
  {
  }

  private class C
  {
  }

   // How can one call this method using `UnsafeAccessor`? 
   private void M(B a, C b)
   {
   }
}

AaronRobinsonMSFT commented 1 year ago

The idea in https://github.com/dotnet/runtime/issues/81741#issuecomment-1421937923 was that this attribute is applied to parameter and return types.

Ah. I was focusing too much on the static scenario. Expanding the target doesn't address all the issues. I'll clarify that in an example.

stephentoub commented 1 year ago

The idea in https://github.com/dotnet/runtime/issues/81741#issuecomment-1421937923 was that this attribute is applied to parameter and return types

What would the full signature of the unsafe accessor method for C.M look like in that case?

AaronRobinsonMSFT commented 1 year ago

The idea in #81741 (comment) was that this attribute is applied to parameter and return types

What would the full signature of the method look like on that case?

Something like the following.

/// Assembly NonVisibleTypes.dll
internal class A
{
  private class B
  {
  }

  private class C
  {
  }

   // How can one call this method using `UnsafeAccessor`? 
   private static void M(B a, C b)
   {
   }
}

// Consuming assembly
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "M")]
[UnsafeAccessorType("A, NonVisibleTypes")]
static extern void CallMethod(
    [UnsafeAccessorType("A+B, NonVisibleTypes")] object a,
    [UnsafeAccessorType("A+C, NonVisibleTypes")] object b);

jkotas commented 1 year ago

Nit: This is instance method so you need to have the this pointer in the signature.

[UnsafeAccessor(UnsafeAccessorKind.Method, Name = "M")]
static extern void CallMethod(
    [UnsafeAccessorType("A, NonVisibleTypes")] object @this,
    [UnsafeAccessorType("A+B, NonVisibleTypes")] object a,
    [UnsafeAccessorType("A+C, NonVisibleTypes")] object b);

(This is for my example that has instance method.)

AaronRobinsonMSFT commented 1 year ago

Nit: This is instance method so you need to have the this pointer in the signature.

[UnsafeAccessor("M")]
static extern void CallMethod(
    [UnsafeAccessorType("A, NonVisibleTypes")] object @this,
    [UnsafeAccessorType("A+B, NonVisibleTypes")] object a,
    [UnsafeAccessorType("A+C, NonVisibleTypes")] object b);

Yep. Let me update the examples. I've been a bit loose here.

jkotas commented 1 year ago

We will need to decide whether the implementation should do the casts from object to the actual type as regular throwing casts or as unsafe casts. There is a good argument that can be made for either option.

AaronRobinsonMSFT commented 1 year ago

regular throwing casts or as unsafe casts

If this is for highest performance scenarios, I think the "unsafe casts" is what we want. The obvious downside here is destabilizing the runtime and creating painful bugs to hunt down. I think the question is do we expect users of the UnsafeAccessor API to directly expose these private accessor APIs from their public API surface. If so, which would be sad, then we should go with the throwing casts. If we expect "unsafe" to mean that and let people get into trouble, then we can do the unsafe cast.

hamarb123 commented 1 year ago

Will you be able to call methods which take private structs / use the types in generics? If so what type would you write?

// Assembly A
public class C
{
    private struct D { }
    private static int Method1(D d) { ... }
    private static int Method2(ref D d) { ... }
    private static int Method3(delegate* managed<List<D>, D*, TypedReference*, void> d) { ... }
    private static int Method4(delegate* managed<in D, void> d) { ... }
    private static int Method4(delegate* managed<ref D, void> d) { ... } //to throw off resolution
}

// Assembly B
[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method1")]
static extern int CallMethod1(??? c, int a);

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method2")]
static extern int CallMethod2(??? c, int a);

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method3")]
static extern int CallMethod3(??? c, int a);

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method4")]
static extern int CallMethod4(??? c, int a);

My guess would be that all of the following would be allowed:

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method1")]
static extern int CallMethod1([UnsafeAccessorType("C+D, A")] ref byte c, int a); //allowed since D is a struct

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method2")]
static extern int CallMethod2([UnsafeAccessorType("C+D&, A")] ref byte c, int a); //allowed since parameter is a ref

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method1")]
static extern int CallMethod1_Alt([UnsafeAccessorType("C+D, A")] TypedReference c, int a); //allowed for any type of parameter which you can take a TypedReference to - unchecked type?

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method2")]
static extern int CallMethod2_Alt([UnsafeAccessorType("C+D&, A")] TypedReference c, int a); //allowed for any type of parameter which you can take a TypedReference to - unchecked type since it's a ref

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method3")]
static extern int CallMethod3([UnsafeAccessorType("What atrocity would go here?")] void* c, int a); //allowed since it takes a pointer

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name="Method4")]
static extern int CallMethod4([UnsafeAccessorType("Whatever goes here for delegate* managed<in D, void>)")] void* d, int a); //could we allow specifying modifiers with this? this way it could still be used in any cases of ambiguity, e.g. between fn pointer overloads but when the type has some type parameter of a private type.

stephentoub commented 1 year ago

regular throwing casts or as unsafe casts

If this is for highest performance scenarios, I think the "unsafe casts" is what we want.

If no validation is being performed, what benefit is the attribute providing? Presumably we're not attempting to provide any kind of overload resolution; without that or validation, what purpose does providing a type name serve?

AaronRobinsonMSFT commented 1 year ago

If no validation is being performed, what benefit is the attribute providing?

Member resolution pertaining to overloads would be one.

AaronRobinsonMSFT commented 1 year ago

Presumably we're not attempting to provide any kind of overload resolution;

We do perform a signature look up on the matching member. For example, in the following scenario we would select the proper M().

public class A
{
    private class B() {}
    private class C() {}
    private void M(B b) { }
    private void M(C b) { }
}

stephentoub commented 1 year ago

Presumably we're not attempting to provide any kind of overload resolution

We do perform a signature look up on the matching member.

Overload resolution is insanely complicated. What rules are we using? Are we matching C# overload resolution rules? Do we layer on top of that rules pertaining to things that don't exist in C#, like overloading on return type?

Or is this not really overload resolution and we're just requiring every type to match 100% with its counterpart, i.e. no base types, no derived types, etc.? If so, I understand now.

AaronRobinsonMSFT commented 1 year ago

Overload resolution is insanely complicated.

Or is this not really overload resolution and we're just requiring every type to match 100% with its counterpart, i.e. no base types, no derived types, etc.? If so, I understand now.

The latter. We aren't performing the complete .NET overload resolution logic at all. We focus on the specific type, don't walk the type hierarchy, and match on full .NET signature (minus custom modifiers). If we detect ambiguity we perform the look-up again but include custom modifiers. If we still detect an ambiguity, then we throw an AmbiguousMatchException.

Ambiguous test:

https://github.com/dotnet/runtime/blob/c0967dcfc1fc280b57f92031956331c60b860fc5/src/tests/baseservices/compilerservices/UnsafeAccessors/UnsafeAccessorsTests.cs#L510-L523

Sergio0694 commented 1 year ago

I'm not seeing a way to manage fields of internal types, would it be possible to also support:

Referenced assembly

public struct A
{
    private B _b;
}

internal struct B
{
    private C _c;
}

internal struct C;

Consuming assembly

// Get a field of a private type, from a public type instance
[UnsafeAccessor(UnsafeAccessorKind.Field, Name = "_b")]
[return: UnsafeAccessorType("B, AssemblyName")]
public static extern ref byte GetB(ref A a);

// Get a field of a private type, from a private type instance
[UnsafeAccessor(UnsafeAccessorKind.Field, Name = "_c")]
[return: UnsafeAccessorType("C, AssemblyName")]
public static extern ref byte GetC([UnsafeAccessorType("B, AssemblyName")] ref byte b);

Return type can always be ref byte and then people can reinterpret as needed (or pass to another unsafe accessor). Params of reference types can be object instead of ref byte.

lambdageek commented 1 year ago

This is already called out a bit in https://github.com/dotnet/runtime/issues/90081#issuecomment-1667005737, but I think it's worth asking explicitly. How will this attribute work with generics (once we support UnsafeAccessorAttribute on open generic types).


// Module1
internal class G<K>  {
   internal List<K> GetElements() {... };
};

// Module2
[UnsafeAccessor(UnsafeAccessorKind.Field, Name="GetElements")]
[UnsafeAccessorType("G`1, Module2")] // is this ok?
//[UnsafeAccessorType("G`1<!!0>, Module2")] // or do we need to spell out the instantiation with the !!0 ("T") mvar below?
[return: UnsafeAccessorType("System.Collections.Generic.List<!!0>, System.Runtime")] // how to specify the method return type?
object GetElementsAccessor<T>();

A fully qualified or partially qualified type name.

What will be the scope for type parameters? Or do we use the !!digit syntax from IL? (presumably the scope is the unsafe accessor method declaration, so ELEMENT_TYPE_MVAR ranging over the generic parameters of the unsafe accessor generic method)

(I think methods are simpler but for fields we might also need to be explicit about return types and the type of self (for generic structs) being byref?)

phizch commented 1 year ago

How about using compiler preprocessing directives instead? A directive could tell the compiler to completely turn off visibility checking for everything. The compiler would treat it as if all assemblies had used InternalsVisibleToAttribute, but with the addition that private and protected members are also exposed.

Some might argue that this is dangerous, but so is Reflection, UnsafeAccessorAttribute, and a lot of other things. I say "let's go all in" and make everything possible. Libraries can't depend on internals for security in any case, and if some private method is removed in the library, it's on the developer that accessed it anyway to work around it.

The biggest problem I can see is how to "expose" structs from other assemblies. Even wrapping them would be a problem for ref structs, I think.

I called the directive pragma internals in my example below. Maybe it should also be possible to only enable visibility for some things, so that the developer using the feature doesn't get totally swamped, but that can be done with the directive too. E.g. #pragma internals visible System.IO.Path or #pragma internals visible protected,internal.

Of course, if this is implemented, it would make ~~both UnsafeAccessorAttribute and~~ UnsafeAccessorTypeAttribute unnecessary.

Edit: UnsafeAccessorAttribute would still be useful, since it can be used to access compiler generated fields from primary constructors and properties.

// Module1
namespace Module1;

struct Matrix3x3
{
   public Vector3 R1, R2, R3;
  ...
}

internal ref struct A
{
  public ref int SomeValue;
  public readonly Span<Matrix3x3> Matrices;
}

internal class B
{
  private int _something;
  internal string Text { get; private set; }
}

//----

// Module2
#pragma internals enable

namespace Module2;

var a = new Module1.A();
var keyLength = System.IO.Path.KeyLength; // private const int KeyLength = 8;

string path = @"C:\Users\MyName\Documents";
var ix = System.IO.Path.GetDirectoryNameOffset( path ); // internal static int GetDirectoryNameOffset(ReadOnlySpan<char> path)

// keep the memory layout from Module1.Matrix3x3
struct Matrix3x3
{
   public Vector3 R1, R2, R3;
  ...
}

// this way of wrapping won't work.
ref struct ImpossibleWrappedA
{
  ref Module1.A _wrappedA; // Error CS9050: A ref field cannot refer to a ref struct.
}

// no idea how to do this properly. Or in this case why.. Silly example.
ref struct WrappedA
{
  public ref int SomeValue;
  public Span<Matrix3x3> Matrices
  internal WrappedA( Module1.A toWrap )
  {
    SomeValue = ref toWrap.SomeValue;
    Matrices = MemoryMarshal.Cast<Module1.Matrix3x3, Matrix3x3>( toWrap.Matrices );
  }
}

struct WrappedB
{
  Module1.B _value;
  // expose private field of wrapped object.

  public ref int Something => ref _value._something;
  public string Text
  {
    get => _value.Text;
    set => _value.Text = value; // set even though it's private in Module1.B
  }
}

// expose interal methods from System.IO.Path.
class PathEx
{
  public static int GetDirectoryNameOffset(ReadOnlySpan<char> path) 
      => System.IO.Path.GetDirectoryNameOffset( path );

}

#pragma internals disable

AndriySvyryd commented 8 months ago

I'd like to make sure that this proposal also covers types with unspeakable names. e.g. anonymous types and closure frame types.

Right now, we have to do things like this:

var scopedVariable = new { Value = 1 };
var value = Unwrap<int>(() => scopedVariable);

T Unwrap<T>(LambdaExpression expression)
{
    var member = (MemberExpression)expression.Body;
    var constant = (ConstantExpression)member.Expression!;
    var value = constant.Value!;
    var field = (FieldInfo)member.Member;
    var fieldValue = field.GetValue(value)!;
    var anonymousType = fieldValue.GetType();
    var property = anonymousType.GetProperty("Value")!;
    return (T)property.GetValue(fieldValue)!;
}

And my understanding is that with this proposal we'd be able to rewrite it to something like the following (generated by a source generator so the names are up-to-date):

int Unwrap(LambdaExpression expression)
{
    var member = (MemberExpression)expression.Body;
    var constant = (ConstantExpression)member.Expression!;
    var value = constant.Value!;
    var fieldValue = GetScopedVariable(value);
    return GetAnonymousValue(fieldValue);
}

[UnsafeAccessor(UnsafeAccessorKind.Field, Name = "scopedVariable")]
[return: UnsafeAccessorType("<>f__AnonymousType0`1[Int32]")]
extern static ref object GetScopedVariable([UnsafeAccessorType("Program+<>c__DisplayClass0_0")] object @this);

[UnsafeAccessor(UnsafeAccessorKind.Method, Name = "get_Value()")]
extern static int GetAnonymousValue([UnsafeAccessorType("<>f__AnonymousType0`1[Int32]")] object @this);

Or, to avoid referencing generic types, we could define an interface that matches the shape:

[return: UnsafeAccessorType(nameof(IAnonymousType))]
extern static ref object GetScopedVariable([UnsafeAccessorType("Program.<>c__DisplayClass0_0")] object @this);

[UnsafeAccessor(UnsafeAccessorKind.Method, Name = "get_Value()")]
extern static int GetAnonymousValue([UnsafeAccessorType(nameof(IAnonymousType))] object @this);

interface IAnonymousType
{
    int Value { get; }
}

MichalPetryka commented 8 months ago

static extern int CallMethod([UnsafeAccessorType("C+D, A")] object d); // Use attribute to look up type

What would be the equivalent for value types here btw? Or would the idea idea be that the runtime would always eliminate boxing for them? (which I guess would be problematic when passing by ref)

jkotas commented 8 months ago

What would be the equivalent for value types here btw

TypedReference was one of the options discussed in https://github.com/dotnet/runtime/issues/81741#issuecomment-1420577579 . It would require fixing the TypedReference limitations in C#.

roald-di commented 7 months ago

I'd like to make sure that this proposal also covers types with unspeakable names. e.g. anonymous types and closure frame types.

Not to sidetrack this conversation, but is there even a way to get the DisplayClass and other generated type names with roslyn? From what I have seen those names are generated during lowering and are not available through the api.

AndriySvyryd commented 7 months ago

Not to sidetrack this conversation, but is there even a way to get the DisplayClass and other generated type names with roslyn? From what I have seen those names are generated during lowering and are not available through the api.

Right. To avoid having to reference those names we would need to create an interface for every one of these types.

mikernet commented 6 months ago

We will need to decide whether the implementation should do the casts from object to the actual type as regular throwing casts or as unsafe casts. There is a good argument that can be made for either option.

Safe cast in debug builds, unsafe in release builds :)

AaronRobinsonMSFT commented 5 months ago

@AndriySvyryd Is this needed for .NET 9? This isn't high on my priority list and I was going to start this in .NET 10.

AndriySvyryd commented 5 months ago

@AaronRobinsonMSFT It's not high priority, we can use reflection as a workaround for now

AaronRobinsonMSFT commented 5 months ago

@AndriySvyryd Great. This will likely be done in vNext, unless we hear of a blocking issue.

hez2010 commented 1 month ago

This looks a bit limited to me, as you cannot passing a ref struct to an object parameter. How about allowing users to define a "proxy" type for use instead of putting the UnsafeAccessorTypeAttribute on the unsafe accessor directly?

Private type

// Assembly A
private class C
{
    private static int Method(int a) { ... }
}

Usage:

// Assembly B
[UnsafeAccessorType("C, A")]
class CProxy { }

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Method")]
extern static int M(CProxy target, int a);

var proxy = new CProxy();
M(proxy, 42);

Static type

// Assembly A
public static class C
{
    private static int Method(int a) { ... }
}

Usage:

// Assembly B
[UnsafeAccessorType("C, A")]
class CProxy { }

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Method")]
extern static int M(CProxy target, int a);

var proxy = new CProxy();
M(proxy, 42);

Private class parameters

// Assembly A
public class C
{
    private class D { }
    private static int Method(D d) { ... }
}

Usage:

// Assembly B
[UnsafeAccessorType("C+D, A")]
class DProxy { }

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Method")]
extern static int M(C target, DProxy d);

M(new C(), new DProxy());

This will allow us to define a struct or ref struct as "proxy" type to be used in the unsafe accessor.

AaronRobinsonMSFT commented 1 month ago

The proxy approach is interesting and something we can consider, but I'm inclined to not try to fix this in a suboptimal manner. Our goal here would be to eventually use TypedReference for cases like ByRefLike structs or structs in general. My plan is to implement this only supporting reference types initially. My reasoning for this is very simple. The other name for this feature is "fast reflection", boxing isn't fast and therefore permitting it is anathema to the goal of what we're trying to provide with this API. When TypedReference is more in a usable state, both value types and ByRefLike types would naturally fall out in an optimal manner.

hamarb123 commented 1 month ago

The proxy approach is interesting and something we can consider, but I'm inclined to not try to fix this in a suboptimal manner.

I quite like the proxy approach personally; I think it's the best idea I've seen so far, and solves the problem in full generality without unnecessary boxing or indirection. If it was implemented so that UnsafeAccessorType was like / similar to a type forward with renaming allowed & permission checks removed (if any) on the runtime level, this should just work. The idea would be that you can just write the same signature, but with your type definition that gets forwarded. You would require the type of type to match as well, e.g., you must write class for class, struct for struct, interface for interface, etc., + allowed to add interfaces that it implements / base type as long as they are true - otherwise the proxy type is invalid and an exception will be thrown at some point to indicate this (presumably when you call the unsafe accessor method).

Our goal here would be to eventually use TypedReference for cases like ByRefLike structs or structs in general. My plan is to implement this only supporting reference types initially. My reasoning for this is very simple. The other name for this feature is "fast reflection", boxing isn't fast and therefore permitting it is anathema to the goal of what we're trying to provide with this API. When TypedReference is more in a usable state, both value types and ByRefLike types would naturally fall out in an optimal manner.

This (using TypedReferences) feels more useful for general reflection (i.e., dynamically selecting the type & member, etc.) to me, which isn't what we're trying to solve here. Forcing TypedReference doesn't seem necessary or beneficial to solve the limitations of UnsafeAccessor to me.

AaronRobinsonMSFT commented 1 month ago

If it was implemented so that UnsafeAccessorType was like / similar to a type forward with renaming allowed & permission checks removed (if any) on the runtime level, this should just work.

This isn't true. It wouldn't work for return types that were struct or ref struct - the metadata would be a lie for all aspects of the scenario. For reference types, object is "safe" since there is no way to speak the type and we won't get into cases where the proxy would have methods that aren't on the actual type. Additionally, all methods and semantics of object would follow naturally.

For value types and ByRefLike types, the situation would be even more suspect if they contained state, as the definition would be a lie and the local allocation would create issues. The TypedReference approach allows us to be clever and handle cases where we may need stack space during the JIT of the calling method.

The proxy approach would really only work for "static" scenarios and only when it is the target of a dispatch call. However, this still creates issues with creating an unsafe definition of a type that isn't that type.

hamarb123 commented 1 month ago

For value types and ByRefLike types, the situation would be even more suspect if they contained state, as the definition would be a lie and the local allocation would create issues. The TypedReference approach allows us to be clever and handle cases where we may need stack space during the JIT of the calling method.

The idea is that the runtime treats it like it actually is the type mentioned in the attribute - similar to a type forward, so declaring it as a local would be fine if allowed.

The proxy approach would really only work for "static" scenarios and only when it is the target of a dispatch call.

Not sure what you mean by this. It should work much at least as generally & more easily than specifying the type in a string.

the metadata would be a lie for all aspects of the scenario

It doesn't seem like more of a lie than something like InlineArray or TypeForwardedTo to me. Or even ref assemblies, since they often lie about types of fields in structs.

jkotas commented 1 month ago

The idea is that the runtime treats it like it actually is the type mentioned in the attribute - similar to a type forward.

So for example, ReferenceEquals(typeof(DProxy), Type.GetType("C+D, A")) would be true?

This would be a new fundamental type system feature, orthogonal to the existing UnsafeAccessor. It would not be a thing on the side that does not interact much with the rest of the system like the current UnsafeAccessor. It would require changing all places that resolve token to type at runtime to be aware of it. We have done similar type forwarding in built-in runtime WinRT support and it was complicated. For example, the built-in WinRT ended up requiring compilers to always reference the forwarded types via typeref tokens to make the runtime support feasible.

hamarb123 commented 1 month ago

The idea is that the runtime treats it like it actually is the type mentioned in the attribute - similar to a type forward.

So for example, ReferenceEquals(typeof(DProxy), Type.GetType("C+D, A")) would be true?

This would be a new fundamental type system feature, orthogonal to the existing UnsafeAccessor. It would not be a thing on the side that does not interact much with the rest of the system like the current UnsafeAccessor. It would require changing all places that resolve token to type at runtime to be aware of it. We have done similar type forwarding in built-in runtime WinRT support and it was complicated. For example, the built-in WinRT ended up requiring compilers to always reference the forwarded types via typeref tokens to make the runtime support feasible.

Hmm ok, makes sense. Perhaps we could do it like this then: make [UnsafeAccessorType] (or whatever it's called) can only be used on a struct type with no fields & causes the type to be laid out identically to the type referenced by name in the attribute (including specifically guaranteeing: you can exchange by-references/pointers of them, and can interchange in function pointer signatures, etc.), and allow UnsafeAccessor to treat it as identical to the underlying type, similar to how we allow ignoring modreqs/modopts. I.e., basically, it'd be a transparent struct but allowing referring to the underlying type by name, instead of having it as a direct field, with special handling in UnsafeAccessor (since this is what the feature exists for). We may also need some special handling with TypedReference/unboxing too, but this could also be solved by having some more advanced TypedReference APIs (specifically, TypedReference TypedReference.CreateUnsafe(byte&, Type), byte& TypedReference.GetReferenceUnsafe(), and byte& Unsafe.UnsafeUnbox(object), or similar).

This should mostly side-step the issues with complicating the type system by having more mechanisms of type forwarding I'd think, since it just guarantees specific things about layout & guarantees about compatibility of some types e.g., in function pointer signatures (which would be unsafe to write/consume anyway). A lot of this work would be done if we did transparent type layout feature anyway.

Maybe this has some issues too, but I thought it was at least worth mentioning as an option :)

Edit: I suppose this would have some issues with generics actually.

AaronRobinsonMSFT commented 1 month ago

The proxy approach would really only work for "static" scenarios and only when it is the target of a dispatch call.

Not sure what you mean by this. It should work much at least as generally & more easily than specifying the type in a string.

I was clarifying that to be safe, it would only be possible to call a static method or dispatch from this proxy type. It wouldn't be safe to return anything because of the type issue that mentioned https://github.com/dotnet/runtime/issues/90081#issuecomment-2395273456 and the general safety issues it is violating for what can be done with the type.

the metadata would be a lie for all aspects of the scenario

It doesn't seem like more of a lie than something like InlineArray or TypeForwardedTo to me. Or even ref assemblies, since they often lie about types of fields in structs.

This is a false equivalent comparison. The InlineArray is a well-defined semantic that is handled by the Type Loader and all the type information is available in metadata and is simply expanded. TypeForwardedTo is simply pointing were to look, not defining a new name for a type.

Perhaps we could do it like this then: make [UnsafeAccessorType] (or whatever it's called) can only be used on a struct type with no fields & causes the type to be laid out identically to the type referenced by name in the attribute (including specifically guaranteeing: you can exchange by-references/pointers of them, and can interchange in function pointer signatures, etc.)

This would make the feature require additional language enforcement, which the current proposal has no such requirement or need to be used "safely". It would make using the feature in other .NET languages more unsafe than in C#.

Can you elaborate on the issue being observed here? The object and TypedReference approach ensure type safety and allow us to operate with minimal overhead. The proxy suggestion, at first glance, only has substantial added costs on implementation and introduces complexity outside of the runtime.

hamarb123 commented 1 month ago

@AaronRobinsonMSFT I've changed my mind after @jkotas 's comments & thinking about it further. I am now not convinced the proxy type approach will be any better overall (and probably is much more complex to implement), assuming we can encode any type into the string format somehow.

teo-tsirpanis commented 3 weeks ago

I've been thinking about this lately and came up with a design similar to @hez2010's proxy types. To address some of the feedback from that proposal, we don't have to add support for full type equivalence in the runtime and can keep the type proxying scoped to signature matching for [UnsafeAccessor] methods. We can do that by never instantiating values of proxy types and instead using the proxy type as the type parameter of a wrapper struct:

namespace System.Runtime.CompilerServices;

public ref struct UnsafeAccessorValue<T>
{
    public static implicit operator UnsafeAccessorValue(object? obj);
    public static implicit operator object? (UnsafeAccessorValue<T> value);
    // After we add language support for TypedReference:
    // public static implicit operator UnsafeAccessorValue(TypedReference typedref);
    // public static implicit operator TypedReference (UnsafeAccessorValue<T> value);
}

// For passing by-ref parameters we might need an UnsafeAccessorReference<T> type.
// We can also name them UnsafeAccessorBy(Value|Reference)

On signature matching, if UnsafeAccessorValue<T> appears in a parameter of an [UnsafeAccessor] method, the runtime will substitute it with T. And if T is marked with [UnsafeAccessorType], the runtime will substitute it with the type specified in that attribute. On code generation, parameters of type UnsafeAccessorValue<T> will be unwrapped to object or TypedReference once we have the language and API support.

Examples

Instance method on internal type

// Assembly A
internal class C
{
    private int Method(int a) { ... }
}

Usage

// Assembly B
[UnsafeAccessorType("C, A")]
class CProxy;

[UnsafeAccessor(UnsafeAccessorKind.Method, Name = "Method")]
extern static int M(UnsafeAccessorValue<CProxy> target, int a);

object obj = /* Obtain an object of type C somehow. */;
M(obj, 42);

Static type

// Assembly A
public static class C
{
    private int Method(int a) { ... }
}

Usage

// Assembly B
[UnsafeAccessorType("C, A")]
class CProxy;

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Method")]
// It's a static class, we won't instantiate it and don't have to wrap it in an UnsafeAccessorValue.
extern static int M(CProxy target, int a);

M(null, 42);

Private class parameters

// Assembly A
public class C
{
    private class D;
    private static int Method(D d) { ... }
}

Usage:

// Assembly B
[UnsafeAccessorType("C+D, A")]
class DProxy;

[UnsafeAccessor(UnsafeAccessorKind.StaticMethod, Name = "Method")]
extern static int M(C target, UnsafeAccessorValue<DProxy> d);

object obj = /* Obtain an object of type D somehow. */;
M(null, obj);

Creating internal class

// Assembly A
internal class C
{
    public C(int x) { ... }
}

Usage:

// Assembly B
[UnsafeAccessorType("C, A")]
class CProxy;

[UnsafeAccessor(UnsafeAccessorType.Constructor)]
extern static UnsafeAccessorValue<C> M(int x);

object obj = M(42);

This example can naturally extend to picking an overload by return type, which is not currently possible in C#.

I think returning an unspecified ref struct is impossible so the runtime should throw InvalidProgramException in that case.

AaronRobinsonMSFT commented 3 weeks ago

And if T is marked with [UnsafeAccessorType], the runtime will substitute it with the type specified in that attribute.

This would make for an increase of complexity actually and I'm not seeing the gain. Without the UnsafeAccessorTypeAttribute feature looking up the member signature is correct after the first argument is ignored. However, the approach being proposed with UnsafeAccessorValue<T> would mean that the signature would need to potentially edited in order to do any look-up. The provided examples are trivial because they all involve the first argument.

Consider the case where the private type(s) are an argument to the member in the middle. This would mean the signature on D:.ctor would need to be rewritten instead of the current proposal where it would be correct and class A and class B would be ELEMENT_TYPE_CLASS not ELEMENT_TYPE_VALUETYPE.

// Assembly A
internal class A;
internal class B;
internal struct C;

internal class D
{
    public D(A a, B b, C c) { ... }
}

AaronRobinsonMSFT commented 3 weeks ago

This would make for an increase of complexity actually and I'm not seeing the gain.

Actually, we're going to need to rewrite the signature anyways. Boo.

What is the actual gain/clarity provided by using UnsafeAccessorValue<T> as opposed to:

Reference types: [UnsafeAccessorType("A")] object a Value types: [UnsafeAccessorType("A")] TypedReference a

Note this is considered an advanced scenario and will generally be used as an internal implementation detail for source generated code. It is not designed/intended to be used in general scenarios.

bartonjs commented 2 weeks ago

Video

Removed AttributeTargets.Method from the AttributeUsage based on discussion.
Changed TypeName to get-only per Attribute design guidelines.

namespace System.Runtime.CompilerServices;

[AttributeUsage(AttributeTargets.Parameter | AttributeTargets.ReturnValue, AllowMultiple = false, Inherited = false)]
public sealed class UnsafeAccessorTypeAttribute : Attribute
{
    /// <summary>
    /// Instantiates an <see cref="UnsafeAccessorTypeAttribute"/> providing access to a type supplied by <paramref name="typeName"/>.
    /// </summary>
    /// <param name="typeName">A fully qualified or partially qualified type name.</param>
    /// <remarks>
    /// <paramref name="typeName"> is expected to follow the same rules as if it were being
    /// passed to <see name="Type.GetType(String)"/>.
    /// </remarks>
    public UnsafeAccessorTypeAttribute(string typeName)
    {
        TypeName = typeName;
    }

    /// <summary>
    /// Fully qualified or partially qualified type name to target.
    /// </summary>
    public string TypeName { get; }
}

dotnet / runtime