dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Proposal: Reflection patterns #8544

Closed p-e-timoshenko closed 7 years ago

p-e-timoshenko commented 8 years ago

Let's look at an example of implementing the “INotifyPropertyChanged” interface. A property can have a default value described by “System.ComponentModel.DefaultValueAttribute”. The value of the attribute can only be obtained through reflection.

[DefaultValue(DEFAULT_SOME_TEXT)]
public String SomeText {
  get { return this._someText; }
  set {
    if (String.IsNullOrEmpty(value)) {
      value = GetType()
        .GetProperty(nameof(SomeText))
        .GetCustomAttribute<DefaultValueAttribute>().Value as String;
    }

    if (String.Equals(this._someText, value,
        StringComparison.CurrentCultureIgnoreCase))
      return;

    this._someText = value;

    OnPropertyChanged();
  }}

The example above shows how useful the “nameof” operator is in reflection. However, it also shows the weakness of reflection: too many inefficient operations are needed to obtain the required attribute. The code scope is known at compile time, yet the attribute has to be obtained “dynamically” by processing string arguments and enumerating member lists. Reading attribute information is the most common use of reflection, so it should be simple and efficient. Reflection is also poorly compatible with code obfuscators.
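To make the cost concrete, here is a minimal sketch of how the lookup can be moved out of the setter with today's APIs by resolving the attribute once per type. The class name “Person” and the constant “DefaultSomeText” are hypothetical and exist only for illustration. This is a common workaround, not the mechanism proposed in this issue.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

// Hypothetical class used only to illustrate the workaround.
public class Person
{
    private const string DefaultSomeText = "n/a";

    // Resolved once, when the type is initialized, instead of on every setter call.
    private static readonly string CachedDefault =
        typeof(Person)
            .GetProperty(nameof(SomeText))
            ?.GetCustomAttribute<DefaultValueAttribute>()
            ?.Value as string;

    private string _someText;

    [DefaultValue(DefaultSomeText)]
    public string SomeText
    {
        get { return _someText; }
        // Falls back to the cached attribute value for null or empty input.
        set { _someText = string.IsNullOrEmpty(value) ? CachedDefault : value; }
    }
}

public static class Program
{
    public static void Main()
    {
        var p = new Person { SomeText = "" };
        Console.WriteLine(p.SomeText); // prints "n/a"
    }
}
```

The string-based lookup still happens, but only once, so the remaining cost is the boilerplate rather than the per-call reflection.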

After looking again at the “PropertyChangedEventArgs” class used by the “PropertyChanged” event, it becomes clear that “PropertyName” should perhaps have been named “Property” and had the type “System.Reflection.PropertyInfo”. This type has correctly implemented “Equals” and “ToString” methods and carries enough information to decide how it should be compared and converted to a string. However, using the classes in the “System.Reflection” namespace is discouraged for several reasons: the dynamic approach is considered slow and it breaks object-oriented principles, so it should be kept separate and used only in rare cases. In real projects, though, the situation with reflection in C# resembles that of the “#define” preprocessor directive in C++ (see “C Preprocessor tricks, tips, and idioms” or “What are some tricks I can use with macros?”): it is supposed to be used only where it cannot be avoided, yet in practice it is applied everywhere. Today there is a wide gap between native programming and metaprogramming (reflection, DLR) in C#. Perhaps that gap needs to be reduced. Let's imagine how it might be done.

Many of the arguments passed to reflection methods are strings. String conversions and dynamic binding take considerable time. However, code often needs to refer to, or reflect on, an existing code element in the current or a public scope. Taking this into account makes it possible to improve performance significantly. Let's consider the following code.

namespace ReflectionPatterns {
  public reflecting field pattern _field1; // Any field with name “_field1”
  public reflecting field pattern Int32 _field2;
  [RequireAttribute(typeof(SerializableAttribute))]
  public reflecting field pattern Int32 _field3;
  [TypeCast][AnyName][Static][MemberScope(Scope.Private | Scope.Internal)]
  public reflecting field pattern Object PrivateOrInternalFieldWithAnyName;
  [TypeCast][RegularExpression(@"^_field\d+$")]
  public reflecting field pattern Object FieldWithRegExprName;

  public reflecting property pattern Property1; // Any property with name “Property1”
  public reflecting property pattern Int32 Property2 {get;} // Getter is required
  public reflecting property pattern Int32 Property3 {set;} // Setter is required
  [TypeCast][AnyName][field:RequireAttribute(typeof(SerializableAttribute))]
  public reflecting property pattern Int32 PropertyWithAnyName {get; set;}
  [RegularExpression(@"^Property\d+$")]
  public reflecting property pattern PropertyWithRegExprName;

  public reflecting event pattern Event1; // Any event with name “Event1”
  public reflecting event pattern EventHandler Event2;
  [TypeCast][AnyName][MemberScope(Scope.Public | Scope.Protected)]
  public reflecting event pattern EventHandler PublicOrProtectedEventWithAnyName;
  [TypeCast][RegularExpression(@"^.*?Event(?:\d+)?$")]
  public reflecting event pattern EventWithRegExprName;

  public reflecting method pattern Method; // Any input and output arguments of
  // the method having name “Method”
  public reflecting method pattern Int32 Method1; // Any input arguments of
  // the method having name “Method1”
  [return:RequireAttribute(typeof(OutAttribute))]
  public reflecting method pattern Int32 Method2<T> where T:struct;
  // The pattern having dynamic substituted parts
  [DynamicType("MethodType")][DynamicName("MethodName")] 
  public reflecting method pattern Int32 MethodWithDynamicParts(
    [RegularExpression(@"^arg\d*$")] [DynamicName("ArgumentName")] T arg);

  [RegularExpression(@"^.*?Class(?:\d+)?$")]
  [BaseClassName(@"^Base.*?Class$")][InterfaceName(@"^I.*?Connection$")]
  [NestedPattern(Class1, Method1, Property1)][RequiredMembers(typeof(IInterface))]
  public reflecting class pattern ClassRegExprName : IInterface1, IInterface2;
  // The class pattern may be used to define attribute patterns
}

A reflecting pattern definition could be written and placed just like a delegate definition. The patterns describe intuitive rules for finding classes and their members statically and dynamically. The rules form a kind of query language, like “LINQ”: reflecting patterns and a set of extension methods add native capabilities for querying meta-information. The patterns let developers see at a glance what is being searched for. They also allow the compiler to check the syntax and to simplify search queries during compilation. In many cases the results of reflection can be produced at compile time. The information corresponding to a reflecting pattern is obtained with the “reflect<T>(target)” operator. The type parameter “T” is a reflection pattern, or a type that is an attribute or is nested in an assembly or another type. The argument “target” is a class instance, a type or a string.

var fieldInfo1 = reflect<_field1>(this); // Exact match
var fieldInfos = reflect<PrivateOrInternalFieldWithAnyName[]>(typeof(SomeType));

var fieldInfo01 = reflect<_field1>(); // Called for the current scope;
// finds an exact match
class SomeType {
  public FieldInfo InstanceGetFieldInfo() {
    var fieldInfo = reflect<_field1>();
    // It's equivalent to
    // FieldInfo fieldInfo = reflect<_field1>(GetType())
    return fieldInfo;
  }

  public static FieldInfo StaticGetFieldInfo() {
    var fieldInfo = reflect<_field1>();
    // It's equivalent to
    // FieldInfo fieldInfo = reflect<_field1>(typeof(SomeType))
    return fieldInfo;
  }
}

This approach can significantly simplify working with attributes.

[DefaultValue(DEFAULT_SOME_TEXT)]
public String SomeText {
  get { return this._someText; }
  set {
    if (String.IsNullOrEmpty(value)) {
      value = reflect<DefaultValueAttribute>()?.Value as String;
    }

    if (String.Equals(this._someText, value,
        StringComparison.CurrentCultureIgnoreCase))
      return;

    this._someText = value;

    OnPropertyChanged();
  }}

The compiler can optimize the code above as far as possible. Let's look at other examples.

// Find methods that match the pattern “MethodPattern”, 
// where the class placed in current assembly matches the pattern “ClassPattern”
MethodInfo[] mi = reflect<MethodPattern[], ClassPattern>();

// Find methods that match the pattern “MethodPattern”, 
// where the class placed in custom assembly matches the pattern “ClassPattern”
MethodInfo[] mi = reflect<MethodPattern[], ClassPattern>(
"MyAssembly, Version=1.0.0.0, Culture=neutral, PublicKeyToken=7779fa1be111eb0c");

// Perhaps reflection should be more strongly typed,
// which would increase productivity and reduce errors.
FieldInfo<FieldPattern> fieldInfo = reflect<FieldPattern>();
MethodInfo<MethodPattern> methodInfo = reflect<MethodPattern>();
PropertyInfo<PropertyPattern> propertyInfo = reflect<PropertyPattern>();

// Errors can be detected at compile time. No boxing/unboxing is needed.
OldValue = propertyInfo.GetValue(this); propertyInfo.SetValue(this, NewValue);
methodInfo.Invoke(this, arg0, arg1, ..., argN);

// Using dynamically substituted parts (see example above)
MethodInfo<MethodWithDynamicParts> methodInfo = reflect<MethodWithDynamicParts>(
  "Namespace.Class", new {MethodType = typeof(Int32), ArgumentName="argument0"});

The example of getting and setting property values and calling a method through reflection shows that a further feature is required: “typed references”.

A typed reference is defined by the “reference(target)” operator, which is similar to the “reflect” operator. It creates a link to a property, event, method or field. Its implementation should be lightweight and similar to references in C++. Its performance should be close to that of directly calling the code it refers to. References should behave like the members they refer to. Assignment and equality of references could be defined as additional operations. In that case, the performance of “INotifyPropertyChanged” (NPC) subscribers could improve significantly, because they would compare typed references instead of strings.

namespace ReflectedTypes {
  [AnyName]
  public reflecting property pattern PropertyPattern {get;}
  public reference PropertyRef : PropertyPattern; // It's like delegate definition
  // A reference to a field is accompanied by the warning
  // “It's better not to refer to a field”.

  class NPC : INotifyPropertyChanged {
    ...
    public Int32 Property {
      get { return _property; }
      set {
        if (_property == value) return;
        _property = value;
        PropertyChanged(reference<PropertyRef>(this));
         // also reference<PropertyRef>(), reflect<PropertyPattern>(this)
      }
    }
    private Int32 _property;
  }
  class Program {
    static void NPCSubscriber(Object sender, PropertyChangedEventArgs e) {
      PropertyRef propRef = e.Property as PropertyRef;
      if (propRef != null && propRef is reference(NPC.Property)) {
        Console.WriteLine("Type: {0}, name: {1}, value: {2}",
          propRef.Container.GetType(), propRef, propRef.Reference);
          // “Container” is instance or Type in case of reference to static member.
          // “Reference” is auto-generated property, event or method wrapper.
      }
    }

    static void Main(string[] args) {
      var obj = new NPC();
      obj.PropertyChanged += NPCSubscriber;
    }
}}

It may seem that references are unnecessary because delegates already exist. However, the heavyweight functionality of delegates isn't needed in many cases other than events. All features of delegates (see the source code of the “Delegate” and “MulticastDelegate” classes) could be written transparently with typed references in ordinary classes, without any hacks:

public abstract class Delegate : ICloneable, ISerializable {
  // _target is the object we will invoke on
  [System.Security.SecurityCritical]
  internal Object _target;

  // MethodBase, either cached after first request or assigned from a DynamicMethod
  // For open delegates to collectible types, this may be a LoaderAllocator object
  [System.Security.SecurityCritical]
  internal Object _methodBase;

  // _methodPtr is a pointer to the method we will invoke
  // It could be a small thunk if this is a static or UM call
  [System.Security.SecurityCritical]
  internal IntPtr _methodPtr;

  // In the case of a static method passed to a delegate, this field stores
  // whatever _methodPtr would have stored: and _methodPtr points to a
  // small thunk which removes the "this" pointer before going on
  // to _methodPtrAux.
  [System.Security.SecurityCritical]
  internal IntPtr _methodPtrAux;
...

Studying the code leads to the conclusion that good-quality code cannot be written without references. That is why the developers of C# implemented delegates with a small hack (see the “_methodPtr” and “_methodPtrAux” fields, which have internal accessibility). Without references, one has to use slow reflection. The situation is similar for the classes in the “System.Reflection” and “System.Dynamic” namespaces. So perhaps typed references should be added explicitly, without any hacks?

The “reflect” and “reference” operators would be the recommended way to use reflection. This may be a better approach than the existing one. In most cases “.NET” could optimize well, because it's possible to predict at compile time what kind of information is needed. The compiler could generate different code depending on the situation. It should also be noted that these operators would make reflection friendlier to developers.

alrz commented 8 years ago

Check out #5561 and #5292.

svick commented 8 years ago

I'm all for making metaprogramming better and more efficient, but I don't think this is the way to do it.

For simple cases, something like memberof() (#1653) is sufficient.
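As a rough illustration of the simple case, a member can already be named without a string by passing a lambda as an expression tree. The helper “Member.Property” and the class “Sample” below are hypothetical names, not an existing API, and the sketch only handles direct property access.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: a library stand-in for a memberof()-style operator.
public static class Member
{
    // Extracts the PropertyInfo named by a lambda such as x => x.SomeText.
    public static PropertyInfo Property<T, TValue>(Expression<Func<T, TValue>> accessor)
    {
        if (accessor.Body is MemberExpression m && m.Member is PropertyInfo p)
            return p;
        throw new ArgumentException(
            "Expected a property access such as x => x.Name.", nameof(accessor));
    }
}

// Hypothetical class used only for the demonstration.
public class Sample
{
    public string SomeText { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // The compiler checks that SomeText exists; no string is involved.
        PropertyInfo info = Member.Property((Sample s) => s.SomeText);
        Console.WriteLine(info.Name); // prints "SomeText"
    }
}
```

The member name is verified at compile time and survives renames, but the expression tree is still built at run time on every call, which is part of what a language-level operator would avoid.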

For complex cases, I don't understand why introducing all this syntax is necessary. If you want to separate defining a reflection pattern and using it, that should be doable with a library.
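A minimal sketch of what such a library might look like, assuming a hypothetical “FieldPattern” class that is defined once and applied to any type later. All names here are invented for illustration. The pattern mirrors the proposal's “FieldWithRegExprName” example, restricted to Int32 fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical library type: a reusable description of which fields to find.
public sealed class FieldPattern
{
    public Regex Name { get; set; }             // null means "any name"
    public Type FieldType { get; set; }         // null means "any type"
    public Type RequiredAttribute { get; set; } // null means "no attribute required"

    // Returns the fields declared on the target type that satisfy every rule.
    public IEnumerable<FieldInfo> Match(Type target)
    {
        const BindingFlags all =
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.DeclaredOnly;

        return target.GetFields(all)
            .Where(f => Name == null || Name.IsMatch(f.Name))
            .Where(f => FieldType == null || FieldType.IsAssignableFrom(f.FieldType))
            .Where(f => RequiredAttribute == null || f.IsDefined(RequiredAttribute, false));
    }
}

// Hypothetical target type.
public class SomeType
{
    private int _field1 = 1;
    private int _field2 = 2;
    private string _other = "x";
    public int Sum() { return _field1 + _field2 + _other.Length; }
}

public static class Patterns
{
    // Defined once, separately from the places where it is used.
    public static readonly FieldPattern NumberedIntField = new FieldPattern
    {
        Name = new Regex(@"^_field\d+$"),
        FieldType = typeof(int)
    };
}

public static class Program
{
    public static void Main()
    {
        // Lists _field1 and _field2; _other does not match the name rule.
        foreach (FieldInfo f in Patterns.NumberedIntField.Match(typeof(SomeType)))
            Console.WriteLine(f.Name);
    }
}
```

Unlike the proposed syntax, this is evaluated entirely at run time, so the compiler cannot check the pattern or precompute its result.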

And references to methods and properties pretty much already exist, they're called delegates.
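For example, a property accessor can be bound to a strongly typed open-instance delegate once and then called without “PropertyInfo.GetValue”. The sketch below uses the existing “Delegate.CreateDelegate” API. The helper class “Accessors” and the class “Npc” are hypothetical names, and the owner type is assumed to be a reference type with public accessors.

```csharp
using System;
using System.Reflection;

// Hypothetical class used only for the demonstration.
public class Npc
{
    public int Property { get; set; }
}

// Hypothetical helpers that turn a PropertyInfo into typed delegates.
public static class Accessors
{
    // Open-instance getter: the instance is passed as the first argument.
    public static Func<TOwner, TValue> Getter<TOwner, TValue>(PropertyInfo property)
        where TOwner : class
    {
        return (Func<TOwner, TValue>)Delegate.CreateDelegate(
            typeof(Func<TOwner, TValue>), property.GetGetMethod());
    }

    // Open-instance setter with the same calling convention.
    public static Action<TOwner, TValue> Setter<TOwner, TValue>(PropertyInfo property)
        where TOwner : class
    {
        return (Action<TOwner, TValue>)Delegate.CreateDelegate(
            typeof(Action<TOwner, TValue>), property.GetSetMethod());
    }
}

public static class Program
{
    public static void Main()
    {
        PropertyInfo prop = typeof(Npc).GetProperty(nameof(Npc.Property));
        Func<Npc, int> get = Accessors.Getter<Npc, int>(prop);
        Action<Npc, int> set = Accessors.Setter<Npc, int>(prop);

        var obj = new Npc();
        set(obj, 42);                 // no boxing of the Int32 argument
        Console.WriteLine(get(obj));  // prints 42
    }
}
```

The reflection cost is paid once when the delegates are created. After that each call is an ordinary delegate invocation with typed arguments.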

p-e-timoshenko commented 8 years ago

PropertyInfo<T> info = reflect<T>(MyProperty); Typed reflection avoids many errors and unnecessary boxing/unboxing operations, which can improve performance. It also makes it simple to describe very complex selection rules that can be optimized during compilation.

@svick Did you see the implementation of delegates? It definitely needs references.

HaloFour commented 8 years ago

I'm not sure how much improvement you'd see without some serious CLR/BCL improvements made in tandem. Classes like PropertyInfo<T> and the like would need to be added to the BCL. The ability to use these "patterns" as generic type arguments would need to be added to the CLR. At the end of the day it's just syntactic candy. Delegates, custom attribute resolution, reflection, would remain slow and clunky because it still has to be performed at runtime. Considering that a delegate invocation costs about as much as a virtual instance method call that's really not all that bad. I doubt that the other complicated use cases exist with enough frequency to justify the extensive syntax and runtime changes proposed here, and frankly I think that they'd be better solved through AOP anyway.

gafter commented 7 years ago

We are now taking language feature discussion in other repositories:

Features that are under active design or development, or which are "championed" by someone on the language design team, have already been moved either as issues or as checked-in design documents. For example, the proposal in this repo "Proposal: Partial interface implementation a.k.a. Traits" (issue 16139 and a few other issues that request the same thing) are now tracked by the language team at issue 52 in https://github.com/dotnet/csharplang/issues, and there is a draft spec at https://github.com/dotnet/csharplang/blob/master/proposals/default-interface-methods.md and further discussion at issue 288 in https://github.com/dotnet/csharplang/issues. Prototyping of the compiler portion of language features is still tracked here; see, for example, https://github.com/dotnet/roslyn/tree/features/DefaultInterfaceImplementation and issue 17952.

In order to facilitate that transition, we have started closing language design discussions from the roslyn repo with a note briefly explaining why. When we are aware of an existing discussion for the feature already in the new repo, we are adding a link to that. But we're not adding new issues to the new repos for existing discussions in this repo that the language design team does not currently envision taking on. Our intent is to eventually close the language design issues in the Roslyn repo and encourage discussion in one of the new repos instead.

Our intent is not to shut down discussion on language design - you can still continue discussion on the closed issues if you want - but rather we would like to encourage people to move discussion to where we are more likely to be paying attention (the new repo), or to abandon discussions that are no longer of interest to you.

If you happen to notice that one of the closed issues has a relevant issue in the new repo, and we have not added a link to the new issue, we would appreciate you providing a link from the old to the new discussion. That way people who are still interested in the discussion can start paying attention to the new issue.

Also, we'd welcome any ideas you might have on how we could better manage the transition. Comments and discussion about closing and/or moving issues should be directed to https://github.com/dotnet/roslyn/issues/18002. Comments and discussion about this issue can take place here or on an issue in the relevant repo.


I am not moving this particular issue because I don't have confidence that the LDM would likely consider doing this.