xamarin / xamarin-macios

.NET for iOS, Mac Catalyst, macOS, and tvOS provide open-source bindings of the Apple SDKs for use with .NET managed languages such as C#

RFC: Migrate bgen to use roslyn instead of the reflection API. #21308

Open mandel-macaque opened 3 days ago

mandel-macaque commented 3 days ago

RFC: Migrate bgen to use roslyn instead of the reflection API.

The following is a proposal to migrate the current bgen implementation, which is based on the reflection API, to a Roslyn code generator. A proof of concept can be found at https://github.com/mandel-macaque/macios-generator

Introduction

The Microsoft.iOS/macOS bindings rely on a custom tool called bgen. bgen generates code by reading a DLL that contains all the binding definitions, each described as an interface. Each interface represents a native class: bgen loads the assembly, reads the metadata that was used to tag the interfaces, and generates the final classes and any needed trampolines.

Although this approach was the best one available when Roslyn did not exist, it is limited by the reflection API and by the need for an intermediate generation step outside the normal compilation. The following problems come to mind:

For developers:

  1. Complex API.
  2. No intellisense. The compiler has no access to the generated class.
  3. No proper support for nullable annotations (?|!)
  4. Complex build steps.
  5. No Roslyn analyzer.

For the SDK result:

  1. Complex build system. Special generated classes are used to decide whether certain frameworks are present, etc.
  2. Impossible to fix issues with the API.
  3. No IDE integration.
  4. Slow feedback loop. Need to compile to see binding errors.

Timing

Until the release of .NET 9 and C# 13, the language did not support partial properties. That limited the possibility of using Roslyn for binding generation, since it would have heavily impacted the quality of the final API. With this language addition, a Roslyn code generator can cover all the binding cases.
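As a minimal sketch of the C# 13 feature in question (the Sample type and its Name property are illustrative, not taken from the SDK, and the project is assumed to build with LangVersion 13 or later): the hand-written half of a partial class declares the property without a body, and a second half, which a code generator could emit, supplies the accessors.

```csharp
using System;

// Hand-written half: only the shape of the property is declared.
public partial class Sample {
    public partial string Name { get; set; }
}

// Second half: in a binding this part would be emitted by the generator.
// A plain backing field stands in for the native call here.
public partial class Sample {
    string name = "unset";

    public partial string Name {
        get => name;
        set => name = value;
    }
}

public static class Program {
    public static void Main () {
        var s = new Sample ();
        Console.WriteLine (s.Name); // reads the backing field
        s.Name = "bound";
        Console.WriteLine (s.Name); // reads the value just assigned
    }
}
```

The caller only ever sees one property, so the public API looks the same whether the body was written by hand or generated.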

Benefits

A Roslyn code generator has access to both the syntax tree and the semantic model of the current compilation. Together they minimize the need for type annotations and add better support for nullability and for new language features. They also allow knowledge to be shared between the code generator and a Roslyn analyzer, which improves the feedback loop for developers and the quality of the SDK. Some of the direct benefits are:

  1. Code intellisense between a binding project and a MAUI project. Right now, if those two projects are present in a solution, the developer's IDE won't be able to use intellisense from the binding project.
  2. Smaller binding API, reducing the barrier of entry. (see API changes)
  3. Intellisense when writing new bindings.
  4. Faster feedback loop with an analyzer and code generator diagnostics.
  5. Better nullability support in the final SDK (see bgen blocking issues).
  6. Multi-language support. Currently bgen only supports Objective-C; moving to a new solution would allow targeting other languages, such as Swift.

Known issues with bgen

Due to the nature of the reflection API, there are several issues that the current generator cannot fix easily:

Build complexity

The build of the Microsoft.iOS SDK is very complex because it occurs in several phases. Most new developers struggle to understand in which of the available sections under frameworks.sources each file has to go. This is especially painful when we add structures that need to have the same size and layout in all steps of the build. Moving to Roslyn allows all files to be in the same compilation unit, and allows a csproj per platform to organize the build.

Having all code in the same build also lets us remove several classes that are used to identify which frameworks are present per platform. With all files in the same context, Roslyn's semantic model can be used to find a symbol and decide whether a class is generated or not.
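The symbol lookup described above can be sketched roughly as follows. This is only an illustration, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the SymbolLookup helper and the one-line source text are hypothetical stand-ins for a platform's binding files.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class SymbolLookup {
    // True when the compilation defines (or references) the named type.
    public static bool HasType (Compilation compilation, string metadataName)
        => compilation.GetTypeByMetadataName (metadataName) is not null;

    // Builds a tiny in-memory compilation that plays the role of one platform build.
    public static Compilation BuildSample () {
        var tree = CSharpSyntaxTree.ParseText (
            "namespace UIKit { public partial class UIView { } }");
        return CSharpCompilation.Create (
            "Sample",
            new [] { tree },
            new MetadataReference [] {
                MetadataReference.CreateFromFile (typeof (object).Assembly.Location),
            },
            new CSharpCompilationOptions (OutputKind.DynamicallyLinkedLibrary));
    }

    public static void Main () {
        var compilation = BuildSample ();
        Console.WriteLine (HasType (compilation, "UIKit.UIView"));  // True: declared above
        Console.WriteLine (HasType (compilation, "AppKit.NSView")); // False: not in this build
    }
}
```

Inside a generator the compilation would come from the generator context rather than being built by hand; the lookup itself stays the same.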

API changes

Moving away from bgen to Roslyn means that we have to change the API to support intellisense in customer projects. The API change is 1:1, meaning that the migration from one version to the other can be automated. Moving to Roslyn does not force our customers to move to Roslyn: bgen can be distributed to customers for as long as it is needed.

The following code snippets show the differences between the two APIs.

Class definitions

bgen requires all classes to be interfaces and uses the BaseType attribute to define the class hierarchy:

[Abstract]
[BaseType (typeof (NSObject))]
interface UIFeedbackGenerator {

   [iOS (17, 5), MacCatalyst (17, 5)]
   [Static]
   [Export ("feedbackGeneratorForView:")]
   UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   void Prepare ();
}

With a move to Roslyn, the bindings can use classes and normal class inheritance, which allows users to access the class in their current compilation unit.

[BindingType]
public abstract partial class UIFeedbackGenerator : NSObject {

   [Export ("feedbackGeneratorForView:")]
   public static partial UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   public virtual partial void Prepare ();
}

Protocol implementations in bgen are represented by class inheritance. That is complicated, and on some occasions it forces developers to create empty interfaces, as in the following example:

[Protocol]
interface UIInteraction {
   [Abstract]
   [Export ("view", ArgumentSemantic.Weak)]
   UIView View { get; }

   [Abstract]
   [Export ("willMoveToView:")]
   void WillMoveToView ([NullAllowed] UIView view);

   [Abstract]
   [Export ("didMoveToView:")]
   void DidMoveToView ([NullAllowed] UIView view);
}

[Abstract] // abstract class that should not be used directly
[BaseType (typeof (NSObject))]
interface UIFeedbackGenerator : UIInteraction {

   [Static]
   [Export ("feedbackGeneratorForView:")]
   UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   void Prepare ();
}

// NEEDED for intermediate compilation
interface IUIInteraction {}

interface UIView : UIAppearance, UIAppearanceContainer, UIAccessibility, UIDynamicItem, NSCoding, UIAccessibilityIdentification, UITraitEnvironment, UICoordinateSpace, UIFocusItem, UIFocusItemContainer
   , UITraitChangeObservable {

   [Export ("addInteraction:")]
   void AddInteraction (IUIInteraction interaction);

   [Export ("removeInteraction:")]
   void RemoveInteraction (IUIInteraction interaction);

   [Export ("interactions", ArgumentSemantic.Copy)]
   IUIInteraction [] Interactions { get; set; }
}

With Roslyn this becomes much simpler:

[BindingType, Protocol]
public interface IUIInteraction {
   [Export ("view", ArgumentSemantic.Weak)]
   public UIView View { get; }

   [Export ("willMoveToView:")]
   public void WillMoveToView ([NullAllowed] UIView view);

   [Export ("didMoveToView:")]
   public void DidMoveToView ([NullAllowed] UIView view);
}

[BindingType]
public abstract partial class UIFeedbackGenerator : NSObject, IUIInteraction {
   [Export ("feedbackGeneratorForView:")]
   public static partial UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   public virtual partial void Prepare ();

}

[BindingType]
public partial class UIView : NSObject, IUIAppearance, IUIAppearanceContainer, IUIAccessibility, IUIDynamicItem, INSCoding, IUIAccessibilityIdentification, IUITraitEnvironment, IUICoordinateSpace, IUIFocusItem, IUIFocusItemContainer
   , IUITraitChangeObservable {

   [Export ("addInteraction:")]
   public virtual partial void AddInteraction (IUIInteraction interaction);

   [Export ("removeInteraction:")]
   public virtual partial void RemoveInteraction (IUIInteraction interaction);

   [Export ("interactions", ArgumentSemantic.Copy)]
   public virtual partial IUIInteraction [] Interactions { get; set; }
}

Class declarations are also simplified. While bgen requires classes to be marked with [Abstract] to state that they are abstract, a Roslyn code generator does not have that limitation. The same applies to static classes:

[Abstract]
[BaseType (typeof (NSObject))]
interface UIFeedbackGenerator : UIInteraction {

   [Static]
   [Export ("feedbackGeneratorForView:")]
   UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   void Prepare ();
}

Which becomes:

[BindingType]
public abstract partial class UIFeedbackGenerator : NSObject, IUIInteraction {

   [Export ("feedbackGeneratorForView:")]
   public static partial UIFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Export ("prepare")]
   public virtual partial void Prepare ();
}

Because we will be working with classes, we can simply list the interfaces that are implemented and the code generator will write the needed interface methods.

Notes

Classes must ALWAYS be partial. A Roslyn analyzer can help avoid common mistakes, and Sharpie will usually be the tool used to write bindings.

Methods and Properties

The new code generation relies heavily on the fact that properties and methods can be partial. The changes for methods and properties are minimal, while constructors require some extra changes.

The bgen binding of methods and properties looks like the following:

[BaseType (typeof (UIFeedbackGenerator))]
interface UINotificationFeedbackGenerator {

   [Export ("notificationOccurred:")]
   void NotificationOccurred (UINotificationFeedbackType notificationType);

   [Export ("notificationOccurred:atLocation:")]
   void NotificationOccurred (UINotificationFeedbackType notificationType, CGPoint location);

   [New] // kind of overloading a static member, make it return 'instancetype'
   [Static]
   [Export ("feedbackGeneratorForView:")]
   UINotificationFeedbackGenerator GetFeedbackGenerator (UIView forView);

   [Wrap ("WeakDelegate")]
   [NullAllowed]
   IUISheetPresentationControllerDelegate Delegate { get; set; }

   [NullAllowed, Export ("delegate", ArgumentSemantic.Weak)]
   NSObject WeakDelegate { get; set; }
}

With Roslyn we can define the methods and properties as follows. Note that we added partial and virtual as needed. The Static and Abstract attributes are no longer needed. Nullability is directly supported, which removes the need for the NullAllowed attribute.

[BindingType]
public partial class UINotificationFeedbackGenerator : UIFeedbackGenerator {

   [Export ("notificationOccurred:")]
   public virtual partial void NotificationOccurred (UINotificationFeedbackType notificationType);

   [Export ("notificationOccurred:atLocation:")]
   public virtual partial void NotificationOccurred (UINotificationFeedbackType notificationType, CGPoint location);

   [Export ("feedbackGeneratorForView:")]
   public static new partial UINotificationFeedbackGenerator GetFeedbackGenerator (UIView forView);

   public virtual IUISheetPresentationControllerDelegate? Delegate {
      get => WeakDelegate as IUISheetPresentationControllerDelegate;
      set => WeakDelegate = value;
   }

   [Export ("delegate", ArgumentSemantic.Weak)]
   public virtual partial NSObject? WeakDelegate { get; set; }
}

The WeakDelegate wrapper deserves a special mention. Because we no longer need the WrapAttribute, the delegate property can be written in plain C#, which will be generated by Objective Sharpie.

Notes

All methods in the SDK are virtual. This will have to be respected in the Microsoft.iOS binding by adding the virtual keyword to the method definition. A Roslyn analyzer can help spot this common mistake.
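A rough sketch of what such an analyzer could look like follows. It assumes the Microsoft.CodeAnalysis.CSharp package; the class name, the diagnostic id BIND0001 and its messages are all hypothetical, and the attribute is matched by simple name only to keep the example short.

```csharp
using System.Collections.Immutable;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer (LanguageNames.CSharp)]
public sealed class ExportMustBeVirtualAnalyzer : DiagnosticAnalyzer {
    // Hypothetical rule; id, title and message are placeholders.
    static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor (
        id: "BIND0001",
        title: "Exported instance methods should be virtual",
        messageFormat: "Method '{0}' is exported but not declared virtual",
        category: "Binding",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create (Rule);

    public override void Initialize (AnalysisContext context) {
        context.ConfigureGeneratedCodeAnalysis (GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution ();
        context.RegisterSymbolAction (AnalyzeMethod, SymbolKind.Method);
    }

    static void AnalyzeMethod (SymbolAnalysisContext context) {
        var method = (IMethodSymbol) context.Symbol;

        // Only ordinary instance methods of non-sealed classes are of interest.
        if (method.MethodKind != MethodKind.Ordinary || method.IsStatic)
            return;
        if (method.ContainingType.TypeKind != TypeKind.Class || method.ContainingType.IsSealed)
            return;
        if (method.IsVirtual || method.IsOverride || method.IsAbstract)
            return;

        // Assumption: any attribute class named ExportAttribute marks a bound member.
        var exported = method.GetAttributes ()
            .Any (a => a.AttributeClass?.Name == "ExportAttribute");
        if (!exported)
            return;

        context.ReportDiagnostic (
            Diagnostic.Create (Rule, method.Locations [0], method.Name));
    }
}
```

Because the check runs on symbols rather than on generated output, the warning would show up in the IDE while the binding is being typed.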

Constructor

Constructors are a special case of methods. The biggest problem is that a constructor in C# 13 cannot be partial. Nevertheless, this is not a major problem, since we can declare the init methods as constructors and the Roslyn code generator can create a constructor that takes the same parameters. These init methods should be marked as private and can be marked to be inlined by the compiler.

In bgen constructors are defined as follows:

[BaseType (typeof (UIViewController))]
interface UICloudSharingController {

   [Export ("initWithNibName:bundle:")]
   NativeHandle Constructor ([NullAllowed] string nibName, [NullAllowed] NSBundle bundle);

   [Export ("initWithPreparationHandler:")]
   NativeHandle Constructor (UICloudSharingControllerPreparationHandler preparationHandler);

   [Export ("initWithShare:container:")]
   NativeHandle Constructor (CKShare share, CKContainer container);
}

which becomes:

[BindingType]
public partial class UICloudSharingController : UIViewController {

   [Constructor ("initWithNibName:bundle:", Visibility = ConstructorVisibility.Public)]
   protected NativeHandle InitWithNibName ([NullAllowed] string nibName, [NullAllowed] NSBundle bundle);

   [Constructor ("initWithPreparationHandler:", Visibility = ConstructorVisibility.Internal)]
   protected NativeHandle InitWithPreparationHandler (UICloudSharingControllerPreparationHandler preparationHandler);

   [Constructor ("initWithShare:container:", Visibility = ConstructorVisibility.Private)]
   protected NativeHandle InitWithShare (CKShare share, CKContainer container);
}

The above code can easily generate the following constructors for the class:

public partial class UICloudSharingController : UIViewController {

   [BindingImpl (BindingImplOptions.GeneratedCode | BindingImplOptions.Optimizable)]
   protected virtual NativeHandle InitWithNibName ([NullAllowed] string nibName, [NullAllowed] NSBundle bundle) {
      // normal selector code generated
   }

   public UICloudSharingController ([NullAllowed] string nibName, [NullAllowed] NSBundle bundle) {
      InitializeHandle (InitWithNibName (nibName, bundle), "initWithNibName:bundle:");
   }

   [BindingImpl (BindingImplOptions.GeneratedCode | BindingImplOptions.Optimizable)]
   protected virtual NativeHandle InitWithPreparationHandler (UICloudSharingControllerPreparationHandler preparationHandler) {
      // normal selector code generated
   }

   internal UICloudSharingController (UICloudSharingControllerPreparationHandler preparationHandler) {
      InitializeHandle (InitWithPreparationHandler (preparationHandler), "initWithPreparationHandler:");
   }

   [BindingImpl (BindingImplOptions.GeneratedCode | BindingImplOptions.Optimizable)]
   protected virtual NativeHandle InitWithShare (CKShare share, CKContainer container) {
      // normal selector code generated
   }

   private UICloudSharingController (CKShare share, CKContainer container) {
      InitializeHandle (InitWithShare (share, container), "initWithShare:container:");
   }
}

Notes

Adding more than one constructor with the same parameters will result in a compilation error. This issue is already present in bgen, and it has a simple workaround that a Roslyn analyzer can generate.

Categories

Categories are probably the easiest bindings to port (see Migration). They represent an extension class for a specific type. Currently in bgen, as with all other bindings, they are represented as an interface decorated with the Category attribute:

[Category]
[BaseType (typeof (NSAttributedString))]
interface NSAttributedString_NSAttributedStringKitAdditions {
   [MacCatalyst (13, 1)]
   [Export ("containsAttachmentsInRange:")]
   bool ContainsAttachments (NSRange range);
}

This is easily converted to a class:

[BindingType, Category]
public static partial class NSAttributedStringKitAdditions {
   [Export ("containsAttachmentsInRange:")]
   public static partial bool ContainsAttachments (this NSAttributedString self, NSRange range);
}

The generator can easily generate the partial method that uses the self pointer.
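To illustrate the shape of that split in plain C# (a sketch only: StringAdditions and ContainsDigit are made-up names, and ordinary string code stands in for the native call a real generator would emit against self), a partial extension method can be declared in one half of a static partial class and implemented in the other:

```csharp
using System;

// Hand-written half: the category is a static partial class whose members
// are partial extension methods without bodies.
public static partial class StringAdditions {
    public static partial bool ContainsDigit (this string self, int start, int length);
}

// Second half: a generator would emit this part. A real binding would
// message the native object behind 'self'; plain C# stands in here.
public static partial class StringAdditions {
    public static partial bool ContainsDigit (this string self, int start, int length) {
        for (var i = start; i < start + length && i < self.Length; i++) {
            if (char.IsDigit (self [i]))
                return true;
        }
        return false;
    }
}

public static class Program {
    public static void Main () {
        Console.WriteLine ("abc1".ContainsDigit (0, 4)); // True: the range covers '1'
        Console.WriteLine ("abc1".ContainsDigit (0, 3)); // False: the range stops before '1'
    }
}
```

Both declarations carry the this modifier, since the two halves of a partial method have to agree on whether it is an extension method.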

Protocols

Protocols represent interfaces that are implemented in the native world. bgen does not add an I in front of protocol names, which is very confusing to C# developers; this is because everything is an interface and the I is added later in the generated code. This is no longer needed with Roslyn, which can define the interface and then add the missing methods in the implementing class.

Old bgen binding:

[Protocol]
interface UIAccessibilityContainerDataTableCell {
   [Abstract]
   [Export ("accessibilityRowRange")]
   NSRange GetAccessibilityRowRange ();

   [Abstract]
   [Export ("accessibilityColumnRange")]
   NSRange GetAccessibilityColumnRange ();
}

New code:

[BindingType, Protocol]
public interface IUIAccessibilityContainerDataTableCell {
   [Export ("accessibilityRowRange")]
   public NSRange GetAccessibilityRowRange ();

   [Export ("accessibilityColumnRange")]
   public NSRange GetAccessibilityColumnRange ();
}

More importantly we can support optional methods in protocols with the use of default interface methods.

bgen code

[Protocol]
interface UIAccessibilityContainerDataTableCell {
   [Abstract]
   [Export ("accessibilityRowRange")]
   NSRange GetAccessibilityRowRange ();

   [Abstract]
   [Export ("accessibilityColumnRange")]
   NSRange GetAccessibilityColumnRange ();

   [Export ("optionalMethod")]
   bool OptionalProtocolMethod () => default;
}

Roslyn approach:

[BindingType, Protocol]
public partial interface IUIAccessibilityContainerDataTableCell {
   [Export ("accessibilityRowRange")]
   public NSRange GetAccessibilityRowRange ();

   [Export ("accessibilityColumnRange")]
   public NSRange GetAccessibilityColumnRange ();

   [Export ("optionalMethod")]
   bool OptionalProtocolMethod () => default;
}
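For readers less familiar with default interface methods, here is a small self-contained sketch of the language feature on its own. ITableCell, PlainCell and HeaderCell are invented for the example and have nothing to do with the real protocol; the point is only that an implementer can skip the optional member and still be called through the interface.

```csharp
using System;

public interface ITableCell {
    // Required member: every implementer must provide it.
    int GetRow ();

    // Optional member: implementers may skip it and inherit this default.
    bool IsHeader () => false;
}

// Implements only the required member.
public sealed class PlainCell : ITableCell {
    public int GetRow () => 3;
}

// Implements the required member and overrides the optional one.
public sealed class HeaderCell : ITableCell {
    public int GetRow () => 0;
    public bool IsHeader () => true;
}

public static class Program {
    public static void Main () {
        ITableCell plain = new PlainCell ();
        ITableCell header = new HeaderCell ();
        Console.WriteLine (plain.IsHeader ());  // False: falls back to the default body
        Console.WriteLine (header.IsHeader ()); // True: uses HeaderCell's implementation
    }
}
```

Note that the default body is only reachable through the interface type: PlainCell itself has no IsHeader member, so a variable typed as PlainCell cannot call it without a cast.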

Enums and Smart Enums

Enums per se should not be a problem for bgen, but they add some complications to the build:

  1. Some enums, but not all, have to be present in the core build, the first step before bgen executes.
  2. bgen copies the enums to the bindings and has to respect their annotations, such as [Native] and [Flags].

Moving to a Roslyn-based solution solves this problem: there are no longer two steps in the build process and enums are always present.

Smart enums are, along with categories, the simplest binding type to port, since they are extension classes. The only work to do is to annotate them with the BindingType attribute.

General

All classes, interfaces and enums that have to be generated will have to be annotated with the BindingType attribute. This annotation allows the Roslyn code generator to filter the nodes that need to be generated. bgen does not need it because it assumes it is interested in all types; the code generator, on the other hand, needs a way to filter.
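One way this filtering could be wired up is with Roslyn's incremental generator API, sketched below. This is not the proof-of-concept's code: the generator name is made up, the namespace in Bindings.BindingTypeAttribute is an assumption, and the emitted source is a placeholder comment rather than real binding code. It assumes a Microsoft.CodeAnalysis.CSharp version that provides ForAttributeWithMetadataName.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

[Generator]
public sealed class BindingTypeGenerator : IIncrementalGenerator {
    // Assumption: the fully qualified metadata name of the marker attribute.
    const string AttributeName = "Bindings.BindingTypeAttribute";

    public void Initialize (IncrementalGeneratorInitializationContext context) {
        // Let Roslyn pre-filter the compilation: only declarations carrying
        // the attribute ever reach the transform step.
        var bindingTypes = context.SyntaxProvider.ForAttributeWithMetadataName (
            AttributeName,
            // Classes, interfaces and enums all derive from BaseTypeDeclarationSyntax.
            predicate: static (node, _) => node is BaseTypeDeclarationSyntax,
            transform: static (ctx, _) => ctx.TargetSymbol.ToDisplayString ());

        context.RegisterSourceOutput (bindingTypes, static (spc, typeName) => {
            // Placeholder output; a real generator would emit the member bodies.
            var hint = typeName.Replace ('<', '_').Replace ('>', '_') + ".g.cs";
            spc.AddSource (hint, $"// generated members for {typeName} would go here\n");
        });
    }
}
```

Types without the attribute never enter the pipeline, which is the filtering role the annotation plays in the proposal.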

Multi language support

Because Roslyn allows us to separate the understanding of the syntax tree and the semantic model from the code generation, moving to this approach will allow different backends that write different types of trampolines, opening the door to Swift bindings once there is support from the runtime team.

Work items

Moving from bgen to any new implementation is a scary task, and we need to be able to do it in a way that survives setbacks: priorities might push the work back, people come and go, etc. The good thing about moving to Roslyn is that we do not have to port all bindings at once. Since Roslyn is part of the last step, we can add the code generator to the last compilation, after bgen has generated its code, and focus on smaller tasks. A tentative roadmap would be:

  1. Categories: They are extension classes that are not needed for the bgen generation.
  2. Smart enums: They are extension classes that are not needed for the bgen generation.
  3. Wrap Methods: We can move those methods to the partial classes. No need to do anything.
  4. Methods: Leave just constructors and protocol implementations in bgen; move all other methods.
  5. Models: Independent of most things.
  6. Protocols: Can be added before we move constructors.
  7. Constructors: Move all members of a class.

Migration

The migration of the SDK does not need to be done by hand. We can have 3 different tools that will share a lot of knowledge that will help with the migration of the SDK and that can later be a product for customers:

  1. Roslyn code generator: represents the bgen alternative.
  2. Roslyn analyzer: shares some logic with the generator and can help developers find common mistakes.
  3. Migration tool: Performs the 1:1 refactoring of the bindings. There is no impedance mismatch since we are going from a more restrictive model (reflection)

rolfbjarne commented 3 days ago

I like this idea, it can make the bindings a lot simpler, both for us, and particularly for customers. The current binding procedure/API is rather unintuitive.

One potential place where things would get uglier is availability attributes, and how we use them to exclude APIs on certain platforms. Roslyn isn't able to remove stuff from the compilation, so any of the No* attributes won't work, we'd have to use conditional compilation instead:

[BindingType]
public partial class CBManager : NSObject {
#if __TVOS__
    [SupportedOSPlatform ("tvos13.0")]
    [UnsupportedOSPlatform ("ios")]
    [UnsupportedOSPlatform ("maccatalyst")]
    [UnsupportedOSPlatform ("macos")]
    [Export ("authorization", ArgumentSemantic.Assign)]
    public static partial CBManagerAuthorization Authorization { get; }
#endif
}

Another is the BindAs attribute, we might have to flip it to be BindFrom:

[BindFrom (typeof (NSNumber))]
[Export ("roll", ArgumentSemantic.Strong)]
nfloat? Roll { get; }

I think a good plan forward would be:

  1. Come up with a good name for the roslyn-based generator (rgen? 🙈)
  2. Add a bare-bones roslyn-based generator, that does the very minimal possible to be even remotely useful (probably just support generating a single NSObject-derived class, no methods, no protocols, no nothing), just to get all the plumbing right.
    1. Add it to the build system, so that customer projects automatically get it (note that we don't have to limit ourselves to binding projects, we can enable this for all of our projects).
    2. Add lots of tests & documentation (both end user documentation and technical documentation).
  3. Look in the current api definitions for something that needs a single (simple at first) feature to be implemented, and port that manually to the roslyn-based generator. We can also look at missing APIs from xtro. Here it's important to add features one by one to the roslyn-generator, because otherwise PRs will snowball and likely stall (there are many things we can improve with the current bindings, so tackling them in tiny pieces makes sure we actually get them done).
    1. Binding new APIs might be somewhat slower than porting existing APIs, but it has the advantage that we improve our API coverage at the same time.
    2. On the other hand, binding existing APIs has the advantage that we'll be able to compare the resulting API diff, to make sure the generated API doesn't change.
  4. Once a decent number of features have been implemented, we could look into creating a tool to automatically convert bgen-based api definitions to roslyn-based ones.

Misc notes:

"No intellisense. The compiler has no access to the generated class."

This is fixable with bgen; the problem is that design-time builds are disabled.

Enabling design-time builds is easy, but we don't want to run design-time builds remotely from Windows, so then the problem becomes how to not run design-time builds remotely, which turned out to be a rather bigger problem, which I've been working on on and off for quite a while now (related to the next point).

  • Make bgen work from Windows #16611

bgen works fine from Windows already (we run the bgen tests on Windows in fact), the problem is that resources in binding projects don't (which I'm working on fixing). Switching to Roslyn won't make this easier, it's completely unrelated.

mandel-macaque commented 2 days ago

I see these problems in exactly the same way: flip BindAs to BindFrom, move to conditional compilation, and have a csproj per platform. That should make things a lot easier. We could also consider adding an extra step to the linker and marking methods to be fully removed via an attribute.