stride3d / stride

Stride (formerly Xenko), a free and open-source cross-platform C# game engine.
https://stride3d.net
MIT License
6.59k stars 952 forks source link

Roslyn analyzers and source generators (tracking) #1835

Open manio143 opened 1 year ago

manio143 commented 1 year ago

Creating an issue to support cross-PR discussion for #1778 and #1609

The purpose is to slowly replace AssemblyProcessor with Roslyn for the benefit of easier maintenance and improved usability by allowing debugging into the serializers. Furthermore analyzers would provide compile time warnings to users if they do something not supported by the serialization system providing more clarity over why things are not working.

manio143 commented 1 year ago

Some notes from looking at code

Member visibility

public - serializers can access it easily ✅ internal - serializers are generated in the same assembly as the class and can access it easily ⚠ private/protected - unavailable to binary serializer (pre .NET 8 UnsafeAccessor), but available to reflection-based YAML serializer

Read-only/write-only

Binary serialization doesn't support write only members - has to have a getter. Read only members must be of a reference type (and not a string). Readonly fields are supported, as well as properties without a setter.

Virtual members

Virtual properties are serialized given the class declaring them is serializable. Overrides can be serialized specifically only if annotated with [DataMember].

DataMember attribute

Mode

The default mode for all members can be also set on [DataContract].

Order

Used in Editor to decide order of properties. Possibly also in YAML serialization.

Name

The Name is used to override member name in YAML serialization and for the Editor. Has no effect on binary serialization.

Mask

Used in YAML serialization to filter out members not covered by the bit mask. One use case in repo in dead code.

Delegates

Delegates cannot be serialized.

DataMemberUpdatable Attribute

The UpdateEngine (dynamic object updater based on string paths for complex animations over components) somehow uses serialization metadata and if a member has [DataMemberIgnore] but also [DataMemberUpdatable] then it is binary serializable but not visible in YAML/Editor.

Fixed buffers

Currently AssemblyProcessor generated binary serializers do not support fixed buffer fields (msdn).

xen2 commented 1 year ago

I didn't know about UnsafeAccessor, that's quite handy!

xen2 commented 1 year ago

SiliconStudio.AssemblyProcessor-roslyn.zip I have attached an extremely old version (2018) of the assembly processor that was using T4 templates to generate serializer C# code in a .cs file, and used Roslyn to compile it.

This was before Source generators were a thing, so we called Roslyn API from the assembly processor and then merging it with Cecil. We ended up ditching it because it was too slow on every assembly, esp. since it was always a new AssemblyProcessor process which needed to initialize Roslyn for every assembly to process (nowadays it's not a problem anymore when using source generator, Roslyn is already preloaded as part of csc).

Maybe some part of the C# code generator might be useful.

Ethereal77 commented 1 year ago

I made as an experiment some time ago a Source Generator that did the same as the Stride's [AssemblyScan] attribute and processor to populate a list of types with specific traits for easy runtime querying,

I can look for it if it can be useful.

manio143 commented 1 year ago

Diagnostic naming convention

I would propose longer names which would note where a diagnostic is coming from

For analyzers it's good practice to have one DiagnosticAnalyzer class per diagnostic and name them accordingly, e.g. STRDIAG001InaccessibleMember.cs

Analyzers

We want to provide a uniform set of rules between all Stride serialization platforms.

The analyzers should warn users when being explicit with intent. For example a private member of a serializable class should not warn unless attributed with [DataMember]. However, if user intent is implicit the warning should be issued. For example a public dictionary member on a serializable class with a complex key should warn about the incorrect key.

Inaccessible Member

Given a [DataMember] when member is not accessible (private/protected) emit a warning.

[DataMember]
private string Priv { get; set; }

Member with no Getter

Given a [DataMember] when member has no getter emit a warning.

[DataMember]
public string Sink { set { } }

Read-only members must be a reference type

Given a [DataMember] when member is readonly or { get; } and is of value type or string emit a warning.

[DataMember]
public string Label { get; } = "example";

[DataMember]
public readonly Vector3 Position;

Invalid DataMember mode

Given a [DataMember] with explicit mode, if the mode is incompatible with member emit a warning.

[DataMember(DataMemberMode.Assign)]
public List<int> Tags { get; } = new();

[DataMember(DataMemberMode.Content)]
public readonly MyStruct Mine;

Delegates cannot be serialized

Given a [DataMember] on a delegate type emit a warning.

[DataMember]
public Action Callback;

Fixed buffers are currently not supported

Given a field on a serializable struct with fixed keyword emit a warning.

[DataContract]
public struct S
{
  public fixed char name[30];
}

Mismatching intent

Given both [DataMember] and [DataMemberIgnore] emit a warning, unless [DataMemberUpdatable] is also present.

[DataMember]
[DataMemberIgnore]
public IEnumerable<X> Collection;

Dictionary key cannot be a complex type

Given a serializable member of type IDictionary<TKey, TValue> if TKey is not a basic value (number, enum, string) emit a warning.

public Dictionary<Vector2, int> Weights { get; set; }
Ethereal77 commented 1 year ago

This is the "[ModuleInit] with order" experiment I talked about. InitializeAtStartup-SourceGen

Ethereal77 commented 1 year ago

I'm also looking for the [AssemblyScan] generator I talk about in a previous comment. I can't find it but I'll continue trying in my old projects folder.

Meanwhile, I've found a .txt where I wrote the process the generator goes by when analyzing the types and registering them in AssemblyRegistry.

1. Assembly.GetAllTypes()
   1.1. Ignore interfaces
   1.2. Ignore generic types

2. Foreach main type
   2.1. Scan attributes in interfaces
   2.2. Scan attributes in the type
   2.3. Look for base types. Goto 2.1

3. Foreach attribute in each type
   3.1. If type has AssemblyScan, register type
   3.2. Foreach attribute in the attribute
        3.2.1. If attribute has AssemblyScan, mark it
        3.2.2. Else, if attribute is not inherited
               3.2.2.1. If we are processing a base type (scanType) != main type, skip the attribute. End 3.2
        3.2.3. If attribute was marked, register the attribute for the main type (type)

4. Foreach main type to register
   4.1. If nested, and if neither Public nor Internal, error.
   4.2. Register the base type (scanType) for the main type (type)

5. If any ScanTypes found
   5.1. Create a Dictionary<Type, List<Type>>
   5.2. Foreach registered type
        5.2.1. Create a List<Type> for the scanTypes associated with the type
        5.2.2. Foreach scanType associated with type
               5.2.2.1. Add the scanType to the list
        5.2.3. Add the list to the dictionary for type
   5.3. Create a ScanType with the dictionary
   5.4. Register that ScanType with the AssemblyRegistry

I hope I can find the project. I'll continue searching...

Ethereal77 commented 1 year ago

I fear I haven't found the old project for [AssemblyScan] source generator. Don't know where I've placed it or why is not where my other experiments are.

However, I remember the hardest part was understanding what the current AssemblyProcessor does. Replicating it in a source generator was tedious, but easy. The notes I've recovered and written in the above comment summarizes that process pretty well.

As an aside, I had a note in one place where I wondered if the attribute name AssemblyScan should be changed to something more meaningful, as it doesn't convey that it annotates a type that can be queried for info later in the AssemblyRegistry.

manio143 commented 1 year ago

Thank you so much @Ethereal77 for the notes. I agree - most often figuring out what is actually being done is the hard part.

Ethereal77 commented 1 year ago

While I'm in the process of integrating the AssemblyProcessor in the main solution, I've noticed you have added the new Stride.Core.CompilerServices in the 10-CoreRuntime category. I think that category is meant for "core" libraries of the runtime. I see a set of analyzers, diagnosers, and source generators more as support tooling. I myself have added AssemblyProcessor to the 90-Tools category for this very reason.

I've thought about adding a new category after 00-Targets.Private, but before 10-CoreRuntime as an alternative. Something like 00-CoreTools. Or move the numbers up a bit so it is 10-CoreTools and the core libraries are 20-CoreRuntime.

What do you think?

manio143 commented 1 year ago

Good question. There's already a bit of a mix of things between how the VS solution is organized vs how the projects are laid out on disk. I'm fine with analyzers/generators being marked as build tools, same as assembly processor currently. They just need to live close to Stride.Core as they are logically linked with it.