Dreaming381 / Latios-Framework

A Unity DOTS framework for my personal projects
Other
947 stars 86 forks source link

Core: IAutoEnumerator Source Generator #42

Open Dreaming381 opened 4 months ago

Dreaming381 commented 4 months ago

IAutoEnumerator Source Generator

Iterator methods (methods that contain yield return expressions) are an expressive way to describe sequences of logic spread over a period of time, a complex data structure, or decision tree. However, such methods are not compatible with Burst. Let’s fix that!

Task is Prerequisite For

Base Requirements

Additions and modifications for the source generator should be made to this repository.

Given the user-written code:

// Struct can be named anything
partial struct UserDefinedEnumeratorStruct : IAutoEnumerator
{
    // Can be public, internal, or private/absent-access-specifier
    // SomeType can be any Burst-compatible type
    // The method name must be AutoEnumerate, and it can have 0
    // or more arguments of any combination of types
    public IEnumerator<SomeType> AutoEnumerate(SomeArgument someArgument)
    {
        // Implementation with yield returns...
    }
}

Generate the following using an incremental source generator:

partial struct UserDefinedEnumeratorStruct : IAutoEnumerator
{
    // Private member fields all prefixed with at least two underscores ...

    // Must be public, can be an expression-bodied method
    public SomeType Current
    {
        get
        {
            // Some implementation ...
        }
    }

    // Must be public
    public bool MoveNext()
    {
        // Some implementation ...
    }

    // Must have the same access specifier as AutoEnumerate
    // and must have the same arguments as AutoEnumerate
    public UserDefinedEnumeratorStruct GetEnumerator(SomeArgument someArgument)
    {
        var __result = this;
        __result.SetEnumerator(someArgument);
        return __result;
    }

    // Must have the same access speicifier as AutoEnumerate
    // and must have the same arguments as AutoEnumerate
    public void SetEnumerator(SomeArgument someArgument)
    {
        // ... Some implementation
    }
}

The generator must be an IIncrementalGenerator.

The generator should produce error messages using the same mechanism as the existing CollectionComponentGenerator and ManagedStructComponentGenerator use.

Introductory logic can exist either in SetEnumerator or MoveNext. User code between invocations of SetEnumerator and MoveNext which interact with any container arguments of SetEnumerator is considered undefined behavior.

Any Entity, BlobAssetReference<>, or UnsafeUntypedBlobAssetReference must be preserved as an explicit field such that Unity can serialize the expanded struct.

Unless the user code is marked unsafe, the generated code may not use any unsafe semantics. If unsafe code is needed for the implementation apart from user code, it can be implemented with utility methods inside the Latios Core module within Latios.InternalSourceGen.StaticAPI.

IAutoEnumerator interface should be defined in the Core module inside the GameplayToolkit directory.

Example

This is a basic example of what an implementation might produce. Note that there are potentially multiple correct implementations.

Input:

partial struct DivideByTwoInEachEnumerator : IAutoEnumerator
{
    public IEnumerator<float> AutoEnumerate(NativeArray<float> array)
    {
        foreach (var item in array)
        {
            yield return item / 2f;
        }
    }
}

Output:

partial struct DivideByTwoInEachEnumerator
{
    private NativeArray<float> __array;
    private NativeArray<float>.Enumerator __foreach_0_enumerator;
    private float __item;
    private int __nextJumpLabel;
    private float __current;

    public float Current => __current;

    public bool MoveNext()
    {
        switch (__nextJumpLabel)
        {
            case 0:
            {
                __foreach_0_enumerator = __array.GetEnumerator();
                __nextJumpLabel = 1;
                goto case 1;
            }
            case 1: // __foreach_0_begin
            {
                if (!__foreach_0_enumerator.MoveNext())
                {
                    __nextJumpLabel = 3;
                    goto case 3;
                }
                __item = __foreach_0_enumerator.Current;
                __current = __item / 2f;
                __nextJumpLabel = 2;
                return true;
            }
            case 2: // yield_0_resume
            {
                __nextJumpLabel = 1;
                goto case 1;
            }
            case 3: // __foreach_0_end
            {
                return false;
            }
            default:
                return false;
        }
    }

    public DivideByTwoInEachEnumerator GetEnumerator(NativeArray<float> array)
    {
        var __result = this;
        __result.SetEnumerator();
        return __result;
    }

    public void SetEnumerator(NativeArray<float> array)
    {
        __array = array;
    }
}

In this implementation, every local variable becomes a member variable. And every loop start, end, and yield creates a new case. An alternate implementation would be to copy local variables to members for the given scope, and then restore them so that the original lines of code could be copied as-is.

Optimizations

Optimization is not required, but encouraged if you are able to.

When optimizing, the size of the expanded enumerator struct is what matters above all else. It is desirable to make it as small as possible.

Not all local variables need to be converted to member fields. If the variable’s lifetime does not stretch across a yield statement, it can be kept local. In the example above, __item could be kept local.

There is no strict definition as to whether initial logic is in SetEnumerator or MoveNext. Consequently, in the example above, __array could also be eliminated by moving some of the logic into SetEnumerator.

In more complicated enumerations, it may be possible to recycle member fields.

Member field ordering can have an impact on the size of the structure, as fields have to be placed according to their alignment, which can sometimes result in padding being inserted between fields. For example, the sequence int, int, int, short, bool, bool, is 8 bytes smaller than the sequence int, bool, int, short, int, bool.

laicasaane commented 4 months ago

To sum up the discussion I had with CyrusNajmabadi on C# server on Discord, we have to reimplement this part of Roslyn and maybe some related parts too. I put the link here for further reference. https://github.com/dotnet/roslyn/tree/main/src/Compilers/CSharp/Portable/Lowering/IteratorRewriter

Dreaming381 commented 4 months ago

That is one way to go about it. But keep in mind that we do not need to support a few things that Roslyn's iterator rewriter supports, which may simplify things. First, we don't need to support recursion. In fact, we don't want to, because recursion requires an unknown amount of stored state, and we need to keep the state fixed-sized. Second, we don't need robust exception handling for things like Try/Finally and such, since we intend for this to run in Burst anyways.