Coroutines (and pooling) for v4, with improved tests

AdamsLair / duality

a 2D Game Development Framework

https://adamslair.github.io/duality

MIT License

1.41k stars 286 forks source link

Coroutines (and pooling) for v4, with improved tests #801

Closed SirePi closed 4 years ago

SirePi commented 4 years ago

This PR supersedes #755 as it is based on the most recent v4 master source and implements all reviews and user request of the previous pull request. In addition, pool-specific tests have been introduced and Scene.cs has a couple of missing "this" added to comply with the overall coding style.

Barsonax commented 4 years ago

Some more points:

A coroutine sample project would also be nice I believe, not just for the enduser but also for us to see how the API will look like and be used. Might inflate the current scope of the PR quite a bit though..
I see that CoroutineManager uses static state (private static QueuedPool<Coroutine> pool = new QueuedPool<Coroutine>();). Might be safer to switch to a instance based approach as that makes it more clear what the lifetime exactly is. This also makes it easier to test since you are not sharing data between tests anymore (especially when using a parallel test runner this is super annoying). I know duality uses static state in alot of places but I think we should try to avoid that when possible.

SirePi commented 4 years ago

Condition / yield instruction classes have a rather unspecific name (to improve usability?),

Ideas?

A coroutine sample project would also be nice I believe, not just for the enduser but also for us to see how the API will look like and be used

Can be, I can provide one externally from the PR just to give you an idea and, in a second moment, extend it to a full Sample

ilexp commented 4 years ago

Condition / yield instruction classes have a rather unspecific name (to improve usability?),

Ideas?

Not really, but we could move the entire coroutines feature to the Duality.Coroutines namespace, lifting a bit of naming pressure. It would still be just WaitFrames, but only when people plan to use a coroutine in that file - would make sense to me.

Alternatively, we could choose a really specific name like WaitFramesCoroutineCondition and provide a static(?) construction class to create them, like:

yield return CoroutineWait.ForFrames(15);
yield return CoroutineWait.ForSeconds(100.0f);
yield return CoroutineWait.Until(() => foo.Bar() == 17);

instead of

yield return new WaitFrames(15);
yield return new WaitTime(100.0f);
yield return new WaitCondition(() => foo.Bar() == 17);

Would not permit to add new conditions as first-class citizens though, since users cannot extend the static class.

Edit: Yeah, not sure I like this idea a lot the more I think about it. It's always nice when users can add their own stuff exactly like whatever was builtin, and a lot of design decisions in Duality point that way already. Nevermind then!

We could also just do the namespace thing, just throwing ideas at the wall here.

Pooling data structures should be reviewed and considered separately and very thoroughly, since this is potentially common infrastructure.

Just looked into the pooling classes here, and it seems the only thing that is pooled is Coroutine itself, with IPoolable being completely unused. I'm honestly not sure that the internal pooling of Coroutine is enough to introduce this whole pooling utility class stuff, which also increases the scope of this PR a lot due to the added design work.

Can we just remove the pooling classes entirely and use a simple List<Coroutine> as an internal pool instead?

A coroutine sample project would also be nice I believe, not just for the enduser but also for us to see how the API will look like and be used

Can be, I can provide one externally from the PR just to give you an idea and, in a second moment, extend it to a full Sample

I'd keep a sample project in a separate PR to not inflate scope too much here, and also since it's not an integral part of the feature. The existing unit tests can also provide a basic example on how a coroutine is used.

SirePi commented 4 years ago

Not really, but we could move the entire coroutines feature to the Duality.Coroutines namespace, lifting a bit of naming pressure. It would still be just WaitFrames, but only when people plan to use a coroutine in that file - would make sense to me.

well, they are already in a Duality.Utility.Coroutines namespace.. I don't see what would be different but if you prefer so, it's a quick fix 😄 Maybe instead of IWaitCondition it could be something like ICoroutineTask? would have an affinity with .net's Task system (Task.Delay => CoroutineTask.WaitFor, and so on?)

Alternatively, we could choose a really specific name like WaitFramesCoroutineCondition and provide a static(?) construction class to create them, like:
yield return CoroutineWait.ForFrames(15);
yield return CoroutineWait.ForSeconds(100.0f);
yield return CoroutineWait.Until(() => foo.Bar() == 17);
instead of
yield return new WaitFrames(15);
yield return new WaitTime(100.0f);
yield return new WaitCondition(() => foo.Bar() == 17);
Would not permit to add new conditions as first-class citizens though, since users cannot extend the static class.

And with the WaitCondition<T>.Reset method, you don't even need to instantiate a new condition each time you use the Coroutine.

Edit: Yeah, not sure I like this idea a lot the more I think about it. It's always nice when users can add their own stuff exactly like whatever was builtin, and a lot of design decisions in Duality point that way already. Nevermind then!

Just looked into the pooling classes here, and it seems the only thing that is pooled is Coroutine itself, with IPoolable being completely unused. I'm honestly not sure that the internal pooling of Coroutine is enough to introduce this whole pooling utility class stuff, which also increases the scope of this PR a lot due to the added design work.

Can we just remove the pooling classes entirely and use a simple List<Coroutine> as an internal pool instead?

I'd like to keep them together if possible; I know it goes a bit out of scope for this PR, but I think a pooling system could be useful as a general feature of the engine, and I would implement an internal one anyway, so why not just make it once properly, and available for general use? I don't see a rush to have all of this available for v4's first release, in case. It could very well be a 4.1.

I'd keep a sample project in a separate PR to not inflate scope too much here, and also since it's not an integral part of the feature. The existing unit tests can also provide a basic example on how a coroutine is used.

Hence the "external" part 😄 my idea is to provide a small demo project as a zip file, to try and test the usage of the various parts of the solution, separate from this PR. Only once everything is set, a proper Sample project would me made, eventually starting from the demo project.

ilexp commented 4 years ago

I'd keep a sample project in a separate PR to not inflate scope too much here, and also since it's not an integral part of the feature. The existing unit tests can also provide a basic example on how a coroutine is used.

Hence the "external" part 😄 my idea is to provide a small demo project as a zip file, to try and test the usage of the various parts of the solution, separate from this PR. Only once everything is set, a proper Sample project would me made, eventually starting from the demo project.

Understood, and I actually meant to just agree with the separated approach here 👍

well, they are already in a Duality.Utility.Coroutines namespace.. I don't see what would be different but if you prefer so, it's a quick fix 😄

It says namespace Duality in all the coroutine files? 😄

I'd like to keep them together if possible; I know it goes a bit out of scope for this PR, but I think a pooling system could be useful as a general feature of the engine, and I would implement an internal one anyway, so why not just make it once properly, and available for general use? I don't see a rush to have all of this available for v4's first release, in case. It could very well be a 4.1.

We can keep them together if you prefer, it's just that the margin for me approving general-purpose pooling utility is a lot thinner.

I'm a bit wary of pooling as a generalized technique (e.g. "just pool all the stuff, never worry about GC / memory again") that we might accidentally promote with generalized utility classes here, and it's also a very fundamental framework thing where I think we should be extra strict in design, implementation and review. Coroutines alone would probably a lot faster in a state where I'd give a thumbs up.

And with the WaitCondition.Reset method, you don't even need to instantiate a new condition each time you use the Coroutine.

Yeah, I was thinking something similar, that part would be neat. Then again, locking out or discouraging third-party conditions because they can't extend static API is a big downside and it would also mean introducing overhead for managing pooled instances.

There might be another way to avoid allocating new condition instances all the time: Make them (discriminated union style?) structs, e.g.

[StructLayout(LayoutKind.Explicit, Size=16)]
public struct WaitCondition
{
    [FieldOffset(0)] private WaitConditionType type;
    [FieldOffset(4)] private int intParam;
    [FieldOffset(4)] private float floatParam;
    [FieldOffset(4)] private Func<bool> conditionParam;

    // proper public API, constructors, etc.
}

// Coroutines would then return an IEnumerable<WaitCondition>

That way, we would still cover all currently implemented cases, keep it allocation-free for "wait for time / frames" conditions and also remain extensible by letting users specify an arbitrary func to evaluate.

SirePi commented 4 years ago

It says namespace Duality in all the coroutine files? 😄

DOH! Yeah, I followed the style of other Utility files, totally forgot about the actual namespace

I'm a bit wary of pooling as a generalized technique (e.g. "just pool all the stuff, never worry about GC / memory again") that we might accidentally promote with generalized utility classes here, and it's also a very fundamental framework thing where I think we should be extra strict in design, implementation and review. Coroutines alone would probably a lot faster in a state where I'd give a thumbs up.

IMHO it doesn't promote anything more than, let's say, ThreadPool does for managing your own Threads. It's a feature that is there, if you want to use it, go ahead (being mindful of what it means and what it does), if not, no one is pushing it; if you think it's overkill though, I can just remove it or change it as an internal class instead of a public one. OFC it must be tested, thought out if it's not already, etc.

Yeah, I was thinking something similar, that part would be neat. Then again, locking out or discouraging third-party conditions because they can't extend static API is a big downside and it would also mean introducing overhead for managing pooled instances.

There might be another way to avoid allocating new condition instances all the time: Make them (discriminated union style?) structs, e.g.
[StructLayout(LayoutKind.Explicit, Size=16)]
public struct WaitCondition
{
    [FieldOffset(0)] private WaitConditionType type;
    [FieldOffset(4)] private int intParam;
    [FieldOffset(4)] private float floatParam;
    [FieldOffset(4)] private Func<bool> conditionParam;

    // proper public API, constructors, etc.
}

// Coroutines would then return an IEnumerable<WaitCondition>
That way, we would still cover all cases, keep it allocation-free for "wait for time / frames" conditions and also remain extensible by letting users specify an arbitrary func to evaluate.

Ok, I got lost here 🤔 as things are now, everything is already extensible, either directly from IWaitCondition, WaitCondition, or WaitCondition<T>; Plus, define "all cases".. this only covers ints and floats parameters, while now you could have any kind of object as T, and don't forget that yield return null is a valid construct, which would not work anymore with structs, unless we have a static ONE_FRAME value available.

ilexp commented 4 years ago

Ok, I got lost here 🤔 as things are now, everything is already extensible, either directly from IWaitCondition, WaitCondition, or WaitCondition;

Yep, they are - I just meant that they would still be with the struct-based design. The goal here wouldn't be to make it extensible, but to avoid allocating a new object on every yield.

Plus, define "all cases".. this only covers ints and floats parameters, while now you could have any kind of object as T, and don't forget that yield return null is a valid construct, which would not work anymore with structs, unless we have a static ONE_FRAME value available.

We currently have waiting for a number frames, waiting for a time, and waiting for a condition, all three of which would work with this. And the condition case (in both designs) is generic enough for users to specify arbitrary waiting conditions.

They don't need to use any of the builtin parameter types either (they're probably not exposed anyway), since the Func<bool> allows capturing whichever parameters they want. The builtin parameter types are just there so the probably most used buitin conditions (waiting for a time or for number of frames) don't allocate anything on their own.

Returning null wouldn't work anymore, but that might not be that bad - because instead, we're now forced to return explicitly what we want, e.g. wait for one frame. Likely via static readonly field like you said, unless the user prefers new WaitCondition() or default(WaitCondition).

If you compare the following two:

yield return null; // ???

yield return WaitCondition.OneFrame; // Aha!

It's more verbose, but also less "arbitrary". The first one only makes sense if you know what it does already, but it's fine because people know Unity and Unity does it like that. The second one essentially tells you what it does by reading, but it might be more annoying to write.

But again, I'm currently only throwing ideas around. This is not a finished design or proven concept, it's just something we can look into to avoid allocations where possible.

IMHO it doesn't promote anything more than, let's say, ThreadPool does for managing your own Threads. It's a feature that is there, if you want to use it, go ahead (being mindful of what it means and what it does), if not, no one is pushing it; if you think it's overkill though, I can just remove it or change it as an internal class instead of a public one. OFC it must be tested, thought out if it's not already, etc.

For this PR, I think a List really would be enough, as nothing here implements IPoolable, and without that, the queued pool doesn't really do much more than that - so personally, I'd probably just do without. For simplicity's sake, and also because the pooling thing just has a lot of discussion potential, and I don't want to mix that up with the coroutines feature if it can be avoided.

SirePi commented 4 years ago

I guess I'm dense 😅 , or I have been using this architecture for myself too long and can't see the bigger picture anymore, but I don't understand how a condition can evolve without the relative update method in the struct definition.

i.e. Let's say I end up with a

yield return WaitCondition.ForSeconds(2.5);

who is in charge of counting the time? the condition itself? the CoroutineManager? based on what data, if everything is private?

Just as an example, this is something I'm currently using:

this.flashCondition = new WaitCondition<float>(
  (f) =>
  {
    float elapsedTime = f + Time.DeltaTime;

    float delta = elapsedTime / FLASH_TIME;
    this.flashSprite.SpriteTint = this.flashColor.WithAlpha(1f - delta);
    this.flashSprite.Position = this.sprite.Position - FLASH_DELTA * delta;
    this.flashSprite.Size = this.sprite.Size + FLASH_DELTA * delta * 2;

    return elapsedTime;
  },
  (f) => f >= FLASH_TIME,
  0);

How would you see this in the struct? (I like the struct idea, just want to be sure it fits 😄 )

Barsonax commented 4 years ago

If you really want to avoid allocations then we also have to think about the closure in the delegate that captures any variables from the outer scope. I think you should pass in the object you want to use (the gameobject for instance) as a parameter instead of using this in the delegate. Basically you are taking control of the closure yourself but you can also avoid allocations on every call like this.

I think its possible to make the condition a struct and make it allocation free without losing extensibilty if we use generics. If its a generic method the clr will actually compile a completely separate method with the correct parameters if the type parameter is a struct, thus avoiding the boxing overhead you normally get. We would have to make everything that uses IWaitCondition generic though.

ilexp commented 4 years ago

i.e. Let's say I end up with a
yield return WaitCondition.ForSeconds(2.5);
who is in charge of counting the time? the condition itself? the CoroutineManager? based on what data, if everything is private?

The ForSeconds method would instantiate a new WaitCondition (struct) of type (let's say) WaitConditionType.SecondsPassed and set the float parameter to the current timestamp. To check whether the condition is met, the timestamp is compared against the current time, which is done probably in an internal WaitCondition.Evaluate (or so) instance method, invoked by whichever code manages the coroutine progression.

That method would switch on the condition type and knows what to do and which parameters to read and interpret. User-defined conditions use the Func<bool> condition parameter, which is just invoked blindly.

based on what data, if everything is private?

The outline above was super incomplete, I just wanted to illustrate what kind of data it would store and how it could store it to give you an idea. There is probably a non-private Evaluate method or similar that checks the condition - or not! Would have to be designed / prototyped.

Just as an example, this is something I'm currently using:

this.flashCondition = new WaitCondition<float>(
  (f) =>
  {
    float elapsedTime = f + Time.DeltaTime;

    float delta = elapsedTime / FLASH_TIME;
    this.flashSprite.SpriteTint = this.flashColor.WithAlpha(1f - delta);
    this.flashSprite.Position = this.sprite.Position - FLASH_DELTA * delta;
    this.flashSprite.Size = this.sprite.Size + FLASH_DELTA * delta * 2;

    return elapsedTime;
  },
  (f) => f >= FLASH_TIME,
  0);

How would you see this in the struct? (I like the struct idea, just want to be sure it fits 😄 )

Wait, why does the condition do anything besides check if it's met or not? Shouldn't it be the Coroutine updating sprites, not the wait condition?

SirePi commented 4 years ago

Wait, why does the condition do anything besides check if it's met or not? Shouldn't it be the Coroutine updating sprites, not the wait condition?

Yeah, I admit that this example is bending a bit the "wait" part of the construct, and I guess that's what made me focus on the more complex generic behaviors.

After this, I believe that the whole thing can be simplified to something like this:

public struct WaitFor
{
  public static readonly WaitFor NextFrame = new WaitFor ((_) => 0, (_) => true, 0);

  private float internalValue;
  private readonly Func<float, float> updateValue;
  private readonly Func<float, bool> exitCondition;

  public WaitFor(Func<float, float> updateValue, Func<float, bool> exitCondition, float startingValue)
  {
    this.internalValue = startingValue;
    this.updateValue = updateValue;
    this.exitCondition = exitCondition;
  }

  public bool Update()
  {
    this.internalValue = this.updateValue(this.internalValue);
    return this.exitCondition(this.internalValue);
  }

  public static WaitFor Frames(int frames)
  {
    return new WaitFor((i) => i + 1, (i) => i >= frames, 0);
  }

  public static WaitFor Seconds(float seconds)
  {
    return new WaitFor((f) => f + Time.DeltaTime, (f) => f >= seconds, 0);
  }

  public static WaitFor TimeSpan(TimeSpan timeSpan)
  {
    return WaitFor.Seconds((float)timeSpan.TotalSeconds);
  }
}

which brings the previous example to

private IEnumerable<WaitFor> MyCoroutine(...)
{
  ...

  float elapsedTime = 0;
  while (elapsedTime < FLASH_TIME)
  {
    elapsedTime += Time.DeltaTime;
    float delta = MathF.Clamp01(elapsedTime / FLASH_TIME);

    // do things for this frame until done, for an amount of time in this case, 
    // but it could be any other condition (moved far away enough, 
    // wait for another variable to change, etc...)
    ...

    yield return WaitFor.NextFrame;
  }
  ...
}

how do you think?

Barsonax commented 4 years ago

Thinking about the coroutines it would be nice if it automatically (or optionally) spreads out the updates over multiple frames.

For instance you have alot of instances that have to update something every x frames. For frametime reasons you don't want those updates to all end up in the same frame, causing the frametime to spike. What you want is that it updates 1/x instances every frame:

Scenario 10k instances 1 update every 10 frames

Old: frame 1: updates 0 instances frame 2: updates 0 instances .... frame 10: updates 10000 instances (stutter!)

New: frame 1: updates 1000 instances frame 2: updates 1000 instances frame 3: updates 1000 instances etc...

I think this is a key feature for performance. Ofcourse the above is a simple scenario with a constant wait time so we have to think how handle cases with a variable wait time as well and what guarantees we will give to the enduser.

SirePi commented 4 years ago

I think this is a key feature for performance. Ofcourse the above is a simple scenario with a constant wait time so we have to think how handle cases with a variable wait time as well and what guarantees we will give to the enduser.

I was thinking, about this point, that maybe there could be another job system in place, in addition to the coroutines one, in charge of dealing with time-consuming or long-running processes, such as the 10k instances update above. As it is now, the Coroutines system is designed to perform potentially frame-synchronized operations (there's nothing you can do with a Coroutine that you couldn't do with some fancy OnUpdate coding)

Just throwing out ideas now, but maybe it could be possible to see this system as a separate thread, so that it won't bother the main engine even with thousands of entities to process, potentially leveraging parallel libraries to use the CPU more efficiently (of course this means that this system would be strictly reserved for computation), and give back the results when they are ready

Something like

MAIN ENGINE THREAD                      JOBS THREAD
 Standard loop                           nothing
 ...                                     ...
 Job j = Jobs.StartNew(10kProcess)       Start, keep track of returned reference
 if(j.GetResult() != null)...            Processing
 if(j.GetResult() != null)...            Processing
 if(j.GetResult() != null)...            End, thisJob.Result = ...
 if(j.GetResult() != null)...            ...
 ...                                     ...

where GetResult would return the data and clear it from the job's instance, so that another call in the next frame won't return the same results

So, in the case of Barsonax, it would just be a matter of launching the job on the 10.000 entities, wait until it's done, and then decide if restart it right away, wait a little, etc..

Ideally the StartNew or equivalent method would allow a parameter that says if this job has to be done only once, or repeated indefinitelly, there should be a way to cancel it, maybe a way to feed it new parameters?

Just my first 0.02€ on this topic

Barsonax commented 4 years ago

Hmm but with a actual separate thread you escalate the complexity alot due to synchronization. You cannot really call into any duality API from that job.

It has it use cases surely but its completely different from a coroutine on the same thread thats efficiently scheduled to spread out the load. .NET already has something for with the task library. The thing is currently you run into differences between the editor and the launcher but thats a different story.

SirePi commented 4 years ago

Yeah, the idea for the separate thread is to have something that can deal with tasks that do not require a strict frame-by-frame synchronization (pathfinding, AI) and can report back "asynchronously", potentially providing a kind of a framework that masks the low level workings (i.e. keep the jobs thread always active instead of spooling a new one every time).

On the other hand, to stay within the Coroutines system, I put together a kind of a prototype, which could look like this

public static Coroutine ForEach<T>(Scene scene, IEnumerable<T> collection, Action<T> action, int cycleFrames, string name = null)
{
  return Start(scene, OffsetExecution(collection, action, cycleFrames), name);
}

private static IEnumerable<WaitCondition> OffsetExecution<T>(IEnumerable<T> collection, Action<T> action, int cycleFrames)
{
  int sliceSize = (collection.Count() / cycleFrames) + 1;

  // split in equal chunks 
  T[][] chunks = Enumerable
    .Range(0, cycleFrames)
    .Select(i => collection.Skip(i * sliceSize).Take(sliceSize).ToArray())
    .ToArray();

  int cycle = -1;
  while(true)
  {
    cycle = (cycle + 1) % cycleFrames;

    for (int i = 0; i < chunks[cycle].Length; i++)
      action(chunks[cycle][i]);

    // Would stuff like this be possible?
    // Parallel.ForEach(blocks[cycle], action);

    yield return null;
  }
}

Maybe with an extra parameter to say if the cycle should repeat indefinitely or not

Barsonax commented 4 years ago

Yeah, the idea for the separate thread is to have something that can deal with tasks that do not require a strict frame-by-frame synchronization (pathfinding, AI) and can report back "asynchronously", potentially providing a kind of a framework that masks the low level workings (i.e. keep the jobs thread always active instead of spooling a new one every time).

Yeah I think you want something like this as well. Its separate from the coroutine system though. .NET already has the task library which we should leverage for this. There already is a issue for this #603

On the other hand, to stay within the Coroutines system, I put together a kind of a prototype, which could look like this

I think Ill have to checkout the code myself and play a bit with it. Have to check these points:

What happens when coroutines end? Lets say all coroutines in the same cycle end you would end up with a smaller cycle compared to the others. Not sure if this could be solved as 'rebalancing' this is impossible without moving other updates. Maybe we have to make a distinction here between coroutines which are allowed to be moved backwards or forwards for optimization purposes and coroutines which are not allowed to because it breaks correctness.
The code has to be fast and avoid allocations, as beautiful as LINQ is its not suited in this regard.

Sounds like we are gonna have to make a coroutine scheduler that handles this for us which could be quite complex, maybe better to do that in a separate PR? I still think its good to think of it now as it might have a impact on the public API.

// Would stuff like this be possible?
// Parallel.ForEach(blocks[cycle], action);

No this would execute the action on a different thread. It has to happen on the main thread.

ilexp commented 4 years ago

Wait, why does the condition do anything besides check if it's met or not? Shouldn't it be the Coroutine updating sprites, not the wait condition?

Yeah, I admit that this example is bending a bit the "wait" part of the construct, and I guess that's what made me focus on the more complex generic behaviors.

After this, I believe that the whole thing can be simplified to something like this:
public struct WaitFor
{
  [...]
}
which brings the previous example to
private IEnumerable<WaitFor> MyCoroutine(...)
{
  [...]
}
how do you think?

I like the new WaitFor design and the example is exactly what I had in mind as the typical use case - internally, we can simplify this even further by unifying the update and exit condition funcs into just a generic "evaluation" function that returns whether it's done and can do whatever otherwise.

We're still getting allocations due to lambdas whenever creating a new wait-for-seconds or wait-for-frames in the current design though, which we could get rid of with the discrimated union special case above for these kinds of often used builtin conditions.

Thinking about the coroutines it would be nice if it automatically (or optionally) spreads out the updates over multiple frames.

We might be mixing concerns and use cases here, which I would try to avoid. As I understand Coroutines, all they do is provide a way to execute game updates over time, like in a scripted sequence of events, an animation, or maybe even a complex, instantiated behavior.

Spreading updates over frames for large numbers of objects is a completely different thing, and also one that a user could easily implement themselves with just the "basic" coroutines. Offloading lots of work somewhere or dealing with large workloads in general is something that I'd see in a different system, like SirePi suggested.

Which brings me to a point from SirePi that I'd like to get back to, which is whether wait conditions should actually be allowed to wrap execution and updates themselves, by design, rather than just waiting. I'm currently leaning towards "no", but maybe "yes" also makes sense when thinking about some Coroutine use cases:

Wait for a few seconds or frames
Wait until a condition is met
Perform an operation and wait until execution is done
Wait until another coroutine has finished
Wait until a number of coroutines all have finished, or until any of a group of coroutines has finished
Wait until processing of X is complete, maybe get results
Continue with some part of the work on a different thread, then go back to main for handling results

The above use cases form a gradient where Coroutines are at the top and C# Tasks are at the bottom. We should either avoid mixing the two very clearly, or embrace the similarity and go all-in with 100% async game code support where stuff like waiting for frames is just a special ValueTask.

Right now, I'm pretty sure we should not go all-in here, since async / await addresses a lot more cases that anything typically used in game code, and the required infrastructure for that is way more complicated than what we actually need for Coroutines. In that case, however, we should try to draw a clear line in the use case gradient. What I'd suggest where to put that line is here:

Wait for a few seconds or frames

Wait until a condition is met

Perform an operation and wait until execution is done

Wait until another coroutine has finished

Wait until a number of coroutines all have finished, or until any of a group of coroutines has finished

Wait until processing of X is complete, maybe get results

Continue with some part of the work on a different thread, then go back to main for handling results

Anything else on top of that is up to the user by defining custom conditions, allowing them to move the line as much down as they want for their project.

We could also put that line here:

Wait for a few seconds or frames

Wait until a condition is met

Perform an operation and wait until execution is done

Wait until another coroutine has finished

Wait until a number of coroutines all have finished, or until any of a group of coroutines has finished

Wait until processing of X is complete, maybe get results

Continue with some part of the work on a different thread, then go back to main for handling results

It might reduce clarity what coroutines and wait conditions are intended for, but if we rebrand exit conditions as "evaluation functions" that happen to return whether they're done, then we already provide the required extension points for users to decide if they want to use Coroutines that way or not, without explicitly suggesting it.

This turned out to be a bit longer than intended - I hope you can still gather something coherent from it 😄 What are your thoughts?

SirePi commented 4 years ago

I think the main discriminant here would be to decide if a wait condition should support arbitray code execution inside of its update method, or if it should be limited to keeping track of frames/time.

In the first case, there needs to be a way for the user to access a condition's internal state, either directly or through a referenced area, and the condition would effectively assume the meaning of "do this until that happens". In the second case, the whole thing can be simplified to the WaitFor example, even going as far as to remove the Funcs and staying with an bool + float struct, and the "do things until" part is left to the user to implement as he sees fit. I think this would be the bare minimum

public struct WaitFor
{
  public static readonly WaitFor NextFrame = new WaitFor(1, true);

  private readonly bool countFrames;
  private float internalValue;

  public WaitFor(float startingValue, bool countFrames)
  {
    this.internalValue = startingValue;
    this.countFrames = countFrames;
  }

  public bool Update()
  {
    this.internalValue -= (this.countFrames ? 1 : Time.DeltaTime);
    return this.internalValue <= 0;
  }

  public static WaitFor Frames(int frames)
  {
    return new WaitFor(frames, true);
  }

  public static WaitFor Seconds(float seconds)
  {
    return new WaitFor(seconds, false);
  }

  public static WaitFor TimeSpan(TimeSpan timeSpan)
  {
    return WaitFor.Seconds((float)timeSpan.TotalSeconds);
  }
}

It is not extensible anymore, but you can do everything in the Coroutine's enumerator method

Barsonax commented 4 years ago

Spreading updates over frames for large numbers of objects is a completely different thing, and also one that a user could easily implement themselves with just the "basic" coroutines. Offloading lots of work somewhere or dealing with large workloads in general is something that I'd see in a different system, like SirePi suggested.

After thinking about evenly spreading the load over multiple frames I agree with Adam that this probably should be a separate system.

I think the flexibility that coroutines give you in defining wait conditions makes it really complex/impossible to implement this properly so its not really suited. Might be better suited to a RecurringJob (with a option to run it on the main thread?) or something like that but thats for #603

@SirePi can you update your branch and fix that conflict with DualityTests.csproj and push it so we can see the updated code? If you have any questions just ask away.

ilexp commented 4 years ago

In the first case, there needs to be a way for the user to access a condition's internal state, either directly or through a referenced area, and the condition would effectively assume the meaning of "do this until that happens".

Do we actually need that? Consider the following with only an evaluate method:

// Various stuff happening in the coroutine before
GameObject someObj = ...;
int someCounter = ....;
float startTime = (float)Time.GameTimer.TotalSeconds;

// Now we wait, but we use coroutine state in the condition via captured scope
yield return new WaitFor.Condition(() =>
{
    // Access any captured state
    float elapsedTime = (float)Time.GameTimer.TotalSeconds - startTime;

    // Access and modify any captured state
    someCounter--;

    return someObj.Disposed || elapsedTime  > 20.0f || someCounter == 0;
});

No storage or access to the actual WaitFor object needed, since the lambdas captured scope doubles as a state object.

One thing that we should still work on is API naming - WaitFor.Condition actually makes it look like the rate at which the function is called, and thus state is mutated, is undefined, because one would expect a function without side effects here. If that's not the case, then a different name might be better, like WaitFor.ExecutionDone or so.

If we keep those conditions strictly waiting conditions like in your super minimal sample, the equivalent to the above would be:

// Various stuff happening in the coroutine before
GameObject someObj = ...;
int someCounter = ....;
float startTime = (float)Time.GameTimer.TotalSeconds;

while (true)
{
    float elapsedTime = (float)Time.GameTimer.TotalSeconds - startTime;
    someCounter--;

    if (someObj.Disposed || elapsedTime  > 20.0f || someCounter == 0)
        break;

    yield return WaitFor.NextFrame;
}

Not too bad either, and arbitrary conditions and logic is still possible. The callstack would also be easier to read when debugging the while / condition part, and no allocation is needed for any condition lambda. Could be the simplest approach overall, and we could still extend WaitFor capabilities later on, if needed.

SirePi commented 4 years ago

No storage or access to the actual WaitFor object needed, since the lambdas captured scope doubles as a state object.

Ah, I was under the assumption that lambdas were bad for the solution.. either is fine for me

If we keep those conditions strictly waiting conditions like in your super minimal sample, the equivalent to the above would be:
while (true)
{
    ...
    if (someObj.Disposed || elapsedTime  > 20.0f || someCounter == 0)
        break;

    yield return WaitFor.NextFrame;
}

or just

while (!(someObj.Disposed || elapsedTime  > 20.0f || someCounter == 0))
{
    ...
    yield return WaitFor.NextFrame;
}

For the naming, if we go with the first solution, I'd change WaitFor to WaitUntil, so you'd have

WaitUntil.NextFrame
WaitUntil.FramesElapsed(...)
WaitUntil.SecondsElapsed(...)
WaitUntil.TimeSpanElapsed(...)
WaitUntil.IsComplete(() => ...) / IsTrue(() => ...)

EDIT: Meh.. maybe ExecutionComplete gives more the idea of something that is actually working until it's done

ilexp commented 4 years ago

Maybe we could use the term "passed" instead of "elapsed"? I know "elapsed" is used in established API, but "passed" sounds a bit more natural in this context and works better with "frames".

On the execution / condition thing, we could rephrase this as a continuously updated state that the coroutine will be in until X.

How about the following API?

public struct WaitUntil
{
    public delegate bool StateUpdater();

    // ...

    public static readonly WaitUntil NextFrame;

    // ...

    public static WaitUntil FramesPassed(int frameCount);
    public static WaitUntil SecondsPassed(float seconds, bool gameTime = true); // Real time, if false
    public static WaitUntil TimePassed(TimeSpan duration, bool gameTime = true); // Real time, if false
    public static WaitUntil StateCompleted(StateUpdater func);
}

That way, it would look like this in practice:

yield return WaitUntil.NextFrame;
yield return WaitUntil.FramesPassed(5);
yield return WaitUntil.SecondsPassed(0.5f);
yield return WaitUntil.TimePassed(TimeSpan.FromMinutes(2.0f));
yield return WaitUntil.StateCompleted(() =>
{
    someObj.Health -= 1.7f;
    return someObj.Health < 0.0f;
});

No storage or access to the actual WaitFor object needed, since the lambdas captured scope doubles as a state object.

Ah, I was under the assumption that lambdas were bad for the solution.. either is fine for me

Not necessarily, but you're right that relying on captured scope would prevent users from using a more optimized approach for their frequently used states / waiting conditions. We could allow providing an optional parameter to the method, modifying the API like so:

public delegate bool StateUpdater();
public delegate bool StateUpdater<T>(T arg) where T : class;

public static WaitUntil StateCompleted(StateUpdater func) where T : class;
public static WaitUntil StateCompleted<T>(T arg, StateUpdater<T> func) where T : class;

// The WaitUntil struct just stores this as an object field and casts before invoking the func, allowing WaitUntil to not be generic.

So people could write their own wait condition libs, like this:

public static class FooConditions
{
    public static readonly StateUpdater Poisoned = (Character obj) =>
    {
        obj.Health -= 1.7f;
        return obj.Health < 0.0f;
    };
}

yield return WaitUntil.StateCompleted(someObj, FooConditions.Poisoned);

I don't really have a strong opinion either way here, maybe just see what sticks?

Barsonax commented 4 years ago

I would prefer the API with the stateobject being passed as a parameter as that should prevent allocating a closure object thus improving performance. It makes it more explicit about whats happening. Also like you pointed out it makes it easier to make a 'wait condition' lib.

Also if we go this route I think we should not provide 2 flavors here and just choose one of them.

ilexp commented 4 years ago

Also if we go this route I think we should not provide 2 flavors here and just choose one of them.

There are still conditions without any state or anything to update - for them, the parameter would be useless, and it would be nice if it could be omitted entirely in that case. Internally, it would just be a null parameter, but the API wouldn't force the user to specify it (and an unused T parameter).

Barsonax commented 4 years ago

There are still conditions without any state or anything to update - for them, the parameter would be useless, and it would be nice if it could be omitted entirely in that case. Internally, it would just be a null parameter, but the API wouldn't force the user to specify it (and an unused T parameter).

Hmm ye now I think of it the main thing is we want to prevent users from causing a closure to be captured implicitly on every invocation. If C# would have something like a 'static delegate' that does not allow capturing scope this would be solved already. I think this is one of the language proposals for a future C# release.

yield return WaitUntil.StateCompleted(() =>
{
    someObj.Health -= 1.7f;
    return someObj.Health < 0.0f;
});

I still do not understand the need for a stateupdater delegate though. Why not do this in the enumerator method itself and let the yields just care about how long to wait? Any conditions could be defined in the enumerator method itself as well:

while(someObj.Health > 0.0f)
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.NextFrame
}
// do some more stuff

No need for delegates so no performance penalty there either.

SirePi commented 4 years ago

I admit that extreme optimization is not my strong suite, but is it really such a loss of performance calling a delegate instead of a method? I mean, besides the inherent cost of using one or the other, once they are defined there's no difference I believe 🤔 no? I mean, they don't have to be re-instantiated every cycle

Barsonax commented 4 years ago

I admit that extreme optimization is not my strong suite, but is it really such a loss of performance calling a delegate instead of a method?

Well the difference between invoking a method and a delegate is not that big. The compiler does lose the ability to inline the body of the delegate which can make a bigger difference in some cases. For the rest invoking a delegate is pretty comparable to a calling a interface method or a virtual method.

However if said delegate has to allocate a closure (a object which holds the captured context) on every call then the delegate will be much slower. This basically happens when you use a variable that was defined outside of the scope of the delegate body so its pretty implicit. I also believe it only generates the type of this object once per method. So if you have 2 delegate bodies in 1 method with body 1 using variable a and body 2 using variable 2 it will reuse the same type even though body 1 doesn't need b and body 2 doesn't need a.

One easier way to spot these 'hidden' allocations is to use a extension like the heap allocation analyzer https://marketplace.visualstudio.com/items?itemName=MukulSabharwal.ClrHeapAllocationAnalyzer

We should try to make a API that makes it easy for developers to write high performance code so I think we should try to avoid having them to worry about closures and delegates.

ilexp commented 4 years ago

We should try to make a API that makes it easy for developers to write high performance code so I think we should try to avoid having them to worry about closures and delegates.

Agree 100%. However, as long as we have any way to specify a delegate, we can't prevent users to use a dynamically allocated one, or a closure.

Hmm ye now I think of it the main thing is we want to prevent users from causing a closure to be captured implicitly on every invocation.

As far as the non-parameter variant goes, consider this:

public static class FooConditions
{
    public static readonly StateUpdater<Character> Poisoned = (Character obj) =>
    {
        obj.Health -= 1.7f;
        return obj.Health < 0.0f;
    };
    public static readonly StateUpdater ItsTimeForLunch = () =>
    {
        // No idea if that code is correct, but you know what I mean:
        return DateTime.Now.Hours >= 12 && DateTime.Now.Hours <= 14;
    };
}

yield return WaitUntil.StateCompleted(someObj, FooConditions.Poisoned);
yield return WaitUntil.StateCompleted(FooConditions.ItsTimeForLunch);

No parameters needed for that one, despite not using any closure - that would be the main reason why I'd provide the parameterless variant as well. Otherwise you'd have to make it a StateUpdater<object> with an awkward unused parameter for those cases.

I still do not understand the need for a stateupdater delegate though. Why not do this in the enumerator method itself and let the yields just care about how long to wait? Any conditions could be defined in the enumerator method itself as well:

We could remove this use case entirely and not have state updater conditions at all, which I would be fine with as well. The state lib use case still sounds intriguing, but I guess it could easily be expressed differently, allowing us to remove the lambda / condition based WaitUntil entirely (at the cost of being more verbose in those cases?)

Then again, with the custom condition variant, it would also enable users to specify an implicit cast from some type X to a WaitUntil struct, allowing them to yield return whatever for their convenience. Like waiting for an animation to finish, or a character to finish speaking, and so on.

Barsonax commented 4 years ago

We could remove this use case entirely and not have state updater conditions at all, which I would be fine with as well. The state lib use case still sounds intriguing, but I guess it could easily be expressed differently, allowing us to remove the lambda / condition based WaitUntil entirely (at the cost of being more verbose in those cases?)

Well atleast with the example it seems to be more verbose with the lambda condition:

yield return WaitUntil.StateCompleted(() =>
{
    someObj.Health -= 1.7f;
    return someObj.Health < 0.0f;
});

while(someObj.Health > 0.0f)
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.NextFrame
}

Then again, with the custom condition variant, it would also enable users to specify an implicit cast from some type X to a WaitUntil struct, allowing them to yield return whatever for their convenience. Like waiting for an animation to finish, or a character to finish speaking, and so on.

Hmm I don't think you want to use implicit casts for this. Its too implicit I guess.

You could move the condition to a separate method that returns a IEnumerable<WaitUntil> You cannot directly yield return that though but you can use concat to chain things together or just iterate it yourself.

On the other hand with unit testing that would make it easy to check 'will this complete after x frames/seconds etc' since its just a piece of data and doesn't make any assumptions on how it will be used.

SirePi commented 4 years ago

I put together a first draft based on the story so far. WaitUntil.zip

I believe the only thing left to finalize is in regard to the StateCompleted methods.. leave them or take them away? From the discussion I think we can summarize the main points for each way as follows:

With StateCompleted	Without StateCompleted
The coroutine's body looks more like a list of steps that are to be carried out until the relevant exit condition is met, probably without loops	The coroutine's body will most definitely contain loops that might make it harder to read
More verbose to write	Shorter, cleaner code on a line-by-line basis
Basically just syntactic sugar, doesn't add anything that could already be done without it	Requires the user to think about the frame-by-frame evolution of the coroutine
Could introduce performance issues if not used properly, but it can mitigated by a proper API	The same errors can be done by the user anyway
...	...

Of course feel free to add new ones or refute some of them if I misinterpreted them

Barsonax commented 4 years ago

The coroutine's body will most definitely contain loops that might make it harder to read

Depends I think the while construct is quite readable as well. It reads like keep yielding until the condition is false. No need to learn any kind of API for that.

Could introduce performance issues if not used properly, but it can mitigated by a proper API

As long as its not possible to define a delegate that cannot capture context we cannot prevent closures from happening actually. We can only hope to steer ppl towards using the correct API's.

Now I think of it one more limitation of StateCompleted is that you cannot do things like 'check this condition again in 10 seconds instead of next frame'.

You can however easily do this with the loop construct

while(someObj.Health > 0.0f)
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.SecondsPassed(10f);
}

Its also easy to move the condition after the yield if so desired:

do
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.SecondsPassed(10f);
}
while(someObj.Health > 0.0f);

SirePi commented 4 years ago

You can however easily do this with the loop construct
while(someObj.Health > 0.0f)
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.SecondsPassed(10f);
}
Its also easy to move the condition after the yield if so desired:
do
{
    someObj.Health -= 1.7f;
    yield return WaitUntil.SecondsPassed(10f);
}
while(someObj.Health > 0.0f);

Makes a lot of sense.. I believe we can agree to start with a timing-only related management for now. Will fix the code accordingly

ilexp commented 4 years ago

Sounds pretty solid overall. Let us know when you're ready for the next review!

SirePi commented 4 years ago

I am, review away!

Barsonax commented 4 years ago

Did a small benchmark with benchmarkdotnet and these are the results:

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.778 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT

| Method |      N |            Mean |         Error |        StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------- |------- |----------------:|--------------:|--------------:|------:|------:|------:|----------:|
|    Foo |      1 |        26.96 ns |      0.386 ns |      0.361 ns |     - |     - |     - |         - |
|    Foo |      3 |        70.55 ns |      0.341 ns |      0.319 ns |     - |     - |     - |         - |
|    Foo |     10 |       226.96 ns |      0.438 ns |      0.342 ns |     - |     - |     - |         - |
|    Foo |    100 |     2,301.60 ns |      9.649 ns |      8.553 ns |     - |     - |     - |         - |
|    Foo |   1000 |    22,102.72 ns |     71.463 ns |     66.847 ns |     - |     - |     - |         - |
|    Foo |  10000 |   218,520.84 ns |    949.447 ns |    888.113 ns |     - |     - |     - |         - |
|    Foo | 100000 | 2,475,735.87 ns | 41,625.357 ns | 32,498.341 ns |     - |     - |     - |         - |

So no allocations in the update loop which is very nice. Seems the coroutine itself has about ~20ns runtime overhead and it scales linearly with the amount of coroutines that are running.

The code:

    class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<CoroutineBenchmark>();         
        }
    }

    class CoroutineObject
    {
        public int Value { get; set; }
    }

    [MemoryDiagnoser]
    public class CoroutineBenchmark
    {
        [Params(1, 3, 10, 100, 1000, 10000, 100000)]
        public int N { get; set; }

        public Scene scene;

        [GlobalSetup]
        public void Setup()
        {
            scene = new Scene();
            scene.Activate();

            CoroutineObject obj = new CoroutineObject();
            for (int i = 0; i < N; i++)
            {
                Coroutine coroutine = Coroutine.Start(scene, BasicRoutine(obj), $"basic {i}");
            }              
        }

        [Benchmark]
        public void Foo()
        {
            scene.CoroutineManager.Update();
        }

        private static IEnumerable<WaitUntil> BasicRoutine(CoroutineObject obj)
        {
            while (true)
            {
                yield return WaitUntil.NextFrame;
            }
        }
    }

ilexp commented 4 years ago

Sounds like you guys got this and I think the general direction of this is pretty good. I'll remove my change requests and leave the rest of the review and merge process to you 👍 Ping me if you need me on anything down the line.

SirePi commented 4 years ago

@ilexp, @Barsonax Do you think it's sensible to change this

public static Coroutine Start(Scene scene, IEnumerable<WaitUntil> method, string name = null)

public static Coroutine Start(IEnumerable<WaitUntil> method, string name = null, Scene scene = null)

where if scene is null, it will use Scene.Current ?

Barsonax commented 4 years ago

Was just giving those methods a look.

Not really a fan of the static utility methods in the Coroutine class as they use static state. It also makes unit testing harder as you now also need a scene instance and a whole duality environment etc. I think we should just use the CoroutineManager instance on the scene itself. Maybe add a utility method to the scene to easily do this for you so you can write scene.StartCoroutine(....).

We should also make it possible for the CoroutineManager class to accept a IEnumerable as well.

With these changes the coroutine logic no longer has dependencies on a scene which makes it easier to test as well as you will see in my benchmark below. Also expanded my benchmark a bit to also take into account the starting of coroutines:

    class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<CoroutineBenchmark>();         
        }
    }

    [MemoryDiagnoser]
    public class CoroutineBenchmark
    {
        [Params(1, 3, 10, 100, 1000, 10000, 100000)]
        public int N { get; set; }

        private CoroutineManager _manager = new CoroutineManager();

        [GlobalSetup]
        public void Setup()
        {
            for (int i = 0; i < N; i++)
            {
                _manager.StartNew(EndlessCoroutine().GetEnumerator(), ""); //If it accepted a IEnumerable directly I wouldnt have to call .GetEnumerator
            }              
        }

        [Benchmark]
        public void Endless()
        {
            _manager.Update();
        }

        [Benchmark]
        public void Start_N_Coroutines()
        {
            var manager = new CoroutineManager();
            for (int i = 0; i < N; i++)
            {
                manager.StartNew(EndlessCoroutine().GetEnumerator(), "");
            }
        }

        private static IEnumerable<WaitUntil> EndlessCoroutine()
        {
            while (true)
            {
                yield return WaitUntil.NextFrame;
            }
        }
    }

And the results

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.778 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT

|             Method |      N |             Mean |          Error |         StdDev |     Gen 0 |     Gen 1 |    Gen 2 |  Allocated |
|------------------- |------- |-----------------:|---------------:|---------------:|----------:|----------:|---------:|-----------:|
|            Endless |      1 |         28.85 ns |       0.259 ns |       0.243 ns |         - |         - |        - |          - |
| Start_N_Coroutines |      1 |        367.39 ns |       6.970 ns |       7.458 ns |    3.1543 |         - |        - |     5241 B |
|            Endless |      3 |         76.11 ns |       0.757 ns |       0.708 ns |         - |         - |        - |          - |
| Start_N_Coroutines |      3 |        450.20 ns |       8.804 ns |      10.138 ns |    3.2678 |         - |        - |     5432 B |
|            Endless |     10 |        235.50 ns |       1.433 ns |       1.270 ns |         - |         - |        - |          - |
| Start_N_Coroutines |     10 |        803.69 ns |       9.051 ns |       8.466 ns |    3.8185 |         - |        - |     6348 B |
|            Endless |    100 |      2,315.79 ns |      45.506 ns |      54.172 ns |         - |         - |        - |          - |
| Start_N_Coroutines |    100 |      4,224.03 ns |      32.811 ns |      30.691 ns |   10.1624 |         - |        - |    16883 B |
|            Endless |   1000 |     22,640.59 ns |     196.131 ns |     183.461 ns |         - |         - |        - |          - |
| Start_N_Coroutines |   1000 |     41,746.56 ns |     578.692 ns |     541.309 ns |   61.5845 |    9.5215 |        - |   117989 B |
|            Endless |  10000 |    228,322.10 ns |   1,851.573 ns |   1,641.371 ns |         - |         - |        - |          - |
| Start_N_Coroutines |  10000 |  1,274,988.19 ns |  15,588.004 ns |  14,581.029 ns |  208.9844 |  107.4219 |  39.0625 |  1230680 B |
|            Endless | 100000 |  2,362,414.09 ns |  46,363.687 ns |  64,995.414 ns |         - |         - |        - |          - |
| Start_N_Coroutines | 100000 | 16,305,813.30 ns | 317,686.890 ns | 521,968.671 ns | 2000.0000 | 1281.2500 | 437.5000 | 11731826 B |

SirePi commented 4 years ago

Originally I didn't want to change too much in the Scene, but since we are going for a major release, I don't see why not at this point

ilexp commented 4 years ago

How about making this one an extension method?

public static Coroutine StartCoroutine(this Scene scene, IEnumerable<WaitUntil> method, string name = null)

// In some component:
this.Scene.StartCoroutine(this.FooRoutine);

Since most coroutine logic is nicely aggregated in the coroutine manager with little effect on scenes otherwise, this would keep the separation but provide the same convenience. It would also keep autocomplete clean unless there's a using Duality.Coroutines in the file.

SirePi commented 4 years ago

SceneExtensions it is then.. where? under Resources, or do you prefer a separate directory (Extensions?) with the Resources namespace?

ilexp commented 4 years ago

SceneExtensions would be too general, maybe something like CoroutineSceneExtensions in the coroutines namespace, and right next to the coroutine files. Would keep the core of the coroutine logic and API nicely aggregated.

SirePi commented 4 years ago

Review plz @ilexp @Barsonax 😄

Barsonax commented 4 years ago

New benchmark results are alot better. Down to like ~14ns per coroutine. Will review the code itself later.

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.778 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4150.0), X64 RyuJIT

|             Method |      N |             Mean |          Error |         StdDev |     Gen 0 |     Gen 1 |    Gen 2 |  Allocated |
|------------------- |------- |-----------------:|---------------:|---------------:|----------:|----------:|---------:|-----------:|
|            Endless |      1 |         22.10 ns |       0.137 ns |       0.129 ns |         - |         - |        - |          - |
| Start_N_Coroutines |      1 |        323.25 ns |       4.824 ns |       4.276 ns |    3.0441 |         - |        - |     5057 B |
|            Endless |      3 |         50.71 ns |       0.612 ns |       0.572 ns |         - |         - |        - |          - |
| Start_N_Coroutines |      3 |        391.56 ns |       6.619 ns |       6.192 ns |    3.1495 |         - |        - |     5236 B |
|            Endless |     10 |        152.81 ns |       2.014 ns |       1.884 ns |         - |         - |        - |          - |
| Start_N_Coroutines |     10 |        608.70 ns |       7.751 ns |       7.250 ns |    3.5210 |         - |        - |     5850 B |
|            Endless |    100 |      1,359.68 ns |      23.791 ns |      22.254 ns |         - |         - |        - |          - |
| Start_N_Coroutines |    100 |      3,414.93 ns |      41.126 ns |      38.469 ns |    8.3008 |         - |        - |    13794 B |
|            Endless |   1000 |     13,597.57 ns |     207.006 ns |     193.634 ns |         - |         - |        - |          - |
| Start_N_Coroutines |   1000 |     33,702.82 ns |     474.036 ns |     420.220 ns |   62.8052 |    0.4272 |        - |   105600 B |
|            Endless |  10000 |    137,642.43 ns |   1,713.818 ns |   1,603.107 ns |         - |         - |        - |          - |
| Start_N_Coroutines |  10000 |  1,128,187.21 ns |  19,851.838 ns |  18,569.421 ns |  191.4063 |   93.7500 |  41.0156 |  1145978 B |
|            Endless | 100000 |  1,388,138.39 ns |   9,962.400 ns |   9,318.835 ns |         - |         - |        - |          - |
| Start_N_Coroutines | 100000 | 14,298,808.54 ns | 177,963.971 ns | 166,467.609 ns | 1796.8750 | 1125.0000 | 437.5000 | 10925768 B |

Barsonax commented 4 years ago

Looking at the Update method I think we cannot really further optimize it without making something like a customized queue. Its probably good enough for now.

Barsonax commented 4 years ago

I think the code itself now looks pretty good and performs well. Next step is to review the unit tests. I think we can do better on the naming and readability:

For the general case we could do this (maybe a parameterized way?). The Assert will give a nice exception telling you the index of the mismatching element if something is wrong.:

        [Test]
        public void Coroutine_Evolution()
        {
            //ARRANGE
            var manager = new CoroutineManager();
            var obj = new CoroutineObject();
            var expected = new[]
            {
                (10, CoroutineStatus.Running),
                (20, CoroutineStatus.Running),
                (20, CoroutineStatus.Running),
                (20, CoroutineStatus.Running),
                (30, CoroutineStatus.Complete),
                (30, CoroutineStatus.Complete)
            };

            //ACT
            Coroutine coroutine = manager.StartNew(this.BasicRoutine(obj), "basic");

            // Really only interested in the length of the expected results actually so maybe we should just use Range here
            var results = expected.Select(x =>
            {
                var result = (obj.Value, coroutine.Status);
                manager.Update();
                return result;
            }).ToArray();

            //ASSERT
            CollectionAssert.AreEqual(expected, results);
        }

Some more special cases should be done in a separate test so their intention is clear from the name of the method:

        [Test]
        public void StartCoroutine_StartsImmediately()
        {
            //ARRANGE
            var manager  = new CoroutineManager();
            var obj = new CoroutineObject();

            //ACT
            Coroutine coroutine = manager.StartNew(this.BasicRoutine(obj), "basic");

            //ASSERT
            Assert.AreEqual(10, obj.Value);
            Assert.AreEqual(CoroutineStatus.Running, coroutine.Status);
        }

The realtime based tests are a bit more difficult. I think we cannot really get around also testing the whole Time class here.

We also need a test that verifies that the scene update correctly calls the coroutine update method but we don't have to do this for every test which is the reason I written the examples above without a scene being present.

@ilexp maybe you have some more suggestions for the code or the tests?

ilexp commented 4 years ago

I can do a review soon-ish, will put it on my ToDo.

Barsonax commented 4 years ago

Got nothing on the tests though - they look pretty clean to me and I don't have a strong opinion on any specific design here.

For the most part yeah, naming could be a bit better as Basic is just too vague. To solve this one might also have to split some tests so you can properly name special cases.

What I don't like however is that static dependency on the Time class in the WaitUntil struct which makes the realtime test more complex because we are now also testing the Time class. Not really a way around this though.

Barsonax commented 4 years ago

What happens if the gameobject/component/or any other object that started the coroutine gets removed? I think the coroutine will keep running till it ends so it possibly uses a gameobject thats no longer in use.

You could say this is something thats the developers responsibility but the very least we should do is to make a notice of this in the documentation. If we want to go further than that (probably not in this PR) then we could provide a API that automatically cleans up the coroutine when the context where its in started ends.

SirePi commented 4 years ago

You are absolutely correct that this is an eventuality; on the other hand, there is no way for the Manager to know beforehand what the Coroutine will affect (it could affect the spawning object, something else, no objects at all, ...). Plus, the update method is already catching eventual exceptions so it won't fail catastrophically.