Closed: martinothamar closed this 7 months ago
Note to future self: putting more of the code into `Mediator`/`Mediator.Abstractions` is good; it's easier to read and iterate on. Source generation is mostly important for avoiding reflection: mapping opaque messages to concrete messages and resolving the correct handlers. In other words, once the concrete message is unwrapped and the underlying handlers are instantiated and ready, the benefits of source generation/compilation analysis are fully exhausted, I think.
Impressive changes, haven't done a major deep dive yet.
It might be possible to create a more memory-efficient awaiter for known lengths. A `Task[]` will allocate `24 + length * 8` bytes (array header plus one reference per element), which means that in the common case where 2 notifications are asynchronous, we have to allocate 40 bytes just to store 2 references to `Task` values.
Instead, Mediator could have methods for common array lengths where, instead of a nullable array, we have one task variable for each potentially asynchronous task. This way the "array" is inlined inside the generated async state machine, saving the 24 bytes of array overhead. I think this will always use less memory than the array equivalent for up to 4 handlers; beyond that, unlikely setups may use slightly more memory.
```csharp
async ValueTask Publish2(INotificationHandler<TNotification>[] handlers, TNotification notification, CancellationToken cancellationToken)
{
    // Start both handlers eagerly; the Task locals live inline in the state machine.
    Task task0 = handlers[0].Handle(notification, cancellationToken).AsTask();
    Task task1 = handlers[1].Handle(notification, cancellationToken).AsTask();
    List<Exception>? exceptions = null;
    try
    {
        await task0;
    }
    catch (Exception ex)
    {
        exceptions ??= new List<Exception>(1);
        exceptions.Add(ex);
    }
    try
    {
        await task1;
    }
    catch (Exception ex)
    {
        exceptions ??= new List<Exception>(1);
        exceptions.Add(ex);
    }
    if (exceptions is not null)
        throw new AggregateException(exceptions);
}
```
Took me a while to be able to benchmark this properly... But yes, it makes a difference, at least for the synchronous case.
(Benchmark screenshots: top = before, bottom = after.)
Now I just have to clean up a bit before I can push... I made some significant config changes to make it easier to benchmark and test different things based on compile-time config.
Thanks for sharing the benchmarks!
Looks like it saved a small amount of memory in the async case, although I can't tell why it saves 32 (35) bytes and not the expected 24. I'll look at the lowered code in the morning; BenchmarkDotNet might be measuring memory usage incorrectly.
The synchronous case getting a speed boost is unexpected. Perhaps it's due to loop unrolling.
Yeah, when it's manually unrolled like that I'm guessing it can elide bounds checks as well; I kind of doubt it is able to do so for enumerators, even if they are plain structs (not boxed, no virtual calls).
EDIT: never mind, that was not the custom enumerator; the codegen looks identical. I guess it just saves the control flow of the loop (branches and jumps), even if they are predictable.
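For reference, here's a minimal sketch of the kind of unrolled fast path being discussed (hypothetical helper names, not the actual generated code): when both `ValueTask`s complete synchronously, there is no array, no `Task`, and no state machine allocation at all.

```csharp
using System.Threading;
using System.Threading.Tasks;

public interface INotificationHandler<TNotification>
{
    ValueTask Handle(TNotification notification, CancellationToken ct);
}

public static class PublishSketch
{
    // Hypothetical unrolled publisher for exactly 2 handlers.
    public static ValueTask Publish2<TNotification>(
        INotificationHandler<TNotification>[] handlers,
        TNotification notification,
        CancellationToken ct)
    {
        ValueTask t0 = handlers[0].Handle(notification, ct);
        ValueTask t1 = handlers[1].Handle(notification, ct);

        // Fast path: both completed synchronously, nothing to allocate or await.
        // (Simplified: a real implementation must still observe exceptions and
        // consume IValueTaskSource-backed tasks exactly once.)
        if (t0.IsCompletedSuccessfully && t1.IsCompletedSuccessfully)
            return default;

        // Slow path: fall back to awaiting, which allocates the state machine.
        return AwaitBoth(t0, t1);
    }

    private static async ValueTask AwaitBoth(ValueTask t0, ValueTask t1)
    {
        await t0;
        await t1;
    }
}
```

The synchronous speed boost would then come from never touching the awaiter machinery, on top of the elided loop control flow.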
Updated the PR description with a full run-through of the benchmarks as well.
Great points as always! Reduced the allocations in the asynchronous case.
I also made sync optimization for the foreach publisher, just by "branching inwards".
After extending `NotificationHandlers<TNotification>` to implement `IEnumerable<>`, some inlining decisions changed and doubled latency for some cases, so with this latest commit I'm a bit more careful about extracting the non-common branches into separate functions; that seems to do the trick :smile:
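The extraction pattern referred to here looks roughly like this (a sketch with hypothetical names, not the actual commit): keep the common branch tiny so the JIT is happy to inline it into callers, and pin the rare branch behind `NoInlining` so it can't bloat the inlined body.

```csharp
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class InliningSketch
{
    // Small common path: a good inlining candidate for the JIT.
    public static ValueTask Publish(ValueTask pending)
    {
        if (pending.IsCompletedSuccessfully)
            return default; // common, fully synchronous case

        return AwaitRare(pending);
    }

    // Rare async path, explicitly kept out of the caller's inlined body
    // so it doesn't change the JIT's inlining decision for Publish.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static async ValueTask AwaitRare(ValueTask pending) => await pending;
}
```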
Also included sample and docs. I'm pretty happy with this now
True, it's pretty naive; I guess I'm betting there aren't many inheritance hierarchies out there, at least there aren't in my code :P
We can improve it in subsequent PRs, just want to get a preview out now so that I can test it some more
Solves #115 and #138
- `INotificationPublisher` abstraction similar to that of MediatR, apart from making the interface slightly worse while allowing better performance (avoiding closures, same as for the pipeline behavior interface)
- `Mediator.Abstractions` references .NET Standard 2.0 + .NET 6.0, with conditional package references to
  - `Microsoft.Bcl.AsyncInterfaces` for `IAsyncEnumerable<T>`
  - `System.Runtime.CompilerServices.Unsafe` to avoid the cost of casting
  - `PolySharp` for some attributes
- Still unsure if netstandard2.0 is the way to go.. that might change
TODO

- `ValueTask.WhenAll` variant based on `IValueTaskSource`
- Benchmarks