JakenVeina opened 3 months ago
The std dev was high on the IReadOnly run. Suggests maybe a background task was running on your pc.
Might be worth rerunning those two benchmarks.
I am super impressed with the level of research that has gone into this.
I love that you considered reducing the size of a change by 4 bytes. Every tiny reduction in memory required leads to better performance. I have to be honest that I have always been a little uncomfortable with how much a changeset allocates, yet I came to the conclusion that for 99% of scenarios it's irrelevant. I took the view that for super-high-performance scenarios, perhaps Dynamic Data is not the best fit.
Interestingly, in my experience of working in the finance world for a very long time, I have never found DD to be a bottleneck, yet I always regard squeezing every last improvement out of it as the mission goal. On the other hand, gaming thrashes finance for performance requirements, which is why R3 now exists.
I love that you considered reducing the size of a change by 4 bytes.
I cannot take credit for considering that, at the time, but that thought occurred to me yesterday when I was writing this up.
The fast one seems like it adds too much complexity. Not sure it is worth the hassle to save a few bytes. Also, it seems like having a big struct and then copying to a little struct doesn't really get you anything, except not being able to use the fields that are not valid.
Can't we split the difference between performance and simplicity? And then if the performance isn't good enough, we can talk about how to find it?
The std dev was high on the IReadOnly run. Suggests maybe a background task was running on your pc.
After several runs with all extraneous processes shut down, the results aren't really any different. I was still seeing StDev spike up to as much as 10ms, randomly on different tests. And even then, the mean results weren't meaningfully different.
This was the lowest set of standard-deviation values I was able to get.
| Method | Mean | Error | StdDev | Ratio | Gen0 | Gen1 | Allocated | Alloc Ratio |
|--------------------------- |---------:|--------:|--------:|------:|-----------:|-----------:|----------:|------------:|
| ConsumerInterpretedChanges | 123.6 ms | 2.46 ms | 3.11 ms | 0.45 | 44000.0000 | 1000.0000 | 177.64 MB | 0.95 |
| DynamicDataChangeSets | 272.8 ms | 3.05 ms | 2.55 ms | 1.00 | 46000.0000 | 45000.0000 | 187.38 MB | 1.00 |
| ImmutableArrayChanges | 117.6 ms | 1.51 ms | 1.34 ms | 0.43 | 38000.0000 | - | 154.84 MB | 0.83 |
| IReadOnlyListChanges | 162.0 ms | 2.51 ms | 2.09 ms | 0.59 | 39000.0000 | - | 158.96 MB | 0.85 |
| PolymorphicChangeSets | 130.2 ms | 2.15 ms | 2.20 ms | 0.48 | 36000.0000 | - | 143.97 MB | 0.77 |
I could probably figure out a setup for running benchmarks on my server; there shouldn't be any real risk of CPU time being unavailable there. Or I could just set up a VM there with dedicated cores.
I would say there's not much point, given you've gotten it fairly consistent with the other results in that run. It confirms the trend that the immutable version is faster with fewer allocations.
I've just re-read this and I quite like the Type-Interpreted changeset idea.
I suspect using simple switch expression statements would mean we could avoid using the .AsXXX() methods.
I'll pull the branch locally tomorrow, have a play with it and get a better feel.
the Type-interpreted changeset idea

The `PolymorphicChangeSets` namespace?

we could avoid using the .AsXXX() methods.

The `.AsXXX()` methods are the entire point, within that namespace. Consumers don't have to know and remember the inner details of which fields are populated when, within a `Change`; they just call the `.AsXXX()` method they want, after checking `.Type`. Not having the `.AsXXX()` methods is exactly what's in the `ConsumerInterpretedChanges` namespace, which consistently performed slightly worse.
I've just grabbed your code locally, and I see that I misunderstood. I looked in more detail, and I propose something altogether simpler. The following idea is even lighter weight, as it dispenses with ChangeType / Reason and there's no need for the AsXXX() methods.
```csharp
public interface IKeyedChange<out TKey, out TItem>
{
    public TKey Key { get; }
    public TItem Item { get; }
}

public readonly record struct Add<TKey, TItem>(TKey Key, TItem Item) : IKeyedChange<TKey, TItem>;
public readonly record struct Remove<TKey, TItem>(TKey Key, TItem Item) : IKeyedChange<TKey, TItem>;
public readonly record struct Update<TKey, TItem>(TKey Key, TItem Item, TItem PreviousItem) : IKeyedChange<TKey, TItem>;
```
To iterate:
```csharp
var changes = ImmutableArray<IKeyedChange<TKey, TItem>>.Empty;

foreach (IKeyedChange<TKey, TItem> keyedChange in changes)
{
    switch (keyedChange)
    {
        case Add<TKey, TItem> add:
            // Do Add
            break;
        case Update<TKey, TItem> update:
            // Do Update
            break;
        case Remove<TKey, TItem> remove:
            // Do Remove
            break;
    }
}
```
I like that this code looks like the original DD but it solves the change set problem.
This is what I thought the Type-Interpreted changes were doing, but looking at code on a phone is not always the best!
Edit:
The same idea can be extended to List using:
```csharp
public interface IIndexChange<out TItem>
{
    public int Index { get; }
    public TItem Item { get; }
}
```
So basically very similar to your idea, but using an interface as a constraint and a switch expression to match types.
A further thought:

Do we even need a changeset? Surely `ImmutableArray<IKeyedChange<TKey, TItem>>` is sufficient?
ImmutableArray<IKeyedChange<TKey, TItem>>

This boxes every single individual change onto the heap. I'll write up a little demo when I get a chance, but this is almost guaranteed to tank both speed and memory usage, by a TON.

This is also fundamentally the same as what I did in the `PolymorphicChangeSets` namespace.
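To make the concern concrete, here's a minimal sketch of where the boxing happens when the element type is the interface. The `Person` record and the surrounding scaffolding are made up purely for illustration; it assumes the `IKeyedChange<,>` / `Add<,>` / `Remove<,>` types from your snippet above.

```csharp
using System.Collections.Immutable;

// Hypothetical item type, purely for illustration.
public sealed record Person(string Name);

public static class BoxingDemo
{
    public static void Run()
    {
        // Add<,> and Remove<,> are structs, but the element type of the array is the
        // interface, so every change gets boxed onto the heap as it's stored.
        var changes = ImmutableArray.Create<IKeyedChange<int, Person>>(
            new Add<int, Person>(1, new Person("Alice")),      // boxed
            new Remove<int, Person>(2, new Person("Bob")));    // boxed

        // Pattern-matching back out later doesn't recover that cost; the
        // allocation has already happened once per change.
        foreach (var change in changes)
        {
            if (change is Add<int, Person> add)
                _ = add.Item;
        }
    }
}
```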
Do we even need a changeset? Surely ImmutableArray<IKeyedChange<TKey, TItem>> is sufficient?
Sufficient, yes. But there's potential value in gaining the `.Type` property on each `ChangeSet<>`. It allows consumers to skip looping over every change, and use `.Clear()` and `.AddRange()` instead, when possible. This one's definitely worth a demo and benchmarking, to see how much benefit it provides. In fact, we could really easily take the existing code and hard-code `ChangeSetType.Update` for all changesets, and that would be pretty close to dropping the field completely.
It allows consumers to skip looping over every change, and use .Clear() and .AddRange() instead, when possible
Great point
This boxes every single individual change onto the heap. I'll write up a little demo when I get a chance, but this is almost guaranteed to tank both speed and memory usage, by a TON.
Dang. It would be good to benchmark regardless.
It looks to me like it's certain that we'll be using immutable arrays. Secondly, I think AddRange, Clear, etc. is a great idea and worth including from day 1.
So it's down to what the individual change looks like
Finally, it's also worth considering the old-skool simple singular representation:
```csharp
public enum ChangeReason
{
    //.... blah
}

public readonly record struct Change<TKey, TItem>(ChangeReason Reason, TKey Key, TItem Item, TItem? PreviousItem = default);
```
This is almost identical to the current version, but we are no longer carrying the indices.
I think most importantly it has to balance good performance with simplicity. I suggest we make a commitment as soon as we can, get the basic operators working (i.e. WhereItems, SelectItems, etc.) and re-review.
@dwcullop what are your thoughts?
public readonly record struct Change<TKey, TItem>(ChangeReason Reason, TKey Key, TItem Item, TItem? PreviousItem = default);
The key difference I like is the idea of writing the API such that you CAN'T create an invalid change, e.g. an `Add` where you also supply a `PreviousItem`. Sure, we could just throw an exception in the constructor if you do this, but even better than that (for both DX and performance) is to just have separate API calls you make for an `Add` versus a `Replace` or whatever. And for that, the implementation is named `static` methods.
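A rough sketch of what that could look like, applied to the `Change<TKey, TItem>` record quoted above; the factory names and the `ChangeReason` members here are placeholders, not a concrete proposal:

```csharp
public enum ChangeReason { Add, Remove, Replace } // member names assumed for the sketch

public readonly record struct Change<TKey, TItem>
{
    public ChangeReason Reason { get; }
    public TKey Key { get; }
    public TItem Item { get; }
    public TItem? PreviousItem { get; }

    // Private constructor: the named factories below are the only way to build
    // a change, so an Add can never carry a PreviousItem.
    private Change(ChangeReason reason, TKey key, TItem item, TItem? previousItem)
    {
        Reason = reason;
        Key = key;
        Item = item;
        PreviousItem = previousItem;
    }

    public static Change<TKey, TItem> Add(TKey key, TItem item)
        => new(ChangeReason.Add, key, item, default);

    public static Change<TKey, TItem> Remove(TKey key, TItem item)
        => new(ChangeReason.Remove, key, item, default);

    public static Change<TKey, TItem> Replace(TKey key, TItem item, TItem previousItem)
        => new(ChangeReason.Replace, key, item, previousItem);
}
```

Consumers would then write something like `Change<string, int>.Add("key", 1)` and could never end up with a change whose fields don't match its `Reason`.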
Secondly, I think AddRange, Clear, etc. is a great idea and worth including from day 1.
I'm still not sure if these operations should be supported at the `Change` level or the `ChangeSet` level. I.E. today, `ListChangeReason` includes `Clear` and `AddRange`, while my proposed implementation has `Clear` and `Reset` upon `ChangeSetType`. I think the next thing I'm going to do for vNext is go back to my little test project where I did those other modeling benchmarks, and add these two options.
I've been thinking about add / clear, etc. They could only work if included at a change level. The reason is that it's perfectly legit to have multiple operations in a single changeset. Clear followed by add-range would be a common example, but also, if changes are batched, then any combination can occur.
They could only work if included at a change level.
I don't see how that makes sense, but having them at the change level would allow for more-sophisticated changesets to be generated, like you say. Like, having an `AddRange` followed by a `Remove` as part of the same operation. But I question how much value there is in such a thing. How often would someone WANT to generate such changesets? Because the cost of having them exist at the change level is having the `Change` types be more complex, and most notably, take up more space. Each `Change` type would need at least an additional 8 bytes, to store an `ImmutableArray<>` of the items, and that's a cost upon single-item changes as well.
It definitely sounds like it's worth a demo and benchmark, like the others.
For reference, this is what consuming a changeset looks like, with the current proposal:
```csharp
switch (changeSet.Type)
{
    case ChangeSetType.Clear:
        capturedItems.Clear();
        break;

    case ChangeSetType.Reset:
        {
            capturedItems.Clear();

            // Find where the additions start (a Reset is laid out as removals followed by additions).
            var foundAdditionIndex = changeSet.Changes.BinarySearch(static change => change.Type is KeyedChangeType.Addition);
            if (foundAdditionIndex is not -1)
            {
                var firstAdditionIndex = foundAdditionIndex;
                while ((firstAdditionIndex > 0) && (changeSet.Changes[firstAdditionIndex - 1].Type is KeyedChangeType.Addition))
                    --firstAdditionIndex;

                capturedItems.AddRange(changeSet.Changes[firstAdditionIndex..], static change => change.AsAddition());
            }
        }
        break;

    case ChangeSetType.Update:
        foreach (var change in changeSet.Changes)
            switch (change.Type)
            {
                case KeyedChangeType.Addition:
                    capturedItems.Add(change.AsAddition());
                    break;

                case KeyedChangeType.Removal:
                    capturedItems.Remove(change.AsRemoval().Key);
                    break;

                case KeyedChangeType.Replacement:
                    capturedItems.Replace(change.AsReplacement());
                    break;

                default:
                    throw new InvalidOperationException($"Unsupported {nameof(KeyedChange)} type {change.Type} within {nameof(KeyedChangeSet)} of type {changeSet.Type}");
            }
        break;

    default:
        throw new InvalidOperationException($"Unsupported {nameof(KeyedChangeSet)} type {changeSet.Type}");
}
```
The processing of a `Reset` could maybe be improved. We could add an index field to the struct to indicate where the additions start, when the type is `Reset`, so consumers wouldn't have to search for it. Then maybe we could expose `GetResetRemovals()` and `GetResetAdditions()` methods that return `ReadOnlySpan<>`s.
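A rough sketch of what that might look like; the field and method names here are only illustrative, and it assumes a `Reset` changeset lays out all removals before all additions:

```csharp
using System;
using System.Collections.Immutable;

public readonly struct KeyedChangeSet<TKey, TItem>
{
    private readonly ImmutableArray<KeyedChange<TKey, TItem>> _changes;

    // Only meaningful when Type is Reset: the index at which the additions
    // begin; everything before it is a removal.
    private readonly int _resetAdditionsStartIndex;

    public KeyedChangeSet(
        ChangeSetType type,
        ImmutableArray<KeyedChange<TKey, TItem>> changes,
        int resetAdditionsStartIndex = 0)
    {
        Type = type;
        _changes = changes;
        _resetAdditionsStartIndex = resetAdditionsStartIndex;
    }

    public ChangeSetType Type { get; }

    public ImmutableArray<KeyedChange<TKey, TItem>> Changes => _changes;

    public ReadOnlySpan<KeyedChange<TKey, TItem>> GetResetRemovals()
        => Type is ChangeSetType.Reset
            ? _changes.AsSpan()[.._resetAdditionsStartIndex]
            : throw new InvalidOperationException($"ChangeSet is not a Reset: {Type}");

    public ReadOnlySpan<KeyedChange<TKey, TItem>> GetResetAdditions()
        => Type is ChangeSetType.Reset
            ? _changes.AsSpan()[_resetAdditionsStartIndex..]
            : throw new InvalidOperationException($"ChangeSet is not a Reset: {Type}");
}
```

Consumers handling a `Reset` could then slice straight into the two halves instead of binary-searching for the boundary.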
I wanted to lay out my thought processes for the different prototype changeset models I wrote up here.
First, the benchmark results, for reference:
Polymorphic ChangeSets
This came from the first time I tried, intuitively, to do something DynamicData-like in a personal project, a while ago, long before I came across and started investing in DynamicData. The basic concept was to do what `System.Collections.ObjectModel.ObservableCollection<T>` does, but in Rx land. Each "action" that you can take upon a collection gets rolled up into an object class and published, and we use polymorphism (I.E. a base class) to tie them all together. When I came back to this idea more recently, and wrote it up more broadly to mimic what DynamicData can do, I came up with everything here.

To summarize: each different type of collection operation gets its own unique implementation of the interface. Add, AddRange, Remove, Clear, etc. For example...
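Since the linked code isn't inlined here, the following is only a sketch of the general shape, with names made up rather than taken from the demo:

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;

// The base class ties every operation-specific changeset together.
// (The real prototype also has each changeset expose its changes via IEnumerable<>.)
public abstract class ChangeSet<TKey, TItem>
{
}

// A single-item operation needs no array allocation; the item is a plain field.
public sealed class AddChangeSet<TKey, TItem> : ChangeSet<TKey, TItem>
{
    public AddChangeSet(TKey key, TItem item)
    {
        Key = key;
        Item = item;
    }

    public TKey Key { get; }
    public TItem Item { get; }
}

// A multi-item operation carries its items as a collection.
public sealed class AddRangeChangeSet<TKey, TItem> : ChangeSet<TKey, TItem>
{
    public AddRangeChangeSet(ImmutableArray<KeyValuePair<TKey, TItem>> items)
        => Items = items;

    public ImmutableArray<KeyValuePair<TKey, TItem>> Items { get; }
}
```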
This would be one way to get the "optimization for single-change changesets" we keep talking about, since those classes don't have to do any array allocations, they can just track the single-item-in-question as a plain field.
I think this kinda falls apart when the implementation still requires all changesets to implement `IEnumerable<>` for changes, which I figure is necessary to support all the crazy types of operator combinations you can get in DynamicData. After combining many operators together, you're much more likely to have changesets that don't fit into any of the specific Add/AddRange/Remove/Replace/etc. operations, and instead just have an intermixing of different types of changes. The more of those that end up getting generated, the more we're just approaching having a plain changeset model similar to today's, and we gain little benefit out of writing all the polymorphic classes if they're not getting regularly USED in a polymorphic manner. Plus, the effort to write operators to leverage the polymorphism (I.E. check what type each changeset is, and cast to that) was greatly increased, as there's 7-9 different types of changesets to account for, depending on whether you're talking about Keyed changes or Sorted changes.

I'm on board with doing extra work on our end, if it means things are simpler and more performant on the consumer's end, but that's not really what happened. Benchmarks showed that this approach was only the third-fastest out of the 4 concepts I tried, and was the best for memory usage, but not by much.
Conclusion
Ultimately, I'm not convinced the extra effort is worth the gains for this implementation. Not unless we could take another crack at it and improve the speed.
Consumer-Interpreted Changes
This is the closest to what DynamicData has today, with the main difference being the use of `ImmutableArray<>` to list changes, instead of `List<T>`/`IReadOnlyList<T>`. I call this "consumer interpreted" because the `Change` types themselves have very little logic in them, and it's left to the consumer to implement correct "interpretation" of the changes. I.E. the consumer has to know that `OldItem` is only populated for `Replace` and `Remove` changes, or whatever the rules are. The same principle would be applied if we had `OldIndex` and `NewIndex` values, like DynamicData has for keyed changesets: the consumer is responsible for knowing when those are supposed to be populated, and when they're not.

Summarizing the full implementation from here...
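Roughly, the shape is something like the following (approximate only, since the real definitions are behind the link; the `Optional<T>` wrapper and member names here are stand-ins):

```csharp
// Approximate shape only; Optional<T> is just a stand-in for whatever
// optional-value wrapper the demo actually uses.
public readonly struct Optional<T>
{
    public Optional(T value)
    {
        Value = value;
        HasValue = true;
    }

    public bool HasValue { get; }
    public T? Value { get; }
}

public enum KeyedChangeType { Addition, Removal, Replacement }

// All fields are publicly accessible; the consumer is responsible for knowing
// which ones are actually populated for a given Type.
public readonly record struct KeyedChange<TKey, TItem>(
    KeyedChangeType Type,
    TKey Key,
    Optional<TItem> NewItem,
    Optional<TItem> OldItem);
```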
At the time, I thought it made sense for `NewItem` to always refer to items going INTO the collection, and `OldItem` to always refer to items coming out, I.E. `NewItem` is optional because it would not be populated for `Remove` changes, `OldItem` would instead. In retrospect, I'm not convinced this is a better approach than what DynamicData does, where `Current` is ALWAYS populated, and `Previous` is only used for replacements. If nothing else, it reduces the struct size by 4 bytes by not having a `bool` embedded in `NewItem`.

I implemented this version really only to see what the differences in performance would be, compared to the `ImmutableArrayChanges` version, whose only difference is having more logic in the `Change` structs, and it's basically the same. The speed and memory usage differences are within the margin of error.

Conclusion
This option really is a "developer experience" choice. I.E. do we think we can do better?
Type-Interpreted Changes
This is actually the `ImmutableArrayChanges` namespace in the demo, even though ALL of the prototypes use `ImmutableArray<>` to list changes, except for `DynamicDataChangeSets` and `IReadOnlyListChanges`.

The idea here is to take away ambiguity as much as possible, in how consumers use the `Change` types. Maybe it's overkill, since I'm sure very few consumers outside of DynamicData are actually rolling their own operators, but I think another way to look at it is that making changes "easier" (hopefully) to interpret and create might actually encourage consumers to do more of their own customizations on top of DynamicData.

I would say the core idea of this approach is actually the same as the "Polymorphic ChangeSets" approach: force the consumer to determine what type of change this is, before they can access any of its state. Except, we can't actually DO polymorphism with `struct`s, without allocating, and I think there's no question that we want to avoid allocations anywhere in the `Change` layer.

So, I came up with the following:
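In rough shape (approximate, since the full definition is in the linked demo; the small `KeyedAddition`/`KeyedRemoval`/`KeyedReplacement` return types and the factory names are placeholders), it's something like this:

```csharp
using System;

public enum KeyedChangeType { Addition, Removal, Replacement }

public readonly record struct KeyedAddition<TKey, TItem>(TKey Key, TItem Item);
public readonly record struct KeyedRemoval<TKey, TItem>(TKey Key, TItem Item);
public readonly record struct KeyedReplacement<TKey, TItem>(TKey Key, TItem NewItem, TItem OldItem);

public readonly struct KeyedChange<TKey, TItem>
{
    private readonly TKey _key;
    private readonly TItem _newItem;
    private readonly TItem _oldItem;

    private KeyedChange(KeyedChangeType type, TKey key, TItem newItem, TItem oldItem)
        => (Type, _key, _newItem, _oldItem) = (type, key, newItem, oldItem);

    // Type is the single source of truth for which fields are meaningful, so the
    // item fields can be plain TItem rather than Optional<TItem>.
    public KeyedChangeType Type { get; }

    public static KeyedChange<TKey, TItem> Addition(TKey key, TItem item)
        => new(KeyedChangeType.Addition, key, item, default!);

    public static KeyedChange<TKey, TItem> Removal(TKey key, TItem item)
        => new(KeyedChangeType.Removal, key, default!, item);

    public static KeyedChange<TKey, TItem> Replacement(TKey key, TItem newItem, TItem oldItem)
        => new(KeyedChangeType.Replacement, key, newItem, oldItem);

    // State is only exposed through the .AsXXX() methods, which throw if the
    // change isn't actually of the requested type.
    public KeyedAddition<TKey, TItem> AsAddition()
        => Type is KeyedChangeType.Addition
            ? new KeyedAddition<TKey, TItem>(_key, _newItem)
            : throw new InvalidOperationException($"Change is not an addition: {Type}");

    public KeyedRemoval<TKey, TItem> AsRemoval()
        => Type is KeyedChangeType.Removal
            ? new KeyedRemoval<TKey, TItem>(_key, _oldItem)
            : throw new InvalidOperationException($"Change is not a removal: {Type}");

    public KeyedReplacement<TKey, TItem> AsReplacement()
        => Type is KeyedChangeType.Replacement
            ? new KeyedReplacement<TKey, TItem>(_key, _newItem, _oldItem)
            : throw new InvalidOperationException($"Change is not a replacement: {Type}");
}
```

Compared to the consumer-interpreted shape above, the fields are private and the items are plain `TItem` fields; the only way at the data is through the type-checked accessors.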
Looking at it from a different perspective, this is a fancy way to reduce the size of the `KeyedChange` struct by another 4 bytes by swapping `NewItem` from `Optional<TItem>` to just `TItem`, since all the information you need to know whether `NewItem` and `OldItem` are populated is actually inside the `KeyedChangeType Type` value. And by exposing the item and key data only through those `.AsXXX()` methods, it forces the consumer to check `Type` first, to figure out which one to call.

I implemented the `.AsXXX()` methods to `throw` if they're called for the wrong change type, but we could even do alternate versions that skip the safety check, for performance. Maybe keep those as `internal` only, or maybe just make it clear in the documentation and naming that they're dangerous.

I also was wondering recently if there's a way to leverage `ref struct`s or some other mechanism for the `.AsXXX()` methods, as a way to maybe provide a "window" over the state values of the `Change` struct, instead of having them copied out. I'm not terribly familiar with all that stuff.

Conclusion
This is the implementation I lean towards, as it came in as the fastest (although, only within the margin of error) and the second-best for memory usage. And as I mentioned above, there might be a few more tweaks possible to improve it.
IReadOnlyList<> Changes
Pretty self explanatory, this one: it's identical to the last one, but with `ImmutableArray<>` swapped out for `IReadOnlyList<>`. Really, just to demonstrate that `IReadOnlyList<>` is indeed slower, although not by a whole lot. However, I'll wager that the reason for "not by a whole lot" is because I put extra work into never using `foreach` over the lists, and always doing a plain `for` instead.

Implementation here.
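(Not from the linked implementation, just a generic illustration of why that matters when the list is only known through the interface:)

```csharp
using System.Collections.Generic;

public static class EnumerationSketch
{
    // foreach over an interface-typed list goes through IEnumerable<T>, which
    // typically boxes the underlying enumerator, i.e. one heap allocation per
    // enumeration.
    public static int SumWithForeach(IReadOnlyList<int> values)
    {
        var sum = 0;
        foreach (var value in values)
            sum += value;
        return sum;
    }

    // A plain indexed for avoids the enumerator entirely, at the cost of an
    // interface call per element for Count and the indexer.
    public static int SumWithFor(IReadOnlyList<int> values)
    {
        var sum = 0;
        for (var i = 0; i < values.Count; i++)
            sum += values[i];
        return sum;
    }
}
```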
Conclusion
`ImmutableArray<>` seems to be categorically better than `IReadOnlyList<>`, so long as we take the care to pre-allocate as much as we can, as I've done in the demo implementation.

Final Conclusion
It seems pretty compelling that basically all 4 of the alternate implementations I came up with perform about twice as fast, or better, than the DynamicData equivalents, and use less memory.
What do y'all think?