twop closed this 2 years ago
Used ValueOption instead of option, and used Span instead of ArraySlice when appropriate
Here are the latest results (note the slight increase in memory consumption but slightly better perf):
Method | depth | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|
ProcessMessages | 10 | 147.5 ms | 1.66 ms | 1.55 ms | 42750.0000 | 16500.0000 | 1500.0000 | 138 MB |
ProcessMessages | 15 | 2,910.4 ms | 36.78 ms | 34.40 ms | 476000.0000 | 158000.0000 | 61000.0000 | 1,536 MB |
Not sure if it's really worth it. We have slightly worse perf and memory at low depth, and a barely noticeable perf win compared to what we will pay in GC pauses at bigger depth.
What do you think?
I think it is: even though it uses more memory, it is easier on the GC and faster at runtime. Note that on M1, cache misses are less severe because of the large cache and really wide CPU lines.
So I think it is better. I'm curious how it is going to behave on mobile devices.
@TimLariviere I believe I resolved/fixed all comments. Please take a look one more time. Happy to fix any other issues/concerns.
Note: depends on #36
Fixes: https://github.com/TimLariviere/Fabulous-new/issues/17
Motivation
One of the goals of Fabulous is to be fast; in other words, to add a minimal amount of overhead on top of the underlying UI framework (like XF or MAUI).
The way I see it we need to optimize 3 things:
This PR is mostly concerned with 2 and 3
Approach
What is done
Stack allocated collections
MutStackArray1
MutStackArray1, in that case both arrays are considered to be "consumed"
Yield extension to work with Computation Expressions
(usedElmCount, array)
ArraySlice
(usedElmCount, array)
MutStackArray1
struct (uint16, array<'T>)
DiffBuilder
uint16 via stackalloc of fixed size (8)
(add | remove | change * index), where the operation is encoded into 2 bits and 14 bits are reserved for the index
StackArray3
StackList)
StackList
Adding 1, 2, 3, 4 to an empty StackList will result in
Stack: 1
Stack: 1, 2
Stack: 1, 2, 3
Heap: (1, 2, 3), Stack: 4 <-- first allocation
Used in WidgetBuilder for storing scalar attributes. The assumption is that most Widgets won’t have more than 3 scalars, and the ones that do will allocate just 1-2 times (at the 4th and 7th elements respectively).
Usage summary
- StackList in WidgetBuilder
- DiffBuilder for diffing scalar attributes
- MutStackArray1 in CollectionBuilder, AttributeCollectionBuilder and the rest of the diffing in Reconciler
- StackArray3 currently not used because it is inferior to StackList
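Two of the layouts described above are easy to misread, so here is a rough model of each: the DiffBuilder word that packs an operation into 2 bits and an index into 14 bits, and the StackList behaviour where three items live inline and a fourth triggers the first heap allocation. This is a sketch in Python rather than the PR's F#, and it only models the bit layout and the allocation count, not real stack allocation. The names `encode_op`, `decode_op` and `StackListModel`, the numeric operation tags, and the choice to put the operation in the top 2 bits are all assumptions made for illustration; the PR text does not specify them.

```python
# Illustrative model only. Names, operation tags and bit placement are
# assumptions for this sketch; they are not taken from the Fabulous code.

OP_ADD, OP_REMOVE, OP_CHANGE = 0, 1, 2   # assumed tags for the 2-bit operation
INDEX_BITS = 14
INDEX_MASK = (1 << INDEX_BITS) - 1       # 0x3FFF, largest index that fits in 14 bits


def encode_op(op, index):
    """Pack (op, index) into one 16-bit word, op assumed to sit in the top 2 bits."""
    if op not in (OP_ADD, OP_REMOVE, OP_CHANGE):
        raise ValueError("unknown operation")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index does not fit in 14 bits")
    return (op << INDEX_BITS) | index


def decode_op(word):
    """Split a 16-bit word back into (op, index)."""
    return word >> INDEX_BITS, word & INDEX_MASK


class StackListModel:
    """Counts the heap allocations of a list that keeps up to 3 items inline.

    Python lists are heap objects themselves; `inline` merely stands in for
    the three stack slots so that the spill points can be counted.
    """

    INLINE_CAPACITY = 3

    def __init__(self):
        self.inline = []        # models the 3 inline (stack) slots
        self.heap_chunks = []   # each spilled chunk models one heap allocation
        self.allocations = 0

    def add(self, item):
        if len(self.inline) == self.INLINE_CAPACITY:
            # Inline slots are full: move them to the heap as a single chunk.
            self.heap_chunks.append(tuple(self.inline))
            self.allocations += 1
            self.inline = [item]
        else:
            self.inline.append(item)

    def items(self):
        """Return all items in insertion order."""
        out = [x for chunk in self.heap_chunks for x in chunk]
        out.extend(self.inline)
        return out
```

In this model, adding 1 through 7 bumps `allocations` on the 4th and 7th add, which matches the trace and the "1-2 allocations" assumption stated for WidgetBuilder. An index limit of 16383 follows directly from reserving 14 bits.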
Other changes
- Widget, WidgetDiff now have either voption or option as fields to avoid allocations
- Array.tryFind allocates an Option if it holds a value type (structs), replaced with handwritten check
- WidgetBuilder now has many constructors that mostly replace ViewHelper functions needed for that
- MutStackArray1 in Yield)
- MutStackArray1 ctor internal for safety reasons. Although tricky with CE inlining
- ArraySlice is now widely used, often in combination with MutStackArray1 as an output (for example in Reconciler)
Benchmark
Finally the numbers!
Not only did the diffing get significantly faster, but it now allocates almost 50% less memory.
Note that the memory numbers below are taken with a different growth function. Now the growth rate of MutStackArray1 is 1.5 (note that it is less than ResizeArray). It is possible that we should be even more conservative and use 1.3 because most of our collections are small.
Summary
ProcessMessages benchmark
Before
After
On main branch
After these changes