mbdavid / LiteDB

LiteDB - A .NET NoSQL Document Store in a single data file
http://www.litedb.org
MIT License
8.35k stars 1.22k forks source link

Changing the DisposableAction to reduce allocations #2495

Closed pictos closed 2 weeks ago

pictos commented 3 weeks ago

This PR just changes the DisposableAction to be a readonly struct and the extension method to not return the IDisposable interface, those changes reduce the allocation in by 27%.

Previous: image

Changed: image

JKamsker commented 3 weeks ago

Well, yes, structs can be faster, but also slower when they have to be moved to the heap.

There are a lot of awaits in between, and it's hard to believe that there are only sync continuations.

Also, the disposable stopwatch is used in an IEnumerable state machine. That means it will be put on the heap. Have you tried your benchmarks with yields in between?

JKamsker commented 3 weeks ago

Also: Do you think its beneficial to just put stopwatch into the disposable action and rename it to StopwatchDisposer or smth like that?

pictos commented 3 weeks ago

Well, yes, structs can be faster, but also slower when they have to be moved to the heap. There are a lot of awaits in between, and it's hard to believe that there are only sync continuations.

Well, that depends, struct are expensive when they're big and passed by value. Because of that the async machinery use a lot of ref (you may know the most pieces are structs) to box and pass its members way.

The struct, that we're using here, just holds an Action reference so they shouldn't be so expensive to pass by value. But since it's always good to not trust on compiler in our head I added some IEnumerable benchmark to confirm that.

Also: Do you think its beneficial to just put stopwatch into the disposable action and rename it to StopwatchDisposer or smth like that?

TBH no, having it receiving an Action is better, because will make it more general, so we can use it in other scenarios. On C# 13 we can (try to) change to a ref struct, since they're better for it and will not be promoted to the heap. Not sure how is the automatic translation, since the article is in pt-br, but I've an article about this approach using ref struct here.

JKamsker commented 3 weeks ago

I have added my own benchmarks. Well, i'd say the performance improvements are negligeable

| Method                       | Job                  | Runtime              | Mean         | Error      | StdDev     | Median       | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|----------------------------- |--------------------- |--------------------- |-------------:|-----------:|-----------:|-------------:|------:|--------:|-------:|----------:|------------:|
| Struct_SumWithRangeHavingSw  | Job-IWLYFU           | .NET 6.0             | 4,717.268 ns | 62.4304 ns | 58.3974 ns | 4,709.875 ns |  0.85 |    0.02 | 0.0153 |     168 B |        0.99 |
| Struct_SumWithRangeHavingSw  | .NET 6.0             | .NET 6.0             | 4,716.725 ns | 66.0657 ns | 61.7979 ns | 4,706.924 ns |  0.85 |    0.02 | 0.0153 |     168 B |        0.99 |
| Struct_SumWithRangeHavingSw  | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 5,564.845 ns | 95.5894 ns | 89.4144 ns | 5,533.903 ns |  1.00 |    0.00 | 0.0229 |     169 B |        1.00 |
|                              |                      |                      |              |            |            |              |       |         |        |           |             |
| Class_SumWithRangeHavingSw   | Job-IWLYFU           | .NET 6.0             | 4,656.619 ns | 54.4246 ns | 50.9088 ns | 4,657.379 ns |  0.80 |    0.01 | 0.0229 |     192 B |        0.99 |
| Class_SumWithRangeHavingSw   | .NET 6.0             | .NET 6.0             | 4,776.531 ns | 93.5817 ns | 82.9577 ns | 4,796.747 ns |  0.82 |    0.02 | 0.0229 |     192 B |        0.99 |
| Class_SumWithRangeHavingSw   | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 5,800.899 ns | 59.6063 ns | 55.7558 ns | 5,799.857 ns |  1.00 |    0.00 | 0.0305 |     193 B |        1.00 |

24 bytes more in allocations. The mean is sometimes faster and sometimes slower than the struct version.

JKamsker commented 2 weeks ago

Tbh, I do not see any real use for this. I mean, yes it can be useful in situations where it crosses no yields and only stays local but here it is not the case. Either refactor the code that uses the original to not use it anymore (ugly nested tryf) or leave it as is.