dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Provide a ValueLazy<T> type #49978

Open tannergooding opened 3 years ago

tannergooding commented 3 years ago

Background and Motivation

Today, you can use Lazy<T> to help perform lazy initialization of types. Lazy<T> itself contains an optional delegate, a LazyHelper field that is a cached static readonly instance, and T. This means just using Lazy<T> requires 1-3 allocations, and any access to the underlying T involves at least one additional indirection. Usages are typically private to the type and, while the underlying T might be exposed, the Lazy<T> instance itself is not. Given its usage, this seems like unnecessary overhead, and providing a value equivalent would allow reduced allocations (particularly when the value is never instantiated) and faster access to the underlying value once it has been created.

Proposed API

public struct ValueLazy<T>
{
    public ValueLazy();
    public ValueLazy(bool isThreadSafe);
    public ValueLazy(Func<T> valueFactory);
    public ValueLazy(LazyThreadSafetyMode mode);
    public ValueLazy(T value);
    public ValueLazy(Func<T> valueFactory, bool isThreadSafe);
    public ValueLazy(Func<T> valueFactory, LazyThreadSafetyMode mode);

    public bool IsValueCreated { get; }
    public T Value { get; }
}

Usage Examples

You would use this exactly as you would an instance of Lazy<T>. Lazy<T> itself could be implemented as a simple wrapper over ValueLazy<T>, allowing the APIs to be versioned in sync without code duplication.
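To make the intended usage concrete, here is a minimal sketch of what such a type and a consumer might look like. It is an assumption-laden illustration, not the proposed implementation: it only covers the `Func<T>` constructor, behaves like `LazyThreadSafetyMode.None` (no thread safety), and the `Document` class and its `LoadText` method are hypothetical names.

```csharp
using System;

// Hypothetical sketch only: single-threaded, Func<T> constructor only.
public struct ValueLazy<T>
{
    private Func<T> _valueFactory; // cleared once the value has been created
    private T _value;
    private bool _isValueCreated;

    public ValueLazy(Func<T> valueFactory)
    {
        _valueFactory = valueFactory ?? throw new ArgumentNullException(nameof(valueFactory));
        _value = default;
        _isValueCreated = false;
    }

    public bool IsValueCreated => _isValueCreated;

    public T Value
    {
        get
        {
            if (!_isValueCreated)
            {
                _value = _valueFactory();
                _valueFactory = null; // allow the delegate to be collected
                _isValueCreated = true;
            }
            return _value;
        }
    }
}

public class Document
{
    public static int LoadCount;

    // Must not be readonly: reading Value mutates the struct in place.
    private ValueLazy<string> _text = new ValueLazy<string>(LoadText);

    public string Text => _text.Value;
    public bool IsLoaded => _text.IsValueCreated;

    private static string LoadText()
    {
        LoadCount++;
        return "expensive text";
    }
}
```

The state lives inline in `Document`, so reading `Text` after the first access is a field load on the containing object rather than a hop through a separate `Lazy<T>` instance.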

Alternative Designs

Libraries can expose their own ValueLazy<T> type if they feel it is worthwhile.

Risks

As with any mutable value type, there is risk of it being used incorrectly if it is marked readonly or if it is passed around by value. Given the typical use of Lazy<T>, I don't think this is a concern.

dotnet-issue-labeler[bot] commented 3 years ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

davidfowl commented 3 years ago

We should call it FastLazy<T>, that's how you market a type and make everyone use it 😄

tfenise commented 3 years ago

How does it compare to the static methods System.Threading.LazyInitializer.EnsureInitialized?

stephentoub commented 3 years ago

This means just using Lazy<T> requires 1-3 allocations, and any access to the underlying T involves at least one additional indirection

As written, would your proposal really move the needle? It'll save one small allocation, for the Lazy<T> itself. Assuming an implementation similar to Lazy<T>, new ValueLazy<T>(() => ...) will still incur the LazyHelper allocation, creating one that closes over state will still incur at least a delegate allocation if not also a closure (or best case accessing a static field to access a cached delegate), accessing Value will still require a delegate invocation, etc.

tannergooding commented 3 years ago

How does it compare to the static methods System.Threading.LazyInitializer.EnsureInitialized?

It avoids you needing to declare and manage multiple separate fields and to call the relevant initialization methods yourself. Instead, it provides everything in a simple to use package.
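For comparison, this is roughly what the `LazyInitializer.EnsureInitialized` approach looks like today, using two existing overloads. The `Settings` class and its members are hypothetical; the point is only to show the separate fields that the caller has to declare and manage.

```csharp
using System;
using System.Threading;

public class Settings
{
    // Reference type: null means "not created yet". No lock is used, so
    // racing threads may each run the factory and the first published value wins.
    private string _name;

    // Value type (or nullable result): needs a flag and a lock object as well.
    private int _answer;
    private bool _answerInitialized;
    private object _answerLock;

    public string Name =>
        LazyInitializer.EnsureInitialized(ref _name, () => "computed name");

    public int Answer =>
        LazyInitializer.EnsureInitialized(
            ref _answer, ref _answerInitialized, ref _answerLock, () => 42);
}
```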

tannergooding commented 3 years ago

As written, would your proposal really move the needle?

@stephentoub, I think so, yes.

will still incur the LazyHelper allocation

The LazyHelper is a shared allocation in most setups and only allocates a custom one for ExecutionAndPublication mode so that it can be used in a lock statement.

creating one that closes over state will still incur at least a delegate allocation if not also a closure (or best case accessing a static field to access a cached delegate)

This means in the typical setup, you will have 1 allocation (the delegate) that lives up until the value is actually created. There are many ways to ensure delegates don't themselves incur additional allocations for closures when this is important.

accessing Value will still require a delegate invocation

This only occurs once, when the value needs to be created. Afterwards it is simply returning the field and the underlying delegate is nulled out so it can be collected. That being said, I think the more important detail here is the reduced indirection for accessing the underlying data after it has been created and for checking if the value has been created at all.

If you look at existing use cases (https://source.dot.net/#System.Private.CoreLib/Lazy.cs,8b99c1f377873554,references) a fairly typical setup looks to be:

public class MyClass
{
    private Lazy<string> _stringValue;
    private Lazy<int> _intValue;

    public MyClass()
    {
        _stringValue = new Lazy<string>(ExpensiveCall1);
        _intValue = new Lazy<int>(ExpensiveCall2);
    }

    public string StringValue => _stringValue.Value;

    public int Int32Value => _intValue.Value;
}

This means that you have a class, wrapping a class, wrapping a value and that to access the underlying value or even to see if the underlying value has been initialized you must go through an additional layer of indirection. These additional indirections add up and can hurt perf, particularly for frequently accessed values or deep chains of Lazy.

One example of where this can be particularly problematic is when you are creating managed wrappers over native handles. If you take ClangSharp for example, it effectively wraps the Clang AST bindings in .NET. Being native bindings, the underlying C++ types are exposed to C# as opaque handles with corresponding C style methods taking the handle as the first parameter.

Given that this is an AST for C++, you can end up with a lot of metadata and very deep trees of objects. In order to speed up perf and avoid needlessly allocating an object for everything in the tree, you use Lazy<T> and a Dictionary<Handle, ManagedWrapper> to ensure that only the traversed sections of the AST get wrappers created and to allow a single shared allocation per underlying handle.

This works nicely and does provide a good speedup plus reduced allocations. However, this also means that to traverse the tree, you get one additional indirection at every layer. Having a ValueLazy<T> allows for the same setup, the same overall overhead, but without the additional indirection at every step.


.NET has explicit concepts for "value type" vs "reference type" and there is no way to say "I want this reference type to be inlined so I can avoid an indirection". We are also very much reference heavy, which is generally a good thing and the desired behavior for publicly exposed data.

The downside to this is that when the data is an internal implementation detail, we have potentially many additional indirections to access data when all you really wanted was a thin wrapper over T or T[] plus some basic logic for manipulating it.

For example, if you wanted to have a custom collection type, you are probably wrapping List<T> and your custom collection type is itself a reference type. That collection type is itself likely meant for reuse in another reference type. Now, to access your data you have to go ref (MyClass)->ref (CustomCollection)->ref (List<T>)->ref (T[]) when the practical use case is either ref (MyClass)->ref (T[]) (the collection is an internal implementation detail) or ref (MyClass)->ref (CustomCollection)->ref (T[]) (the collection is publicly exposed somewhere). -- https://source.dot.net/#PresentationFramework/System/Windows/Controls/Panel.cs,181

jkotas commented 3 years ago

There are more than 5 lazy initialization patterns today:

These patterns cover many points on the convenience/performance spectrum.

Do we really need to have a 6th pattern? This proposal looks very close to LazyInitializer.EnsureInitialized.

stephentoub commented 3 years ago

and only allocates a custom one for ExecutionAndPublication mode so that it can be used in a lock statement.

And that's the default and what 99% of folks use.

This means in the typical setup, you will have 1 allocation (the delegate) that lives up until the value is actually created.

Typical use also has a closure allocation.

sharwell commented 3 years ago

I'm concerned about the difficulty of using ValueLazy<T> correctly without support for non-copyable value types. LazyInitializer.EnsureInitialized makes it very difficult to misuse this pattern under the current compiler limitations.

tannergooding commented 3 years ago

This proposal looks very close to LazyInitializer.EnsureInitialized.

Yes, it is similar, but in a convenient to use and easy to locate type. At least I didn't know LazyInitializer existed until today, and based on the hearts/thumbs up, I'd guess at least one other person didn't either 😄 At the very least, LazyInitializer looks to be missing a (ref T, ref bool) overload; it only has (ref T, ref bool, ref object) and (ref T, ref bool, ref object, System.Func<T>). The other overloads don't take a bool and require T to be a reference type. Likewise, it only has options for ExecutionAndPublication or None; it doesn't support the PublicationOnly safety mode for when that is appropriate.

I'd be fine with expanding LazyInitializer to have the missing overload so you can avoid the lock. But I still think ValueLazy<T> is much nicer and easier to use compared to private (bool IsInitialized, T Value) _value;

And that's the default and what 99% of folks use.

There are options, however, to reduce these allocations and indirections when that is important. Users can avoid the lock and the closure if it is important (e.g. multithreaded access is a non-concern), but it is not as obvious how to avoid the indirection/allocation of Lazy<T> itself, since it's hidden away in a static class in System.Threading.

I'm concerned about the difficulty of using ValueLazy<T> correctly without support for non-copyable value types

I agree there is a pit of failure with mutable value types. However, we have this same "pit" in many other cases and existing analyzers that flag when they are marked readonly or potentially used incorrectly (such as for SpinLock) and which could be extended to support this.
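The readonly pitfall mentioned here is easy to reproduce with any mutable struct. The sketch below uses a hypothetical `Counter` struct rather than a lazy type so that it stays small; a lazy struct stored in a readonly field would fail the same way, re-running its factory on every access because each call works on a defensive copy.

```csharp
using System;

public struct Counter
{
    private int _count;

    // Mutates the struct it is called on.
    public int Next() => ++_count;
}

public class Holder
{
    private readonly Counter _readonlyCounter; // calls operate on a defensive copy
    private Counter _mutableCounter;           // calls mutate the field in place

    public int NextFromReadonly() => _readonlyCounter.Next();
    public int NextFromMutable() => _mutableCounter.Next();
}

public static class Program
{
    public static void Main()
    {
        var holder = new Holder();
        Console.WriteLine(holder.NextFromReadonly()); // 1
        Console.WriteLine(holder.NextFromReadonly()); // 1 (the increment was lost)
        Console.WriteLine(holder.NextFromMutable());  // 1
        Console.WriteLine(holder.NextFromMutable());  // 2
    }
}
```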

stephentoub commented 3 years ago

it doesn't support PublicationOnly safety mode for when that is appropriate.

Sure it does. That's the primary behavior of EnsureInitialized; it's what all the overloads that don't take a lock object use.
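As an illustration of that publication-only behavior, the lock-free path can be approximated as below. This is a sketch of the technique under my own assumptions, not the library source; `EnsureInitializedSketch` and `PublicationOnlySketch` are hypothetical names.

```csharp
using System;
using System.Threading;

public static class PublicationOnlySketch
{
    // Several threads may run valueFactory concurrently; only the first
    // published instance is kept and every caller receives that instance.
    public static T EnsureInitializedSketch<T>(ref T target, Func<T> valueFactory)
        where T : class
    {
        T existing = Volatile.Read(ref target);
        if (existing != null)
        {
            return existing;
        }

        T created = valueFactory()
            ?? throw new InvalidOperationException("valueFactory returned null.");

        // Publish only if the slot is still empty; otherwise use the winner's value.
        return Interlocked.CompareExchange(ref target, created, null) ?? created;
    }
}
```

Because null doubles as the "not initialized" marker, this shape only works for reference types where null is not a valid result.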

At least I didn't know LazyInitializer existed until today, and based on the hearts/thumbs up, I'd guess at least one other person didn't either 😄

I don't see how adding yet another type makes it more discoverable.

LazyInitializer looks to be missing a (ref T, ref bool) overload

What do you expect that overload to do?

There are options, however, to reduce these allocations and indirections when that is important. Users can avoid the lock and the closure if it is important (e.g. multithreaded access is a non-concern), but it is not as obvious how to avoid the indirection/allocation of Lazy<T> itself, since it's hidden away in a static class in System.Threading.

Lazy<T> generally saves you very little code. If you're profiling and seeing particular costs associated with Lazy<T> and finding ways to remove them, I highly doubt you're going to be happy just by replacing Lazy<T> with ValueLazy<T>.

tannergooding commented 3 years ago

I don't see how adding yet another type makes it more discoverable.

You type Lazy and you see ValueLazy in Intellisense. ValueX being an allocation free version of X with possibly a few different rules is not an uncommon pattern now. Other types with Lazy in the name are LazyInitializer and LazyThreadSafetyMode, neither of which are "obviously" what you want.

What do you expect that overload to do?

To check bool and create the instance without a lock. Given the current code, it would be effectively:

if (!Volatile.Read(ref initialized))
{
    target = valueFactory();
    Volatile.Write(ref initialized, true);
}

or

if (!Volatile.Read(ref initialized))
{
    try
    {
        target = Activator.CreateInstance<T>();
    }
    catch (MissingMethodException)
    {
        throw new MissingMemberException(SR.Lazy_CreateValue_NoParameterlessCtorForT);
    }

    Volatile.Write(ref initialized, true);
}

depending on if you wanted the default constructor or Func<T>.

Lazy<T> generally saves you very little code. If you're profiling and seeing particular costs associated with Lazy<T> and finding ways to remove them, I highly doubt you're going to be happy just by replacing Lazy<T> with ValueLazy<T>.

I'll collect some numbers for ClangSharp and possibly another one of my interop projects. I had previously created my own ValueLazy<T> type which simply used SpinWait and a volatile field because the constructors were small (basically a dictionary lookup and a constructor call that assigns an IntPtr field) and the overhead was measurable for the indirections. It also helped cut down overall memory usage by a bit since every object is at least 30 some bytes due to internal overhead.

stephentoub commented 3 years ago

To check bool and create the instance without a lock. Given the current code, it would be effectively:

Not for value types in general (you'd risk tearing them), and for reference types you don't need the bool at all unless you actually want to treat null as a valid value.

You type Lazy and you see ValueLazy in Intellisense

You type Lazy and you see LazyInitializer, too.

ValueX being an allocation free version of X

It's not allocation-free. The 99% case is still going to allocate.

tannergooding commented 3 years ago

Not for value types in general (you'd risk tearing them)

And this is completely acceptable for certain cases and is possible with Lazy<T>, but not with LazyInitializer

and for reference types you don't need the bool at all unless you actually want to treat null as a valid value

LazyInitializer exposes overloads that don't take bool that are constrained where T : class and which support having a lock or not having one. The overloads that take bool are unconstrained (and therefore support value or reference types) but require a lock.

It's not allocation-free. The 99% case is still going to allocate.

Sorry, I mean allocation free as in no additional allocation required to represent the state. You aren't forced to have an allocation or indirection to contain the other state/allocations. Likewise, if you don't need a lock and you just need new T() it can be truly allocation free or if the delegate doesn't capture state, it can be cached and reused to reduce the remaining allocations.

stephentoub commented 3 years ago

And this is completely acceptable for certain cases and is possible with Lazy<T>, but not with LazyInitializer

You won't get tearing with Lazy<T>, unless you erroneously use the non-thread-safe mode from multiple threads. Tearing is not acceptable: we will not add something that claims to be thread-safe but might tear value types.

LazyInitializer exposes overloads that don't take bool that are constrained where T : class and which support having a lock or not having one. The overloads that take bool are unconstrained (and therefore support value or reference types) but require a lock.

I designed LazyInitializer; I understand what it looks like. 😄

Are you saying you want an overload on LazyInitializer that uses a spin lock instead of an object? What exactly do you want it to spin watching?

Sorry, I mean allocation free as in no additional allocation required to represent the state.

new Lazy<T>(() => something) allocates a LazyHelper; that would still be the case for a ValueLazy<T> in order to "represent the state". That is the 99.9% use case. Very little code that uses lazy selects a different execution mode.

tannergooding commented 3 years ago

I designed LazyInitializer; I understand what it looks like.

Sorry, was just trying to explain that I understood you didn't need bool for reference types and that's why we had two sets of overloads. It's just that there is functionality supported by Lazy<T> which is not exposed on LazyInitializer.

Having that functionality available via LazyInitializer, now that I know it exists would be reasonable. Having it exposed via ValueLazy<T> ensures parity without questions of "what are the safety guarantees" (because it's covered by ThreadSafetyMode, same as in Lazy<T> where you can tear under None) and covers my own use case and potentially use cases of other users.

Are you saying you want an overload on LazyInitializer that uses a spin lock instead of an object? What exactly do you want it to spin watching?

That would be reasonable and would at least cover the "missing" functionality for PublicationOnly. It could take an int or an enum LazyInitializationState?

new Lazy<T>(() => something) allocates a LazyHelper; that would still be the case for a ValueLazy<T> in order to "represent the state". That is the 99.9% use case. Very little code that uses lazy selects a different execution mode.

My general point is that when you want or need a different execution mode, ValueLazy<T> would allow you to do it and avoid the allocations.

In the case of ClangSharp, it's mostly all Lazy<SomeReferenceType>, and these are all simple wrappers over an IntPtr Handle and other Lazy<T> or List<Lazy<T>>, depending on what children the AST node can have. The execution mode can be PublicationOnly, and the create methods are all able to be function pointers or cached delegates because they simply do GetOrCreate<SomeReferenceType>(handle) (which internally does a lookup in a Dictionary<Handle, SomeReferenceType> and effectively calls new SomeReferenceType(handle) if an existing allocation doesn't exist).

Now that I know LazyInitializer exists, it looks like I can use SomeReferenceType EnsureInitialized(ref SomeReferenceType, Func<SomeReferenceType> valueFactory). The downside here is that two racing threads may each call valueFactory and the first one wins, so there may be an unnecessary short-lived allocation. A SpinWait could avoid that by only allowing the thread that gets the Initializing state to call valueFactory(), and it would also handle the case where null is valid (which sometimes happens in the AST).

stephentoub commented 3 years ago

That would be reasonable and would at least cover the "missing" functionality for PublicationOnly. It could take an int or an enum LazyInitializationState?

Do you mean a ref to one? i.e. you want the caller to pass in a location that the implementation would spin on?

My general point is that when you want or need a different execution mode, ValueLazy<T> would allow you to do it and avoid the allocations.

Just one of them.

tannergooding commented 3 years ago

Do you mean a ref to one? i.e. you want the caller to pass in a location that the implementation would spin on?

Yes, rather than ref bool you would pass in ref int or ref LazyInitializationState and that would be used in the SpinWait.
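One possible shape for the `ref int` variant described in this exchange is sketched below. It is not an existing `LazyInitializer` overload; `SpinLazy`, the state constants, and the retry-after-exception policy are all assumptions made for illustration.

```csharp
using System;
using System.Threading;

public static class SpinLazy
{
    private const int Uninitialized = 0;
    private const int Initializing = 1;
    private const int Initialized = 2;

    // Only the thread that moves state from Uninitialized to Initializing runs
    // the factory; other callers spin until the value has been published.
    // Note: a factory that re-enters this method for the same slot would spin forever.
    public static T EnsureInitialized<T>(ref T target, ref int state, Func<T> valueFactory)
    {
        var spinner = new SpinWait();
        while (true)
        {
            int observed = Volatile.Read(ref state);
            if (observed == Initialized)
            {
                return target;
            }

            if (observed == Uninitialized &&
                Interlocked.CompareExchange(ref state, Initializing, Uninitialized) == Uninitialized)
            {
                try
                {
                    target = valueFactory();
                }
                catch
                {
                    // Assumption: a failed factory leaves the slot open for a retry.
                    Volatile.Write(ref state, Uninitialized);
                    throw;
                }

                Volatile.Write(ref state, Initialized); // publish after target is written
                return target;
            }

            spinner.SpinOnce();
        }
    }
}
```

Since the state is tracked separately from the value, null is a valid result, and a value-type `target` is only ever written by the single winning thread before it is published.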

mburbea commented 3 years ago

I would think a better way to avoid allocation for the 90% case would be to implement #26255. As pretty much any non-trivial valueFactory is likely to need state.

// Allocates a Lazy<string>, a delegate, and a closure to capture the state.
private readonly Lazy<string> _lazyValue = new Lazy<string>(() => ExpensiveCall());
public string LazyValue => _lazyValue.Value;

// Only allocation is the delegate, and that should be reusable for any instantiation of this type.
// Assumes the state-taking EnsureInitialized overload proposed in #26255.
private string _value;
public string LazyValueAlt => LazyInitializer.EnsureInitialized(ref _value, this, t => t.ExpensiveCall());

With C#10 proposed field keyword, this will become even more concise, as we no longer need to declare a separate field to store the state.

For cases where null is a legitimate value or you are lazy initializing a struct, then yes you still need to declare a bool and an object synclock, but I feel those are the more rare usages of this api.

ZacharyPatten commented 3 years ago

I have SLazy<T> and ValueLazy<T> types, and I ran benchmarks on this topic.

Towel.SLazy<T> Source Code
Towel.ValueLazy<T> Source Code
TerraFX.ValueLazy<T> Source Code

TerraFX: 0.1.0-alpha-1021147070
Towel: 1.0.36

Initialization Benchmark (first time .Value is called)

``` ini
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1110 (20H2/October2020Update)
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
  [Host]     : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT
  Job-FXODVA : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT

InvocationCount=1  UnrollFactor=1
```

| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
|----------------- |------ |-------------:|-------------:|-------------:|-------------:|------:|--------:|
| **Lazy** | **1** | **339.8 ns** | **17.57 ns** | **51.25 ns** | **300.0 ns** | **1.00** | **0.00** |
| TowelSLazy | 1 | 323.0 ns | 21.48 ns | 63.33 ns | 300.0 ns | 0.97 | 0.24 |
| TowelValueLazy | 1 | 250.0 ns | 0.00 ns | 0.00 ns | 250.0 ns | 0.75 | 0.10 |
| TerraFXValueLazy | 1 | 183.3 ns | 12.98 ns | 37.46 ns | 200.0 ns | 0.55 | 0.15 |
| | | | | | | | |
| **Lazy** | **10** | **675.0 ns** | **16.22 ns** | **43.56 ns** | **700.0 ns** | **1.00** | **0.00** |
| TowelSLazy | 10 | 400.0 ns | 0.00 ns | 0.00 ns | 400.0 ns | 0.60 | 0.04 |
| TowelValueLazy | 10 | 500.0 ns | 0.00 ns | 0.00 ns | 500.0 ns | 0.73 | 0.04 |
| TerraFXValueLazy | 10 | 400.0 ns | 0.00 ns | 0.00 ns | 400.0 ns | 0.60 | 0.04 |
| | | | | | | | |
| **Lazy** | **100** | **2,968.2 ns** | **58.32 ns** | **71.62 ns** | **3,000.0 ns** | **1.00** | **0.00** |
| TowelSLazy | 100 | 2,287.5 ns | 34.78 ns | 34.16 ns | 2,300.0 ns | 0.77 | 0.02 |
| TowelValueLazy | 100 | 2,350.0 ns | 48.94 ns | 76.20 ns | 2,300.0 ns | 0.79 | 0.03 |
| TerraFXValueLazy | 100 | 2,454.2 ns | 50.60 ns | 65.80 ns | 2,400.0 ns | 0.83 | 0.03 |
| | | | | | | | |
| **Lazy** | **1000** | **24,991.7 ns** | **115.32 ns** | **90.03 ns** | **25,000.0 ns** | **1.00** | **0.00** |
| TowelSLazy | 1000 | 20,000.0 ns | 67.06 ns | 74.54 ns | 20,000.0 ns | 0.80 | 0.01 |
| TowelValueLazy | 1000 | 20,530.8 ns | 57.53 ns | 48.04 ns | 20,500.0 ns | 0.82 | 0.00 |
| TerraFXValueLazy | 1000 | 22,578.6 ns | 307.12 ns | 272.25 ns | 22,600.0 ns | 0.90 | 0.01 |
| | | | | | | | |
| **Lazy** | **10000** | **221,810.5 ns** | **11,143.49 ns** | **31,972.77 ns** | **210,000.0 ns** | **1.00** | **0.00** |
| TowelSLazy | 10000 | 193,483.3 ns | 436.35 ns | 340.68 ns | 193,400.0 ns | 0.77 | 0.06 |
| TowelValueLazy | 10000 | 198,800.0 ns | 905.69 ns | 707.11 ns | 198,550.0 ns | 0.79 | 0.06 |
| TerraFXValueLazy | 10000 | 223,676.9 ns | 1,412.01 ns | 1,179.09 ns | 223,000.0 ns | 0.89 | 0.07 |

Caching Benchmark (calling .Value multiple times)

``` ini
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1110 (20H2/October2020Update)
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
  [Host]     : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT
  DefaultJob : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT
```

| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
|----------------- |------- |-------------:|-----------:|-----------:|-------------:|------:|--------:|
| **Lazy** | **1** | **29.74 ns** | **0.619 ns** | **1.631 ns** | **28.93 ns** | **1.00** | **0.00** |
| TowelSLazy | 1 | 24.81 ns | 0.522 ns | 0.981 ns | 24.79 ns | 0.84 | 0.05 |
| TowelValueLazy | 1 | 19.20 ns | 0.375 ns | 0.695 ns | 18.83 ns | 0.65 | 0.04 |
| TerraFXValueLazy | 1 | 29.69 ns | 0.214 ns | 0.190 ns | 29.63 ns | 0.95 | 0.03 |
| | | | | | | | |
| **Lazy** | **10** | **33.20 ns** | **0.158 ns** | **0.132 ns** | **33.17 ns** | **1.00** | **0.00** |
| TowelSLazy | 10 | 28.04 ns | 0.573 ns | 1.004 ns | 27.84 ns | 0.86 | 0.04 |
| TowelValueLazy | 10 | 23.43 ns | 0.404 ns | 0.567 ns | 23.34 ns | 0.72 | 0.02 |
| TerraFXValueLazy | 10 | 39.51 ns | 0.072 ns | 0.064 ns | 39.52 ns | 1.19 | 0.01 |
| | | | | | | | |
| **Lazy** | **100** | **85.37 ns** | **0.496 ns** | **0.464 ns** | **85.30 ns** | **1.00** | **0.00** |
| TowelSLazy | 100 | 74.67 ns | 0.345 ns | 0.322 ns | 74.64 ns | 0.87 | 0.01 |
| TowelValueLazy | 100 | 69.18 ns | 0.451 ns | 0.376 ns | 69.17 ns | 0.81 | 0.01 |
| TerraFXValueLazy | 100 | 121.87 ns | 0.417 ns | 0.348 ns | 121.79 ns | 1.43 | 0.01 |
| | | | | | | | |
| **Lazy** | **1000** | **502.77 ns** | **2.989 ns** | **2.796 ns** | **502.66 ns** | **1.00** | **0.00** |
| TowelSLazy | 1000 | 505.46 ns | 10.143 ns | 18.029 ns | 493.86 ns | 1.03 | 0.04 |
| TowelValueLazy | 1000 | 485.01 ns | 1.958 ns | 1.736 ns | 484.72 ns | 0.96 | 0.01 |
| TerraFXValueLazy | 1000 | 953.46 ns | 2.692 ns | 2.518 ns | 952.51 ns | 1.90 | 0.01 |
| | | | | | | | |
| **Lazy** | **10000** | **4,802.94 ns** | **93.329 ns** | **107.477 ns** | **4,770.75 ns** | **1.00** | **0.00** |
| TowelSLazy | 10000 | 4,650.32 ns | 25.513 ns | 21.304 ns | 4,641.85 ns | 0.97 | 0.02 |
| TowelValueLazy | 10000 | 4,840.35 ns | 95.440 ns | 172.099 ns | 4,847.97 ns | 1.01 | 0.04 |
| TerraFXValueLazy | 10000 | 9,355.54 ns | 179.173 ns | 175.971 ns | 9,290.58 ns | 1.95 | 0.06 |
| | | | | | | | |
| **Lazy** | **100000** | **46,398.71 ns** | **312.350 ns** | **260.826 ns** | **46,301.40 ns** | **1.00** | **0.00** |
| TowelSLazy | 100000 | 46,357.69 ns | 330.515 ns | 292.993 ns | 46,283.58 ns | 1.00 | 0.01 |
| TowelValueLazy | 100000 | 46,346.54 ns | 235.699 ns | 220.473 ns | 46,325.34 ns | 1.00 | 0.01 |
| TerraFXValueLazy | 100000 | 92,496.29 ns | 335.415 ns | 313.747 ns | 92,353.48 ns | 1.99 | 0.01 |

Construction Benchmark (including memory allocations)

``` ini
BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.1110 (20H2/October2020Update)
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.100-preview.6.21355.2
  [Host]     : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT
  DefaultJob : .NET 5.0.8 (5.0.821.31504), X64 RyuJIT
```

| Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Gen 1 | Gen 2 | Allocated |
|----------------- |------ |--------------:|-------------:|-------------:|--------------:|------:|--------:|---------:|------:|------:|------------:|
| **Lazy** | **1** | **20.08 ns** | **0.137 ns** | **0.121 ns** | **20.06 ns** | **1.00** | **0.00** | **0.0382** | **-** | **-** | **160 B** |
| TowelSLazy | 1 | 17.17 ns | 0.077 ns | 0.072 ns | 17.15 ns | 0.86 | 0.01 | 0.0287 | - | - | 120 B |
| TowelValueLazy | 1 | 10.87 ns | 0.056 ns | 0.049 ns | 10.86 ns | 0.54 | 0.00 | 0.0210 | - | - | 88 B |
| TerraFXValueLazy | 1 | 17.65 ns | 0.331 ns | 0.293 ns | 17.54 ns | 0.88 | 0.01 | 0.0210 | - | - | 88 B |
| | | | | | | | | | | | |
| **Lazy** | **10** | **186.83 ns** | **3.736 ns** | **6.242 ns** | **185.08 ns** | **1.00** | **0.00** | **0.3309** | **-** | **-** | **1,384 B** |
| TowelSLazy | 10 | 127.62 ns | 1.453 ns | 1.359 ns | 127.36 ns | 0.68 | 0.02 | 0.2351 | - | - | 984 B |
| TowelValueLazy | 10 | 87.78 ns | 0.567 ns | 0.531 ns | 87.69 ns | 0.47 | 0.02 | 0.1587 | - | - | 664 B |
| TerraFXValueLazy | 10 | 121.24 ns | 1.898 ns | 3.119 ns | 119.87 ns | 0.65 | 0.03 | 0.1585 | - | - | 664 B |
| | | | | | | | | | | | |
| **Lazy** | **100** | **1,743.06 ns** | **7.764 ns** | **6.883 ns** | **1,740.52 ns** | **1.00** | **0.00** | **3.2558** | **-** | **-** | **13,624 B** |
| TowelSLazy | 100 | 1,290.09 ns | 25.756 ns | 52.027 ns | 1,275.82 ns | 0.72 | 0.02 | 2.3003 | - | - | 9,624 B |
| TowelValueLazy | 100 | 832.64 ns | 16.473 ns | 20.833 ns | 823.85 ns | 0.48 | 0.01 | 1.5354 | - | - | 6,424 B |
| TerraFXValueLazy | 100 | 1,259.78 ns | 25.030 ns | 61.398 ns | 1,251.76 ns | 0.71 | 0.03 | 1.5354 | - | - | 6,424 B |
| | | | | | | | | | | | |
| **Lazy** | **1000** | **18,528.50 ns** | **361.315 ns** | **529.610 ns** | **18,564.63 ns** | **1.00** | **0.00** | **32.5012** | **-** | **-** | **136,024 B** |
| TowelSLazy | 1000 | 13,087.17 ns | 258.804 ns | 516.860 ns | 13,085.26 ns | 0.71 | 0.04 | 22.9492 | - | - | 96,024 B |
| TowelValueLazy | 1000 | 7,888.33 ns | 58.363 ns | 45.566 ns | 7,885.90 ns | 0.42 | 0.01 | 15.3046 | - | - | 64,024 B |
| TerraFXValueLazy | 1000 | 11,711.58 ns | 86.696 ns | 81.096 ns | 11,714.44 ns | 0.63 | 0.02 | 15.3046 | - | - | 64,024 B |
| | | | | | | | | | | | |
| **Lazy** | **10000** | **180,555.15 ns** | **1,490.649 ns** | **1,244.760 ns** | **180,128.83 ns** | **1.00** | **0.00** | **325.1953** | **-** | **-** | **1,360,024 B** |
| TowelSLazy | 10000 | 126,633.83 ns | 2,527.836 ns | 4,870.283 ns | 124,516.07 ns | 0.71 | 0.03 | 229.4922 | - | - | 960,024 B |
| TowelValueLazy | 10000 | 80,238.32 ns | 1,587.154 ns | 2,737.762 ns | 78,941.20 ns | 0.46 | 0.02 | 152.9541 | - | - | 640,024 B |
| TerraFXValueLazy | 10000 | 124,471.84 ns | 2,426.863 ns | 3,239.793 ns | 124,872.33 ns | 0.69 | 0.02 | 152.9541 | - | - | 640,024 B |

Source Code For Benchmarks

```cs
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using TowelSLazy = Towel.SLazy<int>;
using TowelValueLazy = Towel.ValueLazy<int>;
using TerraFXValueLazy = TerraFX.ValueLazy<int>;

class Program
{
    static void Main()
    {
        BenchmarkRunner.Run<SLazyInitializationBenchmarks>();
        BenchmarkRunner.Run<SLazyCachingBenchmarks>();
        BenchmarkRunner.Run<SLazyConstructionBenchmarks>();
    }
}

public class SLazyInitializationBenchmarks
{
    private Lazy<int>[]? lazys;
    private TowelSLazy[]? towelSLazys;
    private TowelValueLazy[]? towelValueLazy;
    private TerraFXValueLazy[]? terraFXValuelazys;

    [Params(1, 10, 100, 1000, 10000)]
    public int N;

    internal int temp = -1;

    [IterationSetup]
    public void IterationSetup()
    {
        lazys = new Lazy<int>[N];
        for (int i = 0; i < N; i++) { lazys[i] = new(() => i); }
        towelSLazys = new TowelSLazy[N];
        for (int i = 0; i < N; i++) { towelSLazys[i] = new(() => i); }
        towelValueLazy = new TowelValueLazy[N];
        for (int i = 0; i < N; i++) { towelValueLazy[i] = new(() => i); }
        terraFXValuelazys = new TerraFXValueLazy[N];
        for (int i = 0; i < N; i++) { terraFXValuelazys[i] = new(() => i); }
    }

    [GlobalCleanup]
    public void GlobalCleanUp() { Console.WriteLine(temp); }

    [Benchmark(Baseline = true)]
    public void Lazy() { for (int i = 0; i < N; i++) { temp = lazys![i].Value; } }

    [Benchmark]
    public void TowelSLazy() { for (int i = 0; i < N; i++) { temp = towelSLazys![i].Value; } }

    [Benchmark]
    public void TowelValueLazy() { for (int i = 0; i < N; i++) { temp = towelValueLazy![i].Value; } }

    [Benchmark]
    public void TerraFXValueLazy() { for (int i = 0; i < N; i++) { temp = terraFXValuelazys![i].Value; } }
}

public class SLazyCachingBenchmarks
{
    [Params(1, 10, 100, 1000, 10000, 100000)]
    public int N;

    internal int temp = -1;

    [GlobalCleanup]
    public void GlobalCleanUp() { Console.WriteLine(temp); }

    [Benchmark(Baseline = true)]
    public void Lazy()
    {
        Lazy<int> value = new(() => -1);
        for (int i = 0; i < N; i++) { temp = value.Value; }
    }

    [Benchmark]
    public void TowelSLazy()
    {
        TowelSLazy value = new(() => -1);
        for (int i = 0; i < N; i++) { temp = value.Value; }
    }

    [Benchmark]
    public void TowelValueLazy()
    {
        TowelValueLazy value = new(() => -1);
        for (int i = 0; i < N; i++) { temp = value.Value; }
    }

    [Benchmark]
    public void TerraFXValueLazy()
    {
        TerraFXValueLazy value = new(() => -1);
        for (int i = 0; i < N; i++) { temp = value.Value; }
    }
}

[MemoryDiagnoser]
public class SLazyConstructionBenchmarks
{
    [Params(1, 10, 100, 1000, 10000)]
    public int N;

    [Benchmark(Baseline = true)]
    public void Lazy() { for (int i = 0; i < N; i++) { _ = new Lazy<int>(() => i); } }

    [Benchmark]
    public void TowelSLazy() { for (int i = 0; i < N; i++) { _ = new TowelSLazy(() => i); } }

    [Benchmark]
    public void TowelValueLazy() { for (int i = 0; i < N; i++) { _ = new TowelValueLazy(() => i); } }

    [Benchmark]
    public void TerraFXValueLazy() { for (int i = 0; i < N; i++) { _ = new TerraFXValueLazy(() => i); } }
}
```

Conclusions

| | Lazy<T> | Towel.SLazy<T> | Towel.ValueLazy<T> | TerraFX.ValueLazy<T> |
|---|---|---|---|---|
| Caching | 3rd | 2nd | 1st | 4th |
| Construction | 4th | 2nd | 1st | 2nd |
| Initialization | ? | ? | ? | ? |
| Memory | 4th | 3rd | 1st | 1st |
| Struct Copy Safe* | n/a | ✔️ | ❌ | ❌ |

* Struct Copy Safe: whether or not the lazy is safe from struct copies, meaning struct copies will not cause the factory delegate to get called multiple times. See examples below.

int i1 = 0;
System.Lazy<int> a = new(() => i1++);
System.Lazy<int> b = a; // <- not a struct copy
Console.WriteLine(a.Value); // 0
Console.WriteLine(b.Value); // 0

int i2 = 0;
Towel.SLazy<int> c = new(() => i2++);
Towel.SLazy<int> d = c; // <- struct copy
Console.WriteLine(c.Value); // 0
Console.WriteLine(d.Value); // 0

int i3 = 0;
Towel.ValueLazy<int> e = new(() => i3++);
Towel.ValueLazy<int> f = e; // <- struct copy
Console.WriteLine(e.Value); // 0
Console.WriteLine(f.Value); // 1 <- called delegate twice!!!

int i4 = 0;
TerraFX.ValueLazy<int> g = new(() => i4++);
TerraFX.ValueLazy<int> h = g; // <- struct copy
Console.WriteLine(g.Value); // 0
Console.WriteLine(h.Value); // 1 <- called delegate twice!!!

There need to be more comparisons than just this. Please double-check my work.