DevTeam / Pure.DI

Pure DI for .NET
MIT License
410 stars 21 forks source link

Enumerable injection issue #52

Closed ekalchev closed 2 months ago

ekalchev commented 2 months ago

I am injecting IEnumerable< IService >. My expectation is to receive IEnumerable with two already created intances but I got instead of instances, factories. So basically the behavior is like I am injecting IEnumerable< Func < IService >>.

Is this by design?


    .Bind<IService>(1).To(ctx =>
    {
        return new StagingService();
    })
    .Bind<IService>(2).To(ctx =>
    {

        return new RegistrationService();
    })

    class ServiceHost
    {
        ServiceHost(IEnumerable<IService> services)
        {
            foreach (IService service in services) // each iteration will create new instance of IService
            {

            }

            foreach (IService service in services) // iterating again, does not reuse instances but create a new ones
            {

            }
        }
    }
NikolayPianikov commented 2 months ago
using Pure.DI;

var composition = new Composition();
var host = composition.Host;

DI.Setup(nameof(Composition))
    .Bind(1).To<StagingService>()
    .Bind(2).To<RegistrationService>()
    .Root<ServiceHost>("Host");

interface IService;

class StagingService : IService
{
    public StagingService() => Console.WriteLine(GetType());
}

class RegistrationService : IService
{
    public RegistrationService() => Console.WriteLine(GetType());
}

class ServiceHost
{
    public ServiceHost(IEnumerable<IService> services)
    {
        foreach (var service in services)
        {
        }

        foreach (var service in services)
        {
        }
    }
}

The result is this:

StagingService
RegistrationService
StagingService
RegistrationService

IEnumerable assumes lazy object resolution. That is, object resolution will occur each time on each new iteration. You have a lot of options.

First, you can "materialize" services into a collection before iterating, such as services.ToList():

class ServiceHost
{
    public ServiceHost(IEnumerable<IService> services)
    {
        var serviceList = services.ToList();
        foreach (var service in serviceList)
        {
        }

        foreach (var service in serviceList)
        {
        }
    }
}

The result is this:

StagingService
RegistrationService

Second you can inject the collection right away:

class ServiceHost
{
    public ServiceHost(IReadOnlyCollection<IService> services)
    {
        foreach (var service in services)
        {
        }

        foreach (var service in services)
        {
        }
    }
}
NikolayPianikov commented 2 months ago

@ekalchev To understand how it works, you can just explore the generated code:

public ServiceHost Host
{
  get
  {
    IEnumerable<IService> EnumerationOf_perBlockIEnumerable()
    {
      yield return new StagingService();
      yield return new RegistrationService();
    }

    IEnumerable<IService> perBlockIEnumerableM05D16di1 = EnumerationOf_perBlockIEnumerable();
    return new ServiceHost(perBlockIEnumerableM05D16di1);
  }
}
ekalchev commented 2 months ago

The current behavior regarding the injection of IEnumerable seems to differ from the typical handling in other DI containers, where an IEnumerable would be expected to contain a set of instances rather than factories. For cases where lazy initialization is desired, utilizing IEnumerable<Lazy > or IEnumerable<Func > could provide clearer intent and align with common DI patterns.

Consider a scenario where a developer is working with a subset of a codebase that utilizes IEnumerable but isn’t aware that Pure.DI is being used. Without this context, it might not be immediately apparent that IEnumerable is yielding new instances on each iteration, which could lead to unexpected behavior. It’s important for such nuances to be intuitively understood.

Additionally, with that behavior you must be aware if a Enumerable instance is supplied by a DI container (and not as argument) so you must call .ToArray() to ensure you can iterate safely this collection.

NikolayPianikov commented 2 months ago

@ekalchev I agree that this behavior is different from classic DI containers. Pure.DI is not a container but a code generator that creates compositions of objects as if you were writing code by hand. Right now it is like this and I think this is the right approach to use IEnumerable<T>:

public ServiceHost Host
{
  get
  {
    IEnumerable<IService> CreateEnumerable()
    {
      yield return new StagingService();
      yield return new RegistrationService();
    }

    return new ServiceHost(CreateEnumerable());
  }
}

And I'm sure it's the optimal code - that's how I would write it manually :)

MS DI does not work correctly in terms of "laziness" of IEnumerable<T>:

using Microsoft.Extensions.DependencyInjection;

var serviceProvider = new ServiceCollection()
    .AddTransient<IService, StagingService>()
    .AddTransient<IService, RegistrationService>()
    .AddTransient<ServiceHost>()
    .BuildServiceProvider();

serviceProvider.GetRequiredService<ServiceHost>();

interface IService;

class StagingService : IService
{
    public StagingService() => Console.WriteLine(GetType());
}

class RegistrationService : IService
{
    public RegistrationService() => Console.WriteLine(GetType());
}

class ServiceHost
{
    public ServiceHost(IEnumerable<IService> services)
    {
        Console.WriteLine(GetType());

        foreach (var service in services)
        {
        }        
    }
}

The result is this:

StagingService
RegistrationService
ServiceHost

But it should be like this:

ServiceHost
StagingService
RegistrationService

What if I don't iterate over services at all, or I iterate depending on the condition or I only need the first element? Same thing with repeated iterations. Classic DI containers don't work correctly in terms of IEnumerable<T> being "lazy". They create objects that are not needed and at the wrong time. They store them in memory so that they can be returned again in the next iterations. But you can always change this behavior and make it similar to classic DI containers, for example, like this:

using Pure.DI;

var composition = new Composition();
var host = composition.Host;

// This setup works globally for all setups in this project
DI.Setup("", CompositionKind.Global)
    .Bind<IEnumerable<TT>>().To(ctx =>
    {
        ctx.Inject(out IReadOnlyCollection<TT> items);
        return items;
    });

DI.Setup(nameof(Composition))
    .Bind(1).To<StagingService>()
    .Bind(2).To<RegistrationService>()
    .Root<ServiceHost>("Host");

interface IService;

class StagingService : IService
{
    public StagingService() => Console.WriteLine(GetType());
}

class RegistrationService : IService
{
    public RegistrationService() => Console.WriteLine(GetType());
}

class ServiceHost
{
    public ServiceHost(IEnumerable<IService> services)
    {
        Console.WriteLine(GetType());
        foreach (var service in services)
        {
        }

        foreach (var service in services)
        {
        }
    }
}

Now works not right from my point of view :), but as in classic DI containers:

StagingService
RegistrationService
ServiceHost

If you want to override the behavior for your setup only:

DI.Setup(nameof(Composition))
    .Bind<IEnumerable<TT>>().To(ctx =>
    {
        ctx.Inject(out IReadOnlyCollection<TT> items);
        return items;
    })
    .Bind(1).To<StagingService>()
    .Bind(2).To<RegistrationService>()
    .Root<ServiceHost>("Host");
ekalchev commented 2 months ago

Classic DI containers don't work correctly in terms of IEnumerable being "lazy". They create objects that are not needed and at the wrong time

What in IEnumerable imply it should do anything lazy? As I said if you want laziness you should work with IEnumerable < Func > or Lazy< T >. This way you correctly communicate with the readers of the code what would happen when iterating. In the example with IEnumerable< Func< T > > you get a factory. So you have laziness.

Classic containers behavior is not because they can't do. They don't want to do what you did here. It is just wrong. If I want to make the instantiation lazy I could inject Func<IEnumerable< T > > that won't create the instances until I need them. But you clearly see that behavior in the type you are injecting. If I inject IEnumerable - I explicitly ask non-lazy instantiation of all items T and I want them instantiated before calling the class constructor that accept that argument.

Lets say you work with 3rd party library that is not aware what DI container you use. Passing IEnumerable constructed by Pure.DI will not work at all. What code expect the behavior that you current have for IEnumerable?

I understand that you might want factory like behavior but that should not be hidden behind IEnumerable.

I am not asking for change or something, just trying to point out this approach is fundamentally wrong. The code base should not be aware if it is run in the context of source generated DI or classic container.

NikolayPianikov commented 2 months ago

If I use IEnumerable<T> I assume that each new element will be received/processed only when it is iterated. That's how the whole LINQ works, and that's how yield return and the state machines that are created for it work, and that's how most of the API works, e.g. Directory.EnumerateFiles(), File.ReadLines(), etc.

Every time you read lines using this API you will read them from a file instead of memory:

var lines = File.ReadLines("abc.txt");
foreach (var line in lines)
{
  // There will only be one line from the file in memory at any given time!
}

foreach (var line in lines) // CA1851: Possible multiple enumerations
{
}

Imagine that you have a file of size N GB. If you want to read the whole file, you use File.ReadAllLines("abc.txt") and the result is not IEnumerable<T>.

There is even a code analysis check that will return CA1851: Possible multiple enumerations of IEnumerable collection if you try to iterate over IEnumerable<T> several times and there can be side effects because of this.

To get a collection and iterate over it multiple times I use IReadOnlyCollection<T>. It implements IEnumerable<T> and the Count property, which assumes that all elements are already present.

I'm not sure if Func<IEnumerable<T>>> is needed in IEnumerable<T>, since every enumeration already creates an IEnumerator<T>> at every enumeration.

I agree that this behavior is different from classic DI containers, but in my opinion it is correct. If you want the behavior that works there now, it is better to use IReadOnlyCollection<T>. But in any case you can change it:

.Bind<IEnumerable<TT>>().To(ctx =>
{
    ctx.Inject(out IReadOnlyCollection<TT> items);
    return items;
})
ekalchev commented 2 months ago

Thanks for pointing out how to workaround this issue.