container-interop / service-provider

[EXPERIMENTAL] Promoting container/framework interoperability through standard service providers
71 stars 10 forks source link

Altering/Composing service providers #41

Closed moufmouf closed 9 months ago

moufmouf commented 7 years ago

Hey guys,

I have a feeling that we are missing a useful interface to "alter" the list of services provided by service providers. This is only a feeling so far and I have nothing to show yet, but I thought I might share it with you here.

Here is what triggered this thinking:

Let's say you have an application that needs to log in 2 different places. Maybe you need a logger that logs in the PHP error log, and another logger that logs in a GELF Graylog server. Those 2 loggers are provided by 2 service providers. Right now, the service providers would probably look like this:

class ErrorLogLoggerServiceProvider implements ServiceProvider {
    public function getServices()
    {
        return [
            // We assume ErrorLogLogger is a PSR-3 compliant logger that logs in PHP error log
            LoggerInterface::class => function() { return new ErrorLogLogger(); }
        ]
    }
}
class GraylogLoggerServiceProvider implements ServiceProvider {
    public function getServices()
    {
        return [
            // We assume ErrorLogLogger is a PSR-3 compliant logger that logs in PHP error log
            LoggerInterface::class => function(ContainerInterface $container) {
                // We assume GelfLogger is a PSR-3 compliant logger that logs in Graylog
                return new GelfLogger($container->get('graylog.host'));
            }
        ]
    }
}

At this point, you can see that we have a problem. If we register the 2 service providers to our container, the second logger will override the first logger, so we will only log in one logger.

What we would like is to have the ability to add a third CompositeLogger logger that would be able to call both loggers. We could actually already do it with the current proposal. All we have to do is put all the loggers in a "loggers" service and have this "loggers" service be used by the composite logger:

class ErrorLogLoggerServiceProvider implements ServiceProvider {
    public function getExtensions()
    {
        return [
            "loggers" => function(ContainerInterface $container, array $loggers = []) {
        $loggers = new ErrorLogLogger();
                return $loggers;
        }
        ]
    }
}
class GraylogLoggerServiceProvider implements ServiceProvider {
    public function getServices()
    {
        return [
            "loggers" => function(ContainerInterface $container, array $loggers = []) {
        $loggers = new GelfLogger($container->get('graylog.host'))
                return $loggers;
        }
        ]
    }
}
class CompositeLoggerServiceProvider implements ServiceProvider {
    public function getServices()
    {
        return [
            LoggerInterface::class => function(ContainerInterface $container) {
                return new CompositeLogger($container->get('loggers'));
            }
        ]
    }
}

This does actually work. But there is a big assumption here. We are assuming that the ErrorLogLoggerServiceProvider and GraylogLoggerServiceProvider know they must write in the "loggers" service. So basically, they must have been designed with the assumption that some day, they could be used in a composite logger. In practice however, it is far from likely that service provider authors will be able to anticipate this kind of use case.

So the question I'm asking is this: assuming that we have a bunch of service providers that are providing services in the LoggerInterface::class entry, how could we easily add another service provider that has the ability to compose those?

The first idea that came to my mind was that we could compose service providers together.

Something like this:

$serviceProviders[] = new CompositeLoggerServiceProvider(
    new ErrorLogLoggerServiceProvider();
    new GraylogLoggerServiceProvider();
);
class CompositeLoggerServiceProvider implements ServiceProvider {
    private $serviceProviders;

    public function __construct(array $serviceProviders) {
        $this->serviceProviders = $serviceProviders;
    }

    public function getServices()
    {
        $composedFactories = [];
        $loggerFactories = [];

        foreach ($this->serviceProviders as $serviceProvider) {
            $factories = $serviceProvider->getServices();

            if (isset($factories[LoggerInterface::class])) {
                $loggerFactories[] = $factories[LoggerInterface::class];
                unset($factories[LoggerInterface::class]);
            }

            $composedFactories += $factories;
        }

        $composedFactories[LoggerInterface::class] = function(ContainerInterface $container) use ($loggerFactories) {
            return new CompositeLogger(array_map(function($loggerFactory) use ($container) { return $loggerFactory($container) }, $loggerFactories))
        };

        return $composedFactories;
    }
}

Ok, it hurts a little, but what this service provider does is actually to merge several service providers into one unique service provider. All services are merged classically, expect LoggerInterface::class entries that are actually composed in a CompositeLogger. The example is not complete since I'm not dealing with merging extended services correctly.

This would technically work, but there are a few drawbacks:

So what I'm looking for is the ability to simply write:

$serviceProviders[
    new ErrorLogLoggerServiceProvider(),
    new GraylogLoggerServiceProvider(),
    new CompositeLoggerServiceProvider()
);

And somehow, automagically, the CompositeLoggerServiceProvider could look up into the other declared service providers and modify their behaviour. This is somehow akin to the notion of compiler pass in Symfony: the ability for a service provider to access and modify the services declared by other service providers.

Simply, I don't really know what this new interface should look like.

Something like:

interface ServiceProcessor
{
    /**
     * Classes implementing this interface are given a chance to alter the result of the getServices method call.
     * ???
     */
    public function alterServices(ServiceProvider $serviceProvider) : array
}

I'm not completely sure this is the right thing to do. I guess I need some help to push the idea further. I also have a feeling this could be a great way to deal with "tagging" of services.

Any of you guys this problem needs to be solved? If yes, any idea how we could tackle it?

shochdoerfer commented 7 years ago

Looks like this is a problem when the $identifier is not unique. Maybe it is not a good idea to rely on interface names as $identifiers then. But for sure we need a mechanism to rename $identifiers if they clash or as you tried to re-compose dependencies out of the configuration.

What I learned to love with the Disco configuration is using traits to extract partial configurations (e.g. configurations for modules/plugins...). Since traits give you the power to change method names or specify which method to use, this is really powerful (and comes for free). Maybe that could be a solution for our problems ;)

XedinUnknown commented 6 years ago

Perhaps, this is not a problem that should be solved in this domain. I agree that interface/class names should not be used as service identifiers. In reality, there is nothing that guarantees (at least for now) that a certain service will implement a certain interface, so there is no point in doing that IMHO. I think that services are meant to override one another by specifying values for the same "keys", and this is part of the mechanics. The simplest way to do this is to approach the application from a modular point of view, and have a modular system in place which controls the order of module setup/loading (for example, I'm drafting a standard at dhii/modular-interface). This way, modules loaded later (perhaps by specifying that they depend on another module) can override definitions that were added earlier. The application may choose to use Puli to compose a list of possible definitions, and then resolve to a specific definition using some mechanism. Or, the application may not care about overriding at all. In either case, this is probably implementation detail.

pmall commented 6 years ago

Very interesting subject @moufmouf

To be honest I think the current ->getExtensions() method a bit fishy. It implies the provider of some implementation know how to behave when there is other (unknown) implementations provided. The very fact the developer of a package have to think about this and to anticipate it is not satisfying.

Here you suggest to reverse the responsibility of registering extensions. This is the provider actually providing an extensible implementation which is in charge to register the others. I think this is what should happen. What about removing completely the ->getExtensions() method if this goal is achieved ?

I'm currently thinking about some map - reduce approach. What if we consider the mapping part the merging of all factories returned by all the ->getFactories() methods and the reducing part to be optional callbacks provided by ->getReducers() methods. They would be used to reduce many factories methods for the same identifier down to one factory. When no reducer is provided for an identifier, just default to the last registered factory as always.

pmall commented 6 years ago

@moufmouf I extensively use service provider interface for my personal projects. I think they are awesome. Thinking about this issue, I guess many problems can be solved by simply using a ->getMap(): array method (and by the way removing the ->getExtensions() one so proliferation of methods is avoided).

My assumption is no service provider should provide factories associated to interface names. Especially if we want to autoload them. One project may use many loggers, many caches, etc, so no service provider should fill this namespace. Instead it should be able to tell what factory it provides for the implementation of an interface. Lets take your original issue as an example:

<?php

class FileLoggerServiceProvider implements ServiceProviderInterface
{
    public function getMap()
    {
        return [
            LoggerInterface::class => [
                FileLogger::class,
            ],
        ];
    }

    public function getFactories()
    {
        return [
            FileLogger::class => function ($container) {

                return new FileLogger(...);

            }
        ];
    }
}
<?php

class GraylogLoggerServiceProvider implements ServiceProviderInterface
{
    public function getMap()
    {
        return [
            LoggerInterface::class => [
                GraylogLogger::class,
            ],
        ];
    }

    public function getFactories()
    {
        return [
            GraylogLogger::class => function ($container) {

                return new GraylogLogger(...);

            }
        ];
    }
}
<?php

class CompositeLoggerServiceProvider implements ServiceProviderInterface
{
    public function getMap()
    {
        return [];
    }

    public function getFactories()
    {
        return [
            // Here is the trick: the container pass the whole merged map as
            // an optional second parameter to the factory.
            CompositeLogger::class => function ($container, array $map = []) {

                // You can do whatever you want with the awesome map.
                $loggers = array_map([$container, 'get'], $map[LoggerInterface::class] ?? []);

                // fail if the loggers array is empty. Or fail in the composite logger. It is
                // up to the package creator.

                return new CompositeLogger($loggers);

            }
        ];
    }
}

I see many upsides to this simple approach:

Also lets think about middleware service providers adding middleware implementations to the map. How cool it would be to autoload them all into one request dispatcher, using the map ;)

pmall commented 6 years ago

Hi @moufmouf I read your last blog post and this part immediately caught my attention:

The idea is that service providers should respect some kind of convention.

If you are writing a service provider for Monolog, the service creating the Monolog\Logger class should be named Monolog\Logger. This will allow containers using auto-wiring to automatically find the service.

Additionally, you can create an alias for your service on the Psr\Log\LoggerInterface, if you want to auto-wire the LoggerInterface to the Monolog\Logger service.

My ideas were'nt really clear in my last post in this thread, but I still think if this notion of alias is standardized it can solve the problem you are describing.

What about having a getAliasesmethod?

<?php

class ErrorLogLoggerServiceProvider implements ServiceProviderInterface
{
    public function getAliases()
    {
        return [
            LoggerInterface::class => ErrorLogLogger::class,
        ];
    }

    public function getFactories()
    {
        return [
            ErrorLogLogger::class => function () {...},
        ]
    }
}

Then a container could easily collect all those aliases, resulting in an aliases map: [LoggerInterface::class => [ErrorLogLogger::class, GelfLogger::class]]. It can also easily add a generic factory proxying the alias to the service provider factories, so the process is transparent.

The point is this map could be passed to all factories as optional second parameter.

This way, "composed" service providers could expose a list of factories for "composed" services which would eventually override the regular service factories when needed:

<?php

class CompositeLoggerServiceProvider implements ServiceProviderInterface
{
    // didn't come up with a nicer name yet.
    public function getComposedServices()
    {
        return [
            // could override ids or add new ids
            LoggerInterface::class => function ($container, array $aliases = []) {

                $loggers = array_map([$container, 'get'], $aliases[LoggerInterface::class] ?? []);

                return new CompositeLogger($loggers);

            },
        ];
    }
}

I'm a bit disappointed to define yet another method but it allows the container to register composed service factories after regular service factories, so the order service provider are registered does not matter.

It adds a bit of work to the container but I think it is tolerable. It should:

Then factories are called with the aliases map as optional second parameter when a service is retrieved.

Anyway design details are irrelevant at this point. The main idea is to let service providers define a map of aliases to its actual factories. It can be used to inspect what services are exposed by a service provider and it can be used by all factories in order to create composed services. The aliases can also be generic __toString() classes containing metadata about the provided service: shortname, priority, etc...

moufmouf commented 6 years ago

Hey Pierre,

This is... actually a very powerful idea!

Of course, like you, I'm not a huge fan of adding another method to the interface, but it does indeed help solve a real problem!

I think I need more time than I currently have to make my mind on this, but I do like the way you are headed. It's way clearer than the ideas I did present at the beginning of this issue.

:+1:

pmall commented 6 years ago

@moufmouf thanks for the feedback :+1:

Some more troughs about this:

Of course, like you, I'm not a huge fan of adding another method to the interface, but it does indeed help solve a real problem!

This example is really annoying me because this method serve only one purpose: when you need to override an alias (LoggerInterface::class in this case). I'm trying to find a way so no special method is required. In other cases of a service using multiple implementations of an alias it is quite cool:

class DispatcherServiceProvider implements ServiceProviderInterface
{
    // Cool: no need to use a special method here to take advantage of the aliases
    public function getFactories()
    {
        return [
            Some\Dispatcher::class => [$this, 'getDispatcher'],
        ];
    }

    public function getDispatcher($container, array $aliases)
    {
        $middleware = array_map([$container, 'get'], $aliases[MiddlewareInterface::class] ?? []);

        // order the middleware somehow.

        return new Dispatcher($middleware);
    }
}

Also I think this approach should not allow the factory ids to be interfaces, forcing the service provider creator to register interface names as aliases. The container could make each alias point to the last factory associated with it. It is a bit silly for MiddlewareInterface, but it would work the same way as it is working now for LoggerInterface. Have to think about this.

Also, lets say I have a session service depending on a CacheItemPoolInterface for storing session data. My app could have it aliased to both a file implementation and a redis implementation. How could I choose which one I feed my session service with? Obviously there is no way for the session service to know in advance which one the developer want to use. A well designed session service could try to use a cache implementation tagged with 'session' and default to the first one available when there's none. The appropriate tag name to use would be specified in the session service documentation. This way it would work out of the box with any cache pool implementation then it could be fine tuned. For this to work there should be way for the developer to tag a specific implementation of an alias. No clear idea about this right now. Maybe adding a ->taggedWith() method to ServiceProviderInterface which would somehow add tags to all the aliases it provides before being consumed by the container?

<?php

$providers = [
    new FileCacheServiceProvider,
    (new RedisCacheServiceProvider)->taggedWith(['my.session']), // could use multiple tags
    new MySessionServiceProvider, // use redis cache as it is tagged appropriately by the developer
];

It can also be a decorator:

<?php

$providers = [
    new FileCacheServiceProvider,
    new TaggedServiceProvider(new RedisCacheServiceProvider, ['my.session']),
    new MySessionServiceProvider,
];

Also the aliases could be an object instead of an array. Containers supporting universal service providers could have an object implementing this interface. It could allow to use nice methods:

<?php

public function getDispatcher($container, ServiceMapInterface $map)
{
    // return all the middleware aliases.
    $middlewareAliases = $map->all(MiddlewareInterface::class);`

    $middleware = array_map([$container, 'get'], $middlewareAliases);

    // etc...

}
public function getSession($container, ServiceMapInterface $map)
{
    // return the cache implementation tagged with 'session', or default to the first one.
    $cacheAlias = $map->first(CacheItemPoolInterface::class, 'session');`

    $cache = $container->get($cacheAlias);

   // etc...

}
moufmouf commented 6 years ago

I'm not really confortable with the taggedWith approach yet (it's a bit difficult to understand I think).

But thinking along the lines of your first proposal.

I'm trying to find a way so no special method is required.

Thinking about it, what is really powerful in your idea is the second argument to the factory containing the aliases.

What if, instead of defining aliases, the container was keeping an array of the factories that have been overloaded?

class CompositeLoggerServiceProvider implements ServiceProviderInterface
{
    public function getFactories()
    {
        return [
            // We assume that all logger service providers will create an entry on LoggerInterface::class
            LoggerInterface::class => [$this, 'getCompositeLogger'],
        ];
    }

    public function getCompositeLogger($container, array $overloadedLoggers)
    {
        $allLoggers = [];
        foreach ($overloadedLoggers as $overloadedLogger) {
            $allLoggers[] = $overloadedLogger($container);
        }
        return new CompositeLogger($allLoggers);
    }
}

This would work great, with one caveat: the CompositeLoggerServiceProvider must be registered after all other service providers providing loggers (so the order of the service providers registration gains some importance)

What do you think?

pmall commented 6 years ago

Hi,

If we assume the problem to solve is how to overload entries, yes this solution is very elegant. It could even be simpler, the container could give to the factory a callable returning an array of overloaded instances:

class CompositeLoggerServiceProvider implements ServiceProviderInterface
{
    public function getFactories()
    {
        return [
            LoggerInterface::class => [$this, 'getCompositeLogger'],
        ];
    }

    public function getCompositeLogger($container, callable $getOverloadedLoggers)
    {
        $loggers = $getOverloadedLoggers($container);

        return new CompositeLogger($loggers);
    }
}

Looks like the former $getPrevious of the extensions :)

But imo I think this is not a very common situation. Maybe I'm wrong but I can't figure out many actual cases of an objects composing many other objects with the same interface. By using ServiceProviderInterface I encountered many use cases I'd like to have a solution for:

What I think is pretty with aliases is the aliased instances are cached by the container. When passing other factories to a factory, a new instance is created every time they are used. I through aliases could be used to solve bigger problems but you're also right, the notion of aliases is not very clear and it is hard to implement. Maybe a notion of namespace? Or a compiler pass?

Maybe I'm just taking it the wrong way or maybe it is just out of the scope of the PSR. I would be interested to know what is the exact scope you expect for this PSR then we could find solutions more efficiently.

moufmouf commented 6 years ago

Yes, it's kind of obvious we need to decide first the scope of the PSR. I'll work on that and do a proposal. This way, we can compare each idea we have with the agreed upon scope.

pmall commented 6 years ago

Hi @moufmouf,

I keep trying to come up with something allowing to define how service providers could interact. What do you think about this:

<?php

class FileCacheServiceProvider implements ServiceProviderInterface
{
    public function configure(CompilerInterface $compiler)
    {
        // FileCache can be aliased by many entries, including CacheInterface.
        return $compiler->alias(FileCache::class, [CacheInterface::class]);
    }

    public function getFactories()
    {
        return [
            FileCache::class => [self::class, 'getCache'],
        ];
    }

    // ...
}
<?php

class RedisCacheServiceProvider implements ServiceProviderInterface
{
    public function configure(CompilerInterface $compiler)
    {
        // RedisCache can be aliased by many entries, including CacheInterface.
        return $compiler->alias(RedisCache::class, [CacheInterface::class]);
    }

    public function getFactories()
    {
        return [
            RedisCache::class => [self::class, 'getCache'],
        ];
    }

    // ...
}
<?php

class WeatherApiServiceProvider implements ServiceProviderInterface
{
    public function configure(CompilerInterface $compiler)
    {
        // Here we define this service provider requires an implementation of
        // CacheInterface registered in the 'weatherapi.cache' entry of the container.
        // Also this would be a good place to register validations.
        return $compiler->requires(CacheInterface::class, 'weatherapi.cache');
    }

    public function getFactories()
    {
        return [
            WeatherApi::class => [self::class, 'getWeatherApi'],
        ];
    }

    public static function getWeatherApi($container): WeatherApi
    {
        // Here we make use of the local alias for the cache implementation.
        $cache = $container->get('weatherapi.cache');

        return new WeatherApi($cache);
    }

    // ...
}

Then in the project:

<?php

// Lets have an higer order object containing all the app service providers.
// The keys could be inferred from the package name and vendor.
$providers = new ServiceProviderCollection([
    'cache.file' => new FileCacheServiceProvider,
    'cache.redis' => new RedisCacheServiceProvider,
    'moufmouf.weatherapi' => new WeatherApiServiceProvider,
]);

// Lets retrieve the factories.
$factories = $providers->getFactories();

// We expect the same result as with the current specification:
$factories === [
    // As redis cache is registered the last, it is aliased by CacheInterface:
    CacheInterface::class => new Alias(RedisCache::class),
    FileCache::class => [FileCacheServiceProvider::class, 'getCache'],
    RedisCache::class => [RedisCacheServiceProvider::class, 'getCache'],
    // By default the requirement is an alias to the interface.
    // It ensures it aliases the default implementation by default.
    'weatherapi.cache' => new Alias(CacheInterface::class),
    WeatherApi::class => [WeatherApiServiceProvider::class, 'getWeatherApi'],
];

// But now we can do this:
// This basically means add another "pass" resolving the requirements only on
// this subset of service providers, before retruning the factories.
$providers->bind(['cache.file', 'moufmouf.weatherapi']);

$factories = $providers->getFactories();

// Here we expect:
$factories === [
    // No change about this:
    CacheInterface::class => new Alias(RedisCache::class),
    FileCache::class => [FileCacheServiceProvider::class, 'getCache'],
    RedisCache::class => [RedisCacheServiceProvider::class, 'getCache'],
    // Here the second pass override the 'weatherapi.cache' entry:
    'weatherapi.cache' => new Alias(FileCache::class),
    WeatherApi::class => [WeatherApiServiceProvider::class, 'getWeatherApi'],
];

// We can also deal with tags:
// This basically means add another "pass" adding a new entry for all the same
// name aliases provided by this subset of service providers.
$providers->tag('tagged.caches', ['cache.file', 'cache.reddis']);

$factories = $providers->getFactories();

// Here we expect:
$factories === [
    // No change about this:
    CacheInterface::class => new Alias(RedisCache::class),
    FileCache::class => [FileCacheServiceProvider::class, 'getCache'],
    RedisCache::class => [RedisCacheServiceProvider::class, 'getCache'],
    'weatherapi.cache' => new Alias(FileCache::class),
    WeatherApi::class => [WeatherApiServiceProvider::class, 'getWeatherApi'],
    // A new entry returning an array of cache implementations:
    'CacheInterface[tagged.caches]' => new ArrayAlias([FileCache::class, RedisCache::class]),
];

The upsides of this approach:

pmall commented 6 years ago

I made a working example of all this.

To sum this up:

Let service providers expose aliases to the entries they provides and requirements they need. This can be done either by using two methods getAliases and getRequirements or one configure() method updating the state of a config object:

I used the two methods approach in the example.

Once we have collected aliases and requirements of a group of service providers (a group can be all the application service providers or a subset of all service providers), it is easy to apply some kind of "compiler pass" returning factories from those aliases and requirements. Multiple compiler passes can be applied by the project owner to orchestrate how its service providers interact together. Each compiler pass result is merged with the previous one, adding or overriding factories. Whats cool with this is any developer can create a custom compiler pass for his project if he needs a specific way to fulfill its service providers requirements!

mindplay-dk commented 9 months ago

To the logger example, personally, if I was adding, say, a Graylog provider to my application, I wouldn't expect this provider to occupy the LoggerInterface ID - I would in fact consider that bad practice. I would expect any logger provider to bootstrap it's own facilities, in it's own namespace, and that I would still need to alias or bootstrap the use of that logger myself.

Either way, aliases seem like a workable solution to the original problem statement - the current proposal supports aliases.

I don't think further discussion is required here - if you feel differently, please feel free to open a thread in Discussions.