kodemore / kink

Dependency injection container made for Python
MIT License

Can't use non-lambda on-demand service creation #22

Closed. alukach closed this issue 2 years ago

alukach commented 2 years ago

Consider the following code:

from kink import di, inject, Container

def on_demand_1(di: Container):
    return "foo"

on_demand_2 = lambda di: "bar"

di['on_demand_1'] = on_demand_1
di['on_demand_2'] = on_demand_2

@inject
def test(on_demand_1, on_demand_2):
    print(f"{on_demand_1=}")
    print(f"{on_demand_2=}")

test()

I see the following output:

on_demand_1=<function on_demand_1 at 0x102a629d0>
on_demand_2='bar'

Is there any clear reason why non-lambda functions don't work for on-demand service creation? Would it be difficult to support such a pattern?

dkraczkowski commented 2 years ago

@alukach Hello, thanks for your interest in the project.

Is there any clear reason why non-lambda functions don't work for on-demand service creation? Would it be difficult to support such a pattern?

The main reason is that in some scenarios the function itself might be a dependency. I believe you can now guess why it works the way you described, and why, from the library's perspective, this is considered valid behavior.
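For example (a minimal sketch; the validator here is a made-up service):

from kink import di

def is_valid_email(value: str) -> bool:
    return "@" in value

# The function itself is the service: we want the callable back,
# not the result of calling it with the container.
di['email_validator'] = is_valid_email

If the container invoked every registered function on lookup, a plain callable like this could never be registered as a service in its own right.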

A simple workaround might be to wrap your factories in a lambda. I know it is not perfect, so if you have a better idea I am more than happy to hear it.
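Applied to your snippet, the wrapping would look like this:

# The registered value is now a lambda, so kink calls it at resolution
# time and injects the returned value rather than the bare function.
di['on_demand_1'] = lambda di: on_demand_1(di)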

alukach commented 2 years ago

Thanks @dkraczkowski!

in some scenarios function itself might be a dependency

This is indeed a valid viewpoint that I hadn't considered.

A simple workaround might be wrapping your factories in a lambda function. I know it is not perfect, so if you have a better idea I am more than happy to listen to it.

Yeah, that works but feels a bit clunky... I suppose a workaround could be a small utility function that does the wrapping for me, mimicking how FastAPI's dependencies look, though that might be a bit unnecessary:

from kink import di, inject, Container

def Depends(func):
    # Wrap the factory in a lambda so kink invokes it on demand.
    return lambda di: func(di)

def on_demand_1(di: Container):
    return "foo"

on_demand_2 = lambda di: "bar"

di['on_demand_1'] = Depends(on_demand_1)
di['on_demand_2'] = on_demand_2

@inject
def test(on_demand_1, on_demand_2):
    print(f"{on_demand_1=}")
    print(f"{on_demand_2=}")

test()

which does indeed return the expected output:

on_demand_1='foo'
on_demand_2='bar'

I think what I would truly like to achieve is something more along the lines of FastAPI's dependency format, wherein a user can conveniently describe dependencies that themselves have dependencies:

from dataclasses import dataclass
import os
from typing import Callable
from kink import inject

@dataclass
class Depends:
    dependency: Callable

    def __post_init__(self):
        # At this stage we should register the dependency with the DI Container
        # so that if something describes a dependency on "Depends(self.dependency)",
        # this instance's dependency will be returned.
        ...

@inject()
class SecretProvider:
    ...

def get_secret_arn():
    return os.environ.get("SECRET_ARN")

# def get_secret(provider: SecretProvider, arn = Depends(get_secret_arn)):  # <- More like FastAPI
def get_secret(provider: SecretProvider, arn: Depends(get_secret_arn)):
    return provider.get(arn)

@inject()
# def test(secret = Depends(get_secret)):  # More like FastAPI
def test(secret: Depends(get_secret)):
    print(f"{secret=}")

test()

I think something like the above could be achievable by registering each instance of Depends in the container at the time of declaration. Going even further, I would love to be able to see the trail of dependencies that are resolved for any given function (i.e., which dependencies in the container were actually utilized). However, that topic may be scope creep for this ticket (I can open a new ticket to host that discussion and share more about my intentions for using this module).
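A rough sketch of the registration idea, assuming kink's Container accepts arbitrary hashable keys and that @inject resolves a parameter's annotation by looking it up in the container (I have not verified either assumption against kink's internals):

from dataclasses import dataclass
from typing import Callable
from kink import di

@dataclass(frozen=True)  # frozen instances are hashable, so they can act as container keys
class Depends:
    dependency: Callable

    def __post_init__(self):
        # Register a factory under this very instance, so a parameter
        # annotated with Depends(fn) would resolve to fn's return value.
        di[self] = lambda container: self.dependency()

Two Depends instances wrapping the same dependency compare equal and hash to the same key, so Depends(get_secret_arn) written at the annotation site would find the factory registered here.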

Does this fit with the zen of this library? Is this something the library would be interested in having, or should I keep logic like the Depends class in my own project? I'm still onboarding to both this module and DI, so it's entirely possible I'm thinking about this in the wrong way.

dkraczkowski commented 2 years ago

@alukach Thanks for sharing your ideas; I find them interesting, but I would like to keep the interface as simple as possible and flexible enough that everyone can use it. To be honest, I have been thinking about something like a ConstrainedContainer, where you declare a dataclass object that contains all your dependencies and can then initialise it (which is similar to your Depends), but I need to find time to polish the idea.
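Roughly something like this (purely a hypothetical sketch; the names and API are not settled):

from dataclasses import dataclass

class SecretProvider: ...   # placeholder service types
class Storage: ...

@dataclass
class MyDependencies:
    # every dependency the application needs, declared in one place
    secrets: SecretProvider
    storage: Storage

# The ConstrainedContainer would then initialise and validate all of
# these fields together, instead of resolving keys one by one:
# deps = ConstrainedContainer(MyDependencies)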

I would like to hear the reasoning behind the module that shows the resolution path.

alukach commented 2 years ago

Thanks @dkraczkowski. Sorry for the late reply.

I would like to hear the reasoning behind the module that shows the resolution path.

To share my use-case: at my work, we do a lot of data-ingestion pipelines. Typically, these are AWS Lambda functions strung together with AWS Step Functions. However, running these cloud-native pipelines locally (for dev/testing) is a bit of a pain, so a colleague (@edkeeble) and I have been playing around with the idea of putting together a framework wherein a user can declare their ingestion pipeline's steps and then either run the pipeline locally or deploy it to the cloud.

However, pipeline steps may have external dependencies that vary based on the runtime environment. For example, maybe a pipeline step needs to load some secrets from AWS SSM when running in the cloud but should load those same secrets from a .env file when running locally. To achieve this, I have been experimenting with the idea of a framework where a user describes their pipeline step's dependencies and then at runtime, the framework would register & inject the appropriate dependencies based on the runtime environment:

https://github.com/alukach/pipeline-di-playground/blob/8f1c3169628ca7317e9c7afb8393fede9f9b2afe/example.py#L17-L34

This works by having different modules for each environment. Within each module, providers for dependencies are registered by aliasing service interfaces. For example:
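A sketch of the pattern (the module layout and class names are hypothetical):

# local.py: registrations used when running locally
from kink import di

class ISecrets: ...                 # the interface pipeline steps depend on

class DotenvSecrets(ISecrets):      # local implementation backed by a .env file
    ...

di[ISecrets] = lambda di: DotenvSecrets()

An AWS-specific module would register, say, an SSM-backed implementation under the same ISecrets key, so steps never need to know which environment they are running in.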

This allows us to run a pipeline locally with the appropriate dependencies (https://github.com/alukach/pipeline-di-playground/blob/8f1c3169628ca7317e9c7afb8393fede9f9b2afe/playground/primitives/pipeline.py#L18-L27), or in AWS Lambdas (https://github.com/alukach/pipeline-di-playground/blob/8f1c3169628ca7317e9c7afb8393fede9f9b2afe/playground/primitives/step.py#L37-L44), where we string together the Lambda executions with AWS Step Functions.

So, to your question about why we want to show resolution paths: in a cloud environment we will need to grant each step the appropriate permissions to connect to backing resources. The simplest way to achieve this (in my mind) would be to infer which permissions each step needs by looking at that step's dependencies. This is straightforward for steps that depend on a dependency directly; however, for dependencies that themselves depend on another dependency (e.g. https://github.com/alukach/pipeline-di-playground/blob/8f1c3169628ca7317e9c7afb8393fede9f9b2afe/example.py#L21-L22), we can't infer what a step needs without being able to follow the chain of dependencies and ensure all requirements are accounted for.
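The kind of trail I mean could perhaps be captured with something like this (a rough sketch, assuming Container.__getitem__ is the single entry point for resolution; I have not checked kink's internals):

from kink import Container

class TracingContainer(Container):
    def __init__(self):
        super().__init__()
        self.resolved = []

    def __getitem__(self, key):
        # Record every key the injector looks up, so the full
        # dependency trail for a step can be inspected afterwards.
        self.resolved.append(key)
        return super().__getitem__(key)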

Sorry if that's all a bit long-winded. Does that make sense?

dkraczkowski commented 2 years ago

@alukach Hey, sorry for the late response. Would you mind dropping me an email (it should be in the pyproject.toml)? I would like to discuss the ideas in your comment with you (if this is still relevant). Maybe we can tinker on a solution together?