Open SimonHeybrock opened 9 months ago
Why does this need support from sciline? Can't you just put a decorator on the function itself?
This is about decorating all providers, not just individual ones. I suppose one can achieve something similar, decorating all functions (but not param access), but it needs modification in more places? Also, it requires that the decorators all use functools.wraps
, otherwise the type hints are not in place for Sciline to do its job.
providers = mylib.providers
providers = tuple(mydecorator(func) for func in providers) # breaks unless decorates uses functools.wraps
pipeline = sciline.Pipeline(providers)
pipeline.insert(mydecorator(func1))
pipeline.insert(mydecorator(func2))
as opposed to
providers = mylib.providers
pipeline = sciline.Pipeline(providers)
pipeline.insert(func1)
pipeline.insert(func2)
pipeline.decorate(mydecorator) # works with any decorator
which changes only a single line and also decorates calls to non-function providers.
Also, it requires that the decorators all use functools.wraps, otherwise the type hints are not in place for Sciline to do its job.
Sure, that's just good decorator hygiene and not really our concern.
Applying a decorator globally is a bit hard to manage for anything nontrivial. But I guess it could be useful. Does it make more sense to think of it as hooks instead of decorators? You don't so much produce new functions or classes but want to run some code when a certain event occurs (provider called). This would remove the option to modify the providers. But that seems like a good thing. Having someone rewrite a pipeline just by adding a decorator to anything makes the code much harder to reason about.
Yes, I suppose hooks might be a better description.
This would remove the option to modify the providers. But that seems like a good thing. Having someone rewrite a pipeline just by adding a decorator to anything makes the code much harder to reason about.
Not sure I understand. How and why would you rewrite a pipeline with a decorator?
Decorators can change the function you apply them to. So they can replace it with a different one. When you apply them to all providers, they can potentially change the pipeline into something completely different. It would look the same as before as far as the graph is concerned but do a different computation. Or course, that is just bad practice. But a smaller scale version of that could happen where the behaviour of one or more providers is changed. Again, bad practice. But we would invite that kind of usage by using decorators.
Concerning hooks, there are typically three types: before, after, around. Some use case can be done with before or after hooks. But timing and potentially debugging might require around hooks. We have to be careful not to turn those into decorators. I.e.
def around_hook(provider):
def impl(*args, **kwargs):
# do something
res = provider(*args, **kwargs)
# to something else
return res
return impl
would be like a decorator. But
def around_hook(provider, *args, **kwargs):
# do something
res = yield # runs the provider
# do something else
would prevent the user from changing the provider. It would also prevent them from changing the arguments and return value which may or may not be what we want.
I am not sure I share your concerns. If people want to break something they always can. If you can set a decorator you can also change the pipeline in other ways. If you use bad practices, things will break. There is nothing we can do about that, and pinning down an interface against that is hard and often counter productive. What is needed is preventing accidental misuse.
Not sure about your around_hook
. It cannot access the inputs or outputs of the provider, right? But that may be essential for logging, how would you intend to implement that?
Not sure about your around_hook. It cannot access the inputs or outputs of the provider, right? But that may be essential for logging, how would you intend to implement that?
It can access them (args
, kwargs
, res
). But it cannot modify them.
I agree with your general argument. But the normal use case for decorators is modifying functions. This can be a simple addition that doesn't change the business logic like logging. But it can do much more. Essentially, it transforms one function into another. And at least for me, that is my mental model of decorators. So I would see applying decorators to a pipeline as transforming that pipeline. So my concern is not with making it possible to modify the pipeline, but that we tell users 'Here is a tool to modify providers, knock yourselves out!'. And maybe in the docs 'By the way, don't misuse this!'
This is not necessarily bad. But how do we ensure that visualize
and serialize
are still meaningful? When the decorator uses functools.wraps
, the transformed pipeline looks exactly like the original.
I suppose we could
Not sure about your around_hook. It cannot access the inputs or outputs of the provider, right? But that may be essential for logging, how would you intend to implement that?
It can access them (
args
,kwargs
,res
). But it cannot modify them.
Do you mean it cannot modify the list or the dict, but only the values? And I suppose res
can always be modified, unless it is an immutable type?
Would you provide a Hook
base class, where to user only implement some methods such as before()
and after()
?
Do you mean it cannot modify the list or the dict, but only the values? And I suppose res can always be modified, unless it is an immutable type?
True, it's Python, so you can modify the contents of most objects.
Would you provide a Hook base class, where to user only implement some methods such as before() and after()?
No, I'd just use functions with signatures
def before_hook(provider: Provider, <args-and-kwargs-of-provider>) -> None:
def after_hook(provider: Provider, res: <Return-type-of-provider>) -> None:
def around_hook(provider: Provider, <args-and-kwargs-of-provider>) -> Generator[None, <Return-type-of-provider>, None]:
and register those with dedicated methods: Pipeline.add_{before,after,around}_hook
Before and after are straight forward functions, but around need to be a generator like this:
def around_provider(provider, args, kwargs) -> Generator[None, R, None]:
# before provider is called
res: R = yield
# after provider is called
# When calling the provider:
gen = around_provider(provider, args, kwargs)
next(gen)
res = provider.call()
try:
gen.send(res)
except StopIteration:
pass
Of course, I'm not claiming that this is the best way to do it. Can all your use cases be implemented with this?
This looks quite complicated!
around_hooks
, rather than just writing a simple decorator, which they may have done many times.Why should hooks know about the providers, a user might just want to use some existing code (e.g., for timing)?
They need to know which provider they are hooking into. Otherwise, all timing providers could only report, 'a provider ran in time x'.
Users have to learn a special syntax for the around_hooks, rather than just writing a simple decorator, which they may have done many times.
Yes, the syntax is uncommon. But apart from getting a result from yield
, it is similar to pytest fixtures. And getting a decorator right is also nontrivial. Plus, the majority of the complication is in sciline itself.
Why should hooks know about the providers, a user might just want to use some existing code (e.g., for timing)?
They need to know which provider they are hooking into. Otherwise, all timing providers could only report, 'a provider ran in time x'.
You can print the name of the function, that's that I did. But yes, having the provider may be useful.
Provider functions and parameters in a pipeline have recently been wrapped in the
Provider
class. This makes it very easy to add decorators that wrap all calls (you can try with just a few lines of changes). This is useful for a variety of purposes:I don't feel any of the above should necessarily be supported by Sciline itself. Instead, I suggest to add a method to
sciline.Pipeline
that allow adding decorators. These would be used to decorate https://github.com/scipp/sciline/blob/95ef23e93b561d476de87b129f9d0092ad83adfc/src/sciline/_provider.py#L104-L107, allowing the user to add one more of the above.I suggest making this a method of
Pipeline
, something like