mentalisttraceur / python-compose-operator

(Re-)Design Ramblings #1

Closed mentalisttraceur closed 2 years ago

mentalisttraceur commented 2 years ago

Creating this issue to log thoughts/changes to design since my last comments on https://github.com/mentalisttraceur/python-compose/pull/1 .

mentalisttraceur commented 2 years ago

Probably not going to repeat ideas that are already recorded in the git commit messages.

But the one really big one so far is that:

Now composable | always takes precedence over type union |, so composable.force is removed. But if you want to decorate a class definition with composable, you should use composable_constructor instead, which always lets type union | take precedence over its own |. And then you just use composable on a class definition in a ... | ... expression to force composition.
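
For example, here's a rough sketch of how that rule plays out (assuming this package's names are importable from a compose_operator module - the module path is an assumption):

from compose_operator import composable, composable_constructor

@composable_constructor
class Seconds:
    def __init__(self, value):
        self.value = value

SecondsOrInt = Seconds | int  # type union | wins on the decorated class (Python >= 3.10)

def double(seconds):
    return Seconds(seconds.value * 2)

doubled = composable(Seconds) | double  # wrapping in composable forces composition
assert doubled(21).value == 42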

The whole type union ambiguity was always a balancing act of many different factors. I'd rather give the best tools for developers at their best - and a developer at their best will proactively think critically and question before copying from examples, and thus can be trusted to use the right one of composable and composable_constructor for any decorator use case. (Also, I didn't like overlaying .force on the wrapped object, though that alone could've been solved by moving the force method back out into its own top-level function like force_composable.)

mentalisttraceur commented 2 years ago

So I already mentioned in the commit messages that I was dropping a lot of the automatic unwrapping behavior, and several reasons for it.

But I wanted to jot down an elaboration on one of those reasons:

I had a prolonged struggle with wrapper composition and unwrapping - how those two work or don't work together.

We saw the first taste of this when trying to figure out what to do about compose versions 1.1.0 to 1.4.7 using function.__wrapped__ to flatten. Basically, what we really wanted in that case was to distinguish between implementation detail "this is what I am wrapping" and semantic "this is what I am wrapping". But that was muddled at the time because in that case, compose instances are only wrappers in some narrow sense, not in the much more universal sense that wrapt object proxies are.

But as I've thought about it more, and in particular as I was in the thick of the futile battle where I was trying to get the unwrapping features in this library working, I realized that this is a much bigger problem:

It matters any time we are composing multiple wrappers, and in particular when one wrapper is implemented as two or more wrappers. In those cases, the wrapper itself and low-level implementation stuff such as all the magic method forwards want a ._reference_to_the_very_next_layer_in_the_wrapper_onion, but external consumers of the wrapper want ._reference_to_the_abstractly_next_layer_in_the_wrapper_onion. __wrapped__ tries to be both, but they're inherently mutually incompatible.

And this can't be fixed "from the outside", because every usage of .__wrapped__ is currently inherently ambiguous - the only way to fix it is to consciously decide at each .__wrapped__ access if you want to dig into or step over the possible implementation detail of the wrapper being implemented as a composition of wrappers.

Maybe at some point I'll realize some way in which wrappers written to just use .__wrapped__ could be composed together into something that, to the external user, looks like a single wrapper, where .__wrapped__ goes all the way through it. But for now, it seems like there are a few options, each of which has downsides:

  1. use an attribute like ._internal_wrapped for the implementation detail next wrapper object - downside: need to rewrite every single passthrough method of the wrapper class to use that attribute instead of .__wrapped__.
  2. use an attribute like ._wrapped or .wrapped for the semantic higher-level wrapped object - downside: it's a leaky abstraction, since all existing code goes through .__wrapped__.
  3. use something like a weakref.WeakKeyDictionary, saving the outer wrapper class of the composition as a key, and the object wrapped by the whole composition as a value - downside: overhead and complexity of weak references and external linked state.
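
For example, here's the collision in miniature (all wrapper names here are hypothetical):

class _Timeout:
    def __init__(self, function):
        self.__wrapped__ = function
    def __call__(self, *args, **kwargs):
        return self.__wrapped__(*args, **kwargs)

class _Retry:
    def __init__(self, function):
        self.__wrapped__ = function
    def __call__(self, *args, **kwargs):
        return self.__wrapped__(*args, **kwargs)

def resilient(function):
    # One conceptual wrapper, implemented as two layers:
    return _Retry(_Timeout(function))

def fetch(): ...

wrapper = resilient(fetch)
# The implementation-detail answer is the very next onion layer:
assert isinstance(wrapper.__wrapped__, _Timeout)
# But an external consumer asking "what does resilient(fetch) wrap?"
# wants fetch - .__wrapped__ can't mean both things at once.
assert wrapper.__wrapped__.__wrapped__ is fetch
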
mentalisttraceur commented 2 years ago

Painful: the interplay of composable, composable_constructor, and composable_instances.

I may need to just go back to the earlier design where there was one composable which did the whole defer-when-both-operands-are-types thing, and maybe a separate force_composable function to make it easy for users to work around that when necessary.

Because the interplay between three different wrapper classes, which can be arbitrarily nested around one object, is really unwieldy.

mentalisttraceur commented 2 years ago

Side note: I just realized that if g @ f is overloaded to give you composable(compose(g, f)), then it has a neat parallel with decorator syntax, where all three of these are equivalent:

Chained decorators:

@g
@f
def whatever(...): ...

Composition:

def whatever(...): ...

whatever = (g@f)(whatever)

Composition as decorator:

@g@f
def whatever(...): ...

(Prior to PEP-614, decorators couldn't be arbitrary expressions, so that last one only works on Python >=3.9.)

Basically, as a happy accident, @ for composition and chained @ for decoration end up doing essentially the same thing. Of course @ for decoration is limited syntactically to just before def, async def, and class statements, and does something very different than composition.

This is either a very good thing in favor of @ for composition or a very bad thing against it, depending on how you look at it. I am hesitant to call it a bad thing because I just discovered it, there might be benefits I'm not realizing, and it's a neat way to get some stylistic freedom in decorator chaining (even though I personally am inclined against using it, and would probably push back on it in a code review, because there are visual and mental benefits to human parsing when decorators are one-per-line). But I think it's probably bad... the benefit is seemingly superficial, and the downside is a lot of potential for conflation and confusion.

I can already see new devs getting confused by seeing @foo @bar in code and thinking that's just generally something you can do, not realizing that it only works because someone slapped @composable on the definition of foo or bar.

mentalisttraceur commented 2 years ago

I guess the other perspective on that last point is that @ for composition would so rarely in the real world overlap with @ for decoration that sources and opportunities for confusion wouldn't come up.

mentalisttraceur commented 2 years ago

Still, I think @ is probably ruled out as a candidate for right-then-left composition for me.

Which was a back-of-mind question, both in general (if we had an operator overload in Python for right-then-left composition, what should it be?) and specifically for this library (should this library provide that as well?)

I had already settled on | as the best choice for left-then-right composition, because it has a syntactical similarity to shell pipelines, which in turn have a fundamental similarity with function composition (one way to see this is to consider an_iterable | partial(map, f) - another is to see everything that comes out on a program's standard output as its return value, and the fact that it's streamed as just an implementation detail).
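
Concretely, the left-then-right reading looks like this (assuming this package's composable is importable from a compose_operator module - the module path is an assumption):

from compose_operator import composable

@composable
def double(x):
    return x * 2

@composable
def increment(x):
    return x + 1

pipeline = double | increment  # left-then-right: increment(double(x))
assert pipeline(3) == 7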

But for right-then-left composition I saw three reasonable Python operator candidates: +, *, and @. And now I feel like I can comfortably rule out @, because there's a good reason to: confusion with decorator syntax is a very minor problem (after all, | has confusion potential with type unions), but it's a problem the other two alternatives don't have.

mentalisttraceur commented 2 years ago

Here's a tangent vaguely relevant to operator overloads for functional programming:

from types import MappingProxyType as _MappingProxyType

class Arguments:
    """An immutable bundle of positional and keyword arguments."""
    __slots__ = ('_args', '_kwargs')

    def __init__(self, /, *args, **kwargs):
        self._args = args
        # Read-only view, so the bundle can't be mutated through it:
        self._kwargs = _MappingProxyType(kwargs)

    @property
    def args(self):
        return self._args

    @property
    def kwargs(self):
        return self._kwargs

Often when looking at stuff like decorators, various functional programming shenanigans, serializing calls across RPC boundaries, operator overloads for functional programming, and so on, I have wanted an object that can represent all arguments to a function.

The obvious usage is something like this:

# in one part of the code
arguments = Arguments(a, b, c, foo=d, bar=e, qux=f)

# in another part of the code
my_function(*arguments.args, **arguments.kwargs)

(Incidentally, functools.partial is already almost this kind of Arguments object, but it forces you to specify a function at the time that you know the arguments, and functools.partial(functools.partial, dummy_function) is even closer, with the nuisance of inconsistent attribute naming - args and keywords instead of either arguments and keywords or args and kwargs.)
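
Concretely (dummy_function is just illustrative):

from functools import partial

def dummy_function(*args, **kwargs):
    pass

almost_arguments = partial(partial, dummy_function)(1, 2, foo=3)
print(almost_arguments.args)      # (1, 2)
print(almost_arguments.keywords)  # {'foo': 3}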

I think this also reveals that Python could benefit from a triple-splat operator: ***arguments to expand an object which represents both positional and keyword arguments. A natural idea would be that arguments would need to be an iterable of two things: the positional arguments and the keyword arguments. That way any (args, kwargs) tuple could be splatted with a ***, and in fact the above Arguments class could maybe even just be a subclass of namedtuple.
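
A *** operator doesn't exist, of course, but the namedtuple version of that idea can at least be sketched today (a sketch only - not part of this library):

from collections import namedtuple
from types import MappingProxyType as _MappingProxyType

class Arguments(namedtuple('_ArgumentsBase', ('args', 'kwargs'))):
    __slots__ = ()
    def __new__(cls, /, *args, **kwargs):
        return super().__new__(cls, args, _MappingProxyType(kwargs))

def my_function(a, b, *, foo):
    return a + b + foo

arguments = Arguments(1, 2, foo=3)
args, kwargs = arguments  # any (args, kwargs) pair unpacks the same way
assert my_function(*args, **kwargs) == 6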

Anyway, having said all that, I'd like to bring this back to operator overloads for functional programming, such as function composition:

An operator overload for applying arguments would be cool. And besides the decorator conflation risk, one other reason against using @ for composition is that it kinda looks right to me as an operator overload for partial application. And it's very intuitive and tempting to do something like this:

my_function@(foo, bar)  # partial(my_function, foo, bar)
my_function@{"foo": 0, "bar": 1)  # partial(my_function, foo=0, bar=1)

And of course it's immediately obvious that some nice way to specify keyword arguments, or both kinds of arguments at once, would be nice. Naturally we can alias kw = dict or even def kw(**kwargs): return Arguments(**kwargs) and then:

my_function@kw(foo=1, bar=2)

But one big problem in Python with any overloading where tuple syntax is tempting is that (foo) is not a tuple, it's just foo. Normal people don't remember that they must do (foo,). This was the thing that made %-formatting for Python strings bad, or at least the only thing I can remember.

So I think if we learn from the past, then we obviously don't want any operator overload that relies on tuples unless it absolutely requires only tuples (because otherwise you will get single objects that were intended as one thing but look like tuples or dictionaries or something else that you interpret as multiple things).

And that leads us back to an Arguments object. Because if you do something like a = Arguments, then you can do:

my_function@a(foo, bar, qux=0)
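
A sketch of what such an overload could look like (the applicable class is hypothetical, and Arguments is the class from the earlier comment above):

from functools import partial

class applicable:
    def __init__(self, function):
        self._function = function
    def __call__(self, *args, **kwargs):
        return self._function(*args, **kwargs)
    def __matmul__(self, arguments):
        # f @ Arguments(...) becomes partial application:
        return partial(self._function, *arguments.args, **arguments.kwargs)

@applicable
def my_function(foo, bar, *, qux):
    return (foo, bar, qux)

a = Arguments
call = my_function @ a(1, 2, qux=3)  # partial(my_function, 1, 2, qux=3)
assert call() == (1, 2, 3)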

And that's the kind of conciseness that I think a lot of people looking for operator overloads for functional programming would love. I mean I could see myself appreciating it, especially if all these operator overload ideas were bundled into one:

(f | g)@a(foo, bar, qux=1, alp=2)

etc.

(On the flip side, this is starting to turn into a great demonstration of why operator overloads are worse for reading/understanding - when you see partial(compose(g, f), foo, bar, qux=1) you just have to know the concepts of function composition and partial application - when you see (f | g)@a(foo, bar, qux=1) you also have to know the mapping from operators to those concepts.)

Anyway, one way to clean some of this up is that once a good currying implementation is available, f @ arg could be overloaded as curry(f)(arg), which can then be written f@(arg), and then you start being able to chain it as f@(foo)(bar)(qux=1)(alp=2) (except your first argument can't be a keyword argument).

Another angle to take is a unary operator overload: for example, ~f to mean partial(partial, f), which then lets you do (~f)(foo, bar, qux=1).
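
For example, the unary version could be sketched like this (deferrable is a hypothetical name):

from functools import partial

class deferrable:
    def __init__(self, function):
        self._function = function
    def __call__(self, *args, **kwargs):
        return self._function(*args, **kwargs)
    def __invert__(self):
        # ~f means partial(partial, f):
        return partial(partial, self._function)

@deferrable
def report(foo, bar, *, qux):
    return (foo, bar, qux)

make_call = (~report)(1, 2, qux=3)  # a partial holding the arguments
assert make_call() == (1, 2, 3)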

I don't really like any of this personally - I don't like code turning into symbol soup. But I think it's worth thinking through.

mentalisttraceur commented 2 years ago

Anyway, back on target: I am growing rather tired of all this wrapper and operator overload stuff. I don't think I'm really the guy for the job. I don't have as much passion or taste for operator overloads in Python for functional programming.

I mainly just have the passion and taste for getting it implemented as right as possible, if it's going to be implemented at all.

So I may just have to call it quits, in which case I'll probably delete this repo and PyPI project (leave the space open for someone else to come in and take over if they want).

mentalisttraceur commented 2 years ago

Now that I've taken a few days of break, I'll try to give it one more day. If I don't have a great solution to the nested wrapping (and operator precedence when that happens) issues which I feel really great about, or if I don't feel good about having an ongoing commitment of some kind to a library that provides this functionality, I'll probably wipe this out.

(It's not that I can't come up with a great solution, it's that my life priorities and what I actually want to work on don't justify the amount of time and effort and care this pitfall-ridden and frequently-too-bike-shedding-y design space demands.)

mentalisttraceur commented 2 years ago

The most annoying thing about collapsing/flattening redundant wrappers is that a composition like composable(Foo(composable_constructor(...))) can't know:

  1. does Foo rely on the thing it is wrapping to have exactly the interface of composable_constructor?
  2. does Foo mask or change the interface of composable_constructor?
  3. (if the answers to the first two questions are no, then this one also becomes potentially useful) is there any way to safely shallow-copy Foo? (Edit: eh, I guess you could just try a copy, and do a general except Exception on it, since there's no standard or norm we can rely on for which exceptions classes that don't implement copying will throw - ideally it's just TypeError, but then people do things like NotImplementedError, and some people might convince themselves yet other exceptions make sense... but perhaps an except (TypeError, NotImplementedError) is a fine middle ground.) (Edit 2: this is actually easier if we don't worry about arbitrary transparent wrappers, but instead only worry about our classes (and any subclasses) - then we can just require, as part of subclasses fulfilling the class interface, that if copying is unsupported, it raises a specific error - and that copying actually copies the wrapper, not the thing inside the wrapper.)

The second most annoying thing about collapsing/flattening wrappers is that because of these issues, even the best possible collapsing/flattening implementation doesn't actually solve any of the operator overload problems.

Yet another annoying thing about collapsing/flattening wrappers is that even if Foo is a subclass of one of our composable* classes, we can only presume to replace it in the simple "cast"/conversion construction case. Basically composable(Foo(...)) can be justifiably flattened to composable(...) if Foo is a subclass of composable or composable_constructor but not if it is a subclass of composable_instances. Ditto vice-versa for composable_instances(Foo(...)).

The final annoying thing about collapsing/flattening wrappers is that it's still very tempting to me. It's a nice-to-have feature because it makes reprs way nicer and reduces the number of needless intermediate calls through wrapper implementations of __call__, __or__, and so on...

mentalisttraceur commented 2 years ago

Aside: One more nuisance of using wrapt.CallableObjectProxy - gotta manually implement __copy__ and __deepcopy__ (or at the very least, explicitly set them to None in the class body, to suppress the explicit always-error-raising overrides the object proxy base class does, and recover the default copy behavior).
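
The manual option would look roughly like this (assuming wrapt is installed; the class name is just for illustration):

import copy

import wrapt

class composable(wrapt.CallableObjectProxy):
    def __copy__(self):
        # Copy the wrapper, not the thing inside the wrapper:
        return type(self)(self.__wrapped__)
    def __deepcopy__(self, memo):
        return type(self)(copy.deepcopy(self.__wrapped__, memo))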

mentalisttraceur commented 2 years ago

Another hassle with the interaction between collapsing/flattening wrappers and copying is that we can achieve a nicer wrapper flattening only if __copy__ returns an instance of the wrapper, rather than an instance of the inner wrapped thing.

But I guess we can just repeat an isinstance test after a copy to rule out the case where it doesn't.

mentalisttraceur commented 2 years ago

Current untested draft of the best wrapper flattening we can do:

from copy import copy as _copy

def _copy_wrapper(obj, wrapper_classes):
    try:
        obj = _copy(obj)
    except TypeError:
        return None
    if not isinstance(obj, wrapper_classes):
        return None
    return obj

def _reduce_wrappers(obj, replaceable_classes, bypassable_classes):
    outer_wrapper = None
    outer_wrapper_copy = None
    if isinstance(obj, bypassable_classes):
        outer_wrapper = obj
        obj = obj.__wrapped__
        if not isinstance(obj, replaceable_classes):
            # Nothing to flatten under this wrapper - keep it as-is.
            return outer_wrapper
    if isinstance(obj, replaceable_classes):
        if outer_wrapper is not None:
            outer_wrapper_copy = _copy_wrapper(outer_wrapper, bypassable_classes)
            if outer_wrapper_copy is None:
                # Can't safely copy the outer wrapper - leave it alone.
                return outer_wrapper
        obj = obj.__wrapped__
    if outer_wrapper_copy is not None:
        try:
            outer_wrapper_copy.__wrapped__ = obj
        except AttributeError:
            return outer_wrapper
        obj = outer_wrapper_copy
    return obj

And then for example composable.__init__ would do:

        function = _reduce_wrappers(
            function,
            replaceable_classes=(composable, composable_constructor),
            bypassable_classes=composable_instances,
        )

I'm not really sure if the complexity and overhead is justified. It felt way more clearly justified in compose than it does here. In compose, flattening costs a couple very simple and straightforward code lines, and the benefit is obvious.

I guess the only further improvements you could make to the above are:

  1. Loop the reduce-wrappers logic as long as the-object-unwrapped-so-far is still an instance of either bypassable or replaceable classes. But that seems to clearly go over the "not worth it" threshold, especially because that loop addition would only help if redundant nestings already formed, but the above logic should prevent redundant nestings in the first place.
  2. Allow users to somehow register/declare/mark additional wrapper classes as orthogonal with (transparent to and not depending on) the composable* classes, so that they can participate in this flattening/bypassing. (If this is done, then it makes way more sense to have reduce-wrappers loop.)

Anyway, I'm going to set aside the wrapper flattening until after the operator overload is fixed.

mentalisttraceur commented 2 years ago

The solution to the operator overload precedence mess turned out to come real easy thanks to the conclusion I reached about four days ago: that I'd have to scan down the wrapper layers to see if the nearest relevant one was composable or composable_constructor, instead of just doing an isinstance(other, composable) check. Now instead I do _is_forced_composable(other):

from inspect import unwrap as _unwrap  # or any equivalent __wrapped__-chain walker

def _is_forced_composable(obj):
    composable_classes = (composable, composable_constructor)
    # Walk down the wrapper layers, stopping at the nearest composable
    # or composable_constructor, then check which one it was:
    obj = _unwrap(obj, stop=lambda obj: isinstance(obj, composable_classes))
    return isinstance(obj, composable)

And that works great, but with that out of the way I've uncovered yet another, at first glance at least as annoying problem....

mentalisttraceur commented 2 years ago

In short, the next problem is: composable_instances(SomeClass) | composable(some_function) will compose as compose(some_function, SomeClass) instead of compose(some_function, composable_instances(SomeClass)), which is a very wrong loss of composable_instances from the composition!

An easy way to see this is to replace some_function with identity: compose(identity, SomeClass) is equivalent to just SomeClass, which is clearly functionally different than composable_instances(SomeClass).
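
Spelled out (the module paths and importability here are assumptions, purely for illustration):

from compose import compose
from compose_operator import composable, composable_instances

class SomeClass:
    def __call__(self):
        return 'called'

def identity(x):
    return x

plain = compose(identity, SomeClass)()       # a bare SomeClass instance
wrapped = composable_instances(SomeClass)()  # a composable-wrapped instance
assert not isinstance(plain, composable)
assert isinstance(wrapped, composable)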

Anyway, the cause is that composable_instances.__or__ performs self.__wrapped__ | other, which is why the composable_instances disappears.

The initial obvious and tempting knee-jerk reach to go overload composable_instances.__or__ is clumsy make-it-workism - what's the semantic justification and coherency with the rest of the design for a class decorator whose sole purpose is to make instances composable to also overload any operators on the class itself? This is what made this issue super-annoying at first glance.

But I think there's a reasonably acceptable justification: remember that composable_instances(Foo) is basically compose(composable, Foo). In fact, if decorator transparency didn't matter, compose(composable, Foo) would be The Correct Solution, and it would be right to not provide any special composable_instances decorator at all. So! What's compose(..., compose(composable, Foo), ...)? Well, obviously it's compose(..., composable, Foo, ...). In other words, there is a natural, conceptually coherent and well-fitting overload for the composition operator on composable_instances. Basically, composable_instances(Foo) | f should really expand to Foo | composable | f, but with the type-union-protecting checks of composable_constructor.

And that seems pretty satisfactory and elegant.

mentalisttraceur commented 2 years ago

As a practical modification of the conclusion from the last post, it's in some ways nicer to have the composition still "look like" compose(..., composable_instances(Foo), ...), both in the repr and for programmatic introspection.

So it ends up being nicer to just reuse the | overloads from composable_constructor on composable_instances, rather than making slightly different ones which go out of their way to expand to composable and self.__wrapped__ as two separate composition items.

I think that's still conceptually/semantically clean, coherent, and justified: just because composable_instances(Foo) is in some sense a special form of compose(composable, Foo) doesn't mean composition is better off losing the additional information composable_instances(Foo) implies beyond compose(composable, Foo).

mentalisttraceur commented 2 years ago

Fiddling with the wrapper flattening some more:

def _reduce_wrappers(obj, removable_classes, bypassable_classes):
    if isinstance(obj, bypassable_classes):
        inner = obj.__wrapped__
        if isinstance(inner, removable_classes):
            try:
                outer_copy = _copy(obj)
            except TypeError:
                return obj
            if not isinstance(outer_copy, bypassable_classes):
                return obj
            try:
                outer_copy.__wrapped__ = inner.__wrapped__
            except AttributeError:
                return obj
            return outer_copy
        return obj
    if isinstance(obj, removable_classes):
        return obj.__wrapped__
    return obj

mentalisttraceur commented 2 years ago

I've decided to not pursue the more extensive flattening logic in this library.

In fact, I think I'm going to reduce the flattening even further, to just the simple case that is basically universally sensible for wrappers: Foo(Foo(...)) flattens itself down to Foo(...).

In particular, composable(composable_constructor(...)) and composable_constructor(composable(...)) will no longer flatten.

Because

  1. The cleverer the flattening, the more likely it is that a repr looks surprisingly or misleadingly different than what a developer expects.

  2. The more complicated and involved the flattening, the less of a strict win it is for performance (flattening pays more at wrapper construction time so that later operations on the wrapper execute faster - but the performance gain from flattening in these cases is at best just removing very straightforward, extremely optimization-friendly pass-through no-effect methods).

  3. For wrappers that provide a function composition operator overload, the "call" overheads (for example, calling .__or__ through the wrapper) are in the big picture still construction overheads: code calls them as syntax sugar to construct a composable function, not as part of calling that function. The actual call overheads come later, either after composition (by which time the whole thing has been flattened into a composable(sacompose(...))), or entirely bypassing any composition (such as when someone calls or operates on a function, class, or callable object that has been decorated by composable or composable_constructor - a situation that in normal usage shouldn't have any wrappers for us to flatten anyway).

TL;DR: _reduce_wrappers is zero benefit in the normal case, tiny dubious benefit or maybe even net harm in a subset of unlikely edge cases.
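
For reference, the surviving flattening is roughly this shape (a sketch, not the library's exact code):

import wrapt

class composable(wrapt.CallableObjectProxy):
    def __init__(self, function):
        # composable(composable(f)) flattens itself down to composable(f):
        if isinstance(function, composable):
            function = function.__wrapped__
        super().__init__(function)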

mentalisttraceur commented 2 years ago

Alright, I think this design is stable and polished enough to take up to a v1 release, unless I find any other flaws or challenges while cleaning up the tests or writing the docs.

Couple of ideas for later:

  1. an uncomposable wrapper/decorator, which always takes precedence over composable and composable_constructor, to take care of cases such as:

    • if you have a callable object that already has | overloaded, and it's wrapped with one of the composable wrappers, you can still "pierce down" to its | overload, and
    • if you have an expression using |, and you're not sure if the operands are wrapped with composable, you can make sure composable doesn't "contaminate" the expression.
  2. Revisit requiring wrapt as a mandatory dependency - in principle we could try the wrapt import and gracefully degrade to a much less transparent wrapper class if it raises an ImportError (rough sketch at the end of this comment).

  3. Maybe provide some more opportunities for plugging your own classes in - composable could use a different compose or offer composition on another operator like +, composable_{constructor,instances} could use a different composable (such as one that offers other operators or uses another composition function), and composable_instances could even in principle be usefully expanded to decorate metaclasses by having every class come out decorated with composable_{constructor,instances}.

  4. Type-hinting! I originally wanted to get this in the v1 release but I might not have the time or the energy right now.

Type-checking-enabling annotations are intended as soon as I get to them; the rest are on maybe-once-we-run-into-a-use-case standby.
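
A rough sketch of the graceful degradation in idea 2 (the fallback class is illustrative, and nowhere near as transparent as wrapt's proxies):

try:
    from wrapt import CallableObjectProxy as _CallableObjectProxy
except ImportError:
    class _CallableObjectProxy:
        # Much less transparent fallback: forwards calls and exposes
        # __wrapped__, but does none of wrapt's full attribute proxying.
        def __init__(self, wrapped):
            self.__wrapped__ = wrapped
        def __call__(self, *args, **kwargs):
            return self.__wrapped__(*args, **kwargs)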