kalekundert / byoc

MIT License
0 stars 0 forks source link

Make separate callbacks for `set` and `load` #7

Closed kalekundert closed 3 years ago

kalekundert commented 3 years ago

The issue is that these two events are just different enough that they require different responses:

If I do make a callback specifically for setting properties, I had an idea to support a syntax like this:

class Obj:

    def _set_x(self, x):
        self.x = x
        self.y = x + 1

    x = appcli.param(set=_set_x)

Note that _set_x() itself assigns to x, which seems at first like it would trigger infinite recursion. However, this can be prevented:

class param:
    ...
    def __set__(self, obj, x):
        state = self._load_state()

        if not self._set or state.inside_set_callback:
            state.setattr_value = x
            return

        else:
            state.inside_set_callback = True
            try:
                self._set(obj, x)
            finally:
              state.inside_set_callback = False
    ...

The nice thing about this syntax is that it makes it possible to set the attribute, then to subsequently call some other function that might make use of the attribute value. The other syntax I had in mind, which was to have the set callback return the processed value, would not allow this.

That said, you would still need to worry about infinite recursion in a situation like this (unless I were to temporarily disable set callbacks for the whole object, which just seems like a bad idea):

class Obj:

    def _set_x(self, x):
        self.x = x
        self.y = x + 1

    def _set_y(self, y):
        self.y = y
        self.x = y - 1

    x = appcli.param(set=_set_x)
    y = appcli.param(set=_set_y)
kalekundert commented 3 years ago

Another idea is to get rid of the set/load callbacks and to add the idea of "param_groups". Some syntax examples are shown below. This is basically the "do everything on get access" idea from before (see #4), but with the caching handled behind the scenes (and built on top of the existing parameter caching functionality):

class Obj:
    x = appcli.param()
    y = appcli.param()

    # - This would be called on __get__ to cache values.
    # - What if both `x` and `y` were set between calls to get?  I think I'd 
    #   need to write something more like what I have below in order for this 
    #   to work.
    @param_group(x, y)
    def _calc_xy(self, result):
        if self.x:
            result.y = self.x + 1
        if self.y:
            result.x = self.y - 1
class Obj:
    _x = appcli.param()
    _y = appcli.param()
    _z = appcli.param()

    # This will be called and cached on attribute access.
    # The cache would be invalidated on load (via `cache_version`) and on __set__.
    @param_group(x, y, z)
    def _calc_xyz(self):
        if not self._x:
            self._x = ...
        if not self._y:
            self._y = ...
        if not self._z:
            self._z = ...

    def get_x(self):
        return self._x

    def get_y(self):
        return self._y

    def get_z(self):
        return self._z

    def set_xy(self, x, y):
        self._x = x
        self._y = y
        self._z = None

    def set_yz(self, y, z):
        self._x = None
        self._y = y
        self._z = z

    def set_zx(self, z, x):
        self._x = x
        self._y = None
        self._z = z

Pros:

Cons:

kalekundert commented 3 years ago

Wait, can I do the above with just the get callbacks that are already implemented? No, because those values aren't cached (to be consistent with property). But maybe it would be easier to add a cached get? Or make it so that get is only called on every access for dynamic attributes?

Actually, it's easy to implement caching directly in the getters; just assign to the attribute if it doesn't have a value:

class Obj:

    @appcli.param()
    def _x(self, x):
        if not x:
            self._x = ...
        return x

    @appcli.param()
    def _y(self, y):
        if not y:
            self._y = ...
        return y

    @appcli.param()
    def _z(self, z):
        if not z:
            self._z = ...
        return z

    def get_x(self):
        return self._x

    def get_y(self):
        return self._y

    def get_z(self):
        return self._z

    def set_xy(self, x, y):
        self._x = x
        self._y = y
        self._z = None

    def set_yz(self, y, z):
        self._x = None
        self._y = y
        self._z = z

    def set_zx(self, z, x):
        self._x = x
        self._y = None
        self._z = z

I could even provide a cache() wrapper to make this even easier.

Actually, it's even easier to just implement caching in the property getters:

class Obj:

    _x = appcli.param()
    _y = appcli.param()
    _z = appcli.param()

    def get_x(self):
        if not self._x:
            self._x = ...
        return self._x

    def get_y(self):
        if not self._y:
            self._y = ...
        return self._y

    def get_z(self):
        if not self._z:
            self._z = ...
        return self._z

    def set_xy(self, x, y):
        self._x = x
        self._y = y
        self._z = None

    def set_yz(self, y, z):
        self._x = None
        self._y = y
        self._z = z

    def set_zx(self, z, x):
        self._x = x
        self._y = None
        self._z = z

A drawback with this approach is that it isn't easy to check for config errors serpartely from python errors. What I mean by that is that python can enforce that all of the required information is provided, but this may need to be checked when parsing user input. Adding a check into the code above would run the check on every assignmnt, which is superfluous.

kalekundert commented 3 years ago

Another alternative is to write the getter and setter properties just like usual, and to manually call the setters after each load. The idea here is that, since you'll have to write getter/setter properties anyways (see above for why these are unavoidable in the general case for groups of dependent params), it'd be nice to be able to write them like you normally would. Then you just need an on_load() callback to get everything working before any setters are called.

class Obj:
    _x = appcli.param()
    _y = appcli.param()
    _z = appcli.param()

    @appcli.on_load(DocoptConfig)  # config_cls argument is optional
    def _calc_xyz(self):
        if not self._x:
            self.set_yz(self._y, self._z)
        if not self._y:
            self.set_zx(self._z, self._x)
        if not self._z:
            self.set_xy(self._x, self._y)

    def get_x(self):
        return self._x

    def get_y(self):
        return self._y

    def get_z(self):
        return self._z

    def set_xy(self, x, y):
        self._x = x
        self._y = y
        self._z = ...

    def set_yz(self, y, z):
        self._x = ...
        self._y = y
        self._z = z

    def set_zx(self, z, x):
        self._x = x
        self._y = ...
        self._z = z

With this approach, I would need to keep in mind that the on_load() callback could be called when all of the params already have values and nothing actually needs to change. This might require a bit of defensive programming.

The apps in stepwise_mol_bio wouldn't even really need this callback, because they have a load() function that could be overridden to the same effect. But if this is the route I go, there should be a decorator like the one above in appcli.

This goes back to the idea from #4 of having per-config callbacks (but with nicer syntax). I ended up implementing per-param callbacks, but I was only thinking about the case where one y is dependent on x, but x isn't dependent on y. When the dependency graph is circular (which it almost always is), per-params callbacks are not a viable solution. In contrast, I think per-config callbacks would be a pretty powerful tool for resolving interdependent parameters. The only downside I can think of so far is that they might get called more than necessary, but this is easy to deal with and not surprising.

kalekundert commented 3 years ago

For now I'm just going to remove the idea of set callbacks. If I decide to add them back later, it will only be for use with __set__, I think.