Quasimondo opened 6 years ago
This is absolutely possible! There are a bunch of ways you could do it, depending on how fine-grained you need the control to be. The more principled, fine-grained ones are, unfortunately, more expensive.
The easy way to do this is to work at the objective level, weighting the objective as you define it:
```python
def obj(T):
    acts = T("mixed4a")
    # Construct a horizontal gradient for objective weights
    # (code is a little complicated so that it can respond to different
    # image sizes, which change the shape of `acts`)
    t_shape = tf.shape(acts)
    W = tf.cast(tf.range(0, t_shape[1])[None, None, :], "float32")
    W /= tf.cast(t_shape[1] - 1, "float32")
    # objective is two different objectives, weighted by the weights
    return tf.reduce_sum(W * acts[..., 465] + (1 - W) * acts[..., 476])

_ = render.render_vis(model, obj, param_f=lambda: param.image(200))
```
However, this approach has two downsides:
(1) It doesn't give you very fine-grained spatial control. (2) It doesn't account for the fact that different objectives may have gradients of very different magnitudes: if one is more responsive, it will dominate the mix.
In theory, there's a more principled thing you can do: backprop each objective separately, normalize the gradients, and then mix them at the pixel level.
Unfortunately, this is significantly more annoying and intrinsically more computationally expensive (backprop for each objective you want to mix). :/
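To make the idea concrete, here is a minimal NumPy sketch of the normalize-and-mix step (not lucid code; `grad_a` and `grad_b` stand in for the image-space gradients you'd get by backpropagating each objective separately):

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 8
# Stand-ins for per-objective image gradients with very different scales:
grad_a = rng.normal(0, 5.0, (H, W, 3))   # a "sharp" objective
grad_b = rng.normal(0, 0.1, (H, W, 3))   # a much weaker one

def normalize(g, eps=1e-8):
    # Rescale to unit RMS so each objective contributes comparably.
    return g / (np.sqrt(np.mean(g ** 2)) + eps)

# Horizontal mixing weights in [0, 1], one per column, broadcast over
# rows and color channels.
mix = np.linspace(0.0, 1.0, W)[None, :, None]

# Pixel-level blend of the two normalized gradients; this is the update
# direction you would feed to the optimizer instead of a single
# objective's gradient.
combined = mix * normalize(grad_a) + (1 - mix) * normalize(grad_b)
```

After normalization, the weak objective is no longer drowned out, so the spatial mask actually controls which objective wins at each pixel.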
Well, you might be able to periodically backprop each objective and use the gradient magnitudes to dynamically re-weight the objectives. This could get you some of the benefits of the principled approach without its full cost.
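A hypothetical sketch of that re-weighting rule (the function name and interface are made up for illustration): every so often, measure each objective's recent gradient magnitude and set its weight inversely proportional, so neither objective dominates between measurements.

```python
import numpy as np

def reweight(grad_mags, eps=1e-8):
    """Given recent per-objective gradient magnitudes, return objective
    weights inversely proportional to them, normalized to sum to 1."""
    w = 1.0 / (np.asarray(grad_mags, dtype=float) + eps)
    return w / w.sum()

# E.g. if objective A's gradients have been ~50x larger than B's,
# B gets almost all of the weight to compensate:
weights = reweight([5.0, 0.1])
```

You'd then plug these weights into the cheap objective-level mix above, recomputing them only every N steps rather than backpropagating every objective at every step.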
Would it theoretically be possible to create an objective that uses either a control map or a callback function that changes the channel objective on a coordinate-based criterion? A simple example would be a vertical gradient between two channels.
I guess this would be computationally expensive, but maybe there are also reasons why this is not possible at all?