Closed: holynski closed this 4 years ago
Great contribution, thank you Aleksander! I have been using the following for a while now for this exact reason.
```python
tenNormalize = tenOutput[:, -1:, :, :]
tenNormalize[tenNormalize.abs() < 0.0000001] = 1.0
tenOutput = tenOutput[:, :-1, :, :] / tenNormalize
```
The only difference to yours is that I use `tenNormalize.abs() < 0.0000001`, whereas you use `tenNormalize == 0.0`. Do you think one is preferable over the other?
I think the inequality will still have issues for accumulated metrics smaller than your defined epsilon. If the magnitude of the accumulated metric (`tenNormalize` in your code) is in the range `[0.0000001, inf)`, your version should solve the problem. But if it is in the range `(0, 0.0000001)`, it gets set to 1, meaning you'll be dividing by something much larger than the actual accumulated metric, which prevents `tenOutput` from being scaled back to its correct value.
The edge case I'm imagining is:

- Input image values all equal `100`
- `tensorMetric.exp()` values all equal `1e-100`
- Flow is identity (all zeros)

You'd expect the identity transformation, i.e. to get an image with values `100`. With the inequality version, you'll get `(100 * 1e-100) / 1 ~= 1e-98`.
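A minimal PyTorch sketch of that failure mode (hypothetical values; I use `1e-30` instead of `1e-100` so the metric stays representable in float32, and a 1x1 "image" to keep the tensors tiny):

```python
import torch

# Accumulated output: metric-weighted color in channel 0, summed metric in the last channel.
tenOutput = torch.tensor([[[[100.0 * 1e-30]],   # 100 (color) * 1e-30 (metric)
                           [[1e-30]]]])          # accumulated metric

tenNormalize = tenOutput[:, -1:, :, :].clone()
tenNormalize[tenNormalize.abs() < 0.0000001] = 1.0  # 1e-30 < eps, so it is replaced by 1
tenResult = tenOutput[:, :-1, :, :] / tenNormalize

print(tenResult)  # ~1e-28 instead of the expected 100.0
```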
I think what we want here is just to protect against the divide-by-zero. There's a TensorFlow function for this, but I haven't seen a PyTorch equivalent: https://www.tensorflow.org/api_docs/python/tf/math/divide_no_nan
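For reference, a rough PyTorch stand-in could look like the following (my own sketch using `torch.where`, not an existing PyTorch function):

```python
import torch

def divide_no_nan(tenNumerator, tenDenominator):
    # Return 0 wherever the denominator is exactly zero, like tf.math.divide_no_nan.
    # Divide by a "safe" denominator first so no inf/nan is ever materialized.
    tenSafe = torch.where(tenDenominator == 0.0, torch.ones_like(tenDenominator), tenDenominator)
    tenQuotient = tenNumerator / tenSafe
    return torch.where(tenDenominator == 0.0, torch.zeros_like(tenQuotient), tenQuotient)
```

In the splatting case the numerator is also zero wherever the summed metric is zero (nothing was splatted there), so returning zero at those pixels matches the intended behavior, e.g. `divide_no_nan(tenOutput[:, :-1, :, :], tenOutput[:, -1:, :, :])`.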
That all makes sense to me, thanks again for such a great contribution! :+1:
When dividing the accumulated output tensor by the summed weights, an epsilon is added, presumably to prevent division by zero, which happens when no input values splat to a particular destination pixel.
However, if there are splatted values at a particular destination pixel but the metric values are small enough (say -50 as an input metric, which becomes e^-50 ≈ 2e-22), the addition of `eps=0.0000001` prior to division causes precision errors, effectively clobbering the accumulated metric. With this change, the logic for preventing divide-by-zero is only applied to pixels which have zero splatted metric.
I've tested the change, and it seems to produce the correct result for all metric values. There may be a minor performance hit on certain machines due to the conditional assignment on line 353, but in my tests (`B=32`, `H,W=512`, `C=64` on a 2080 Ti), the timings actually seem to be negligibly faster with this change, not sure why. I've also experimented with the alternative of creating a mask tensor and only evaluating the quotient at non-zero pixels, but that ends up being slightly slower.