pmh47 / dirt

DIRT: a fast differentiable renderer for TensorFlow
MIT License

problem about gradients #71

Closed ffiioonnaa closed 4 years ago

ffiioonnaa commented 4 years ago

Hi, I am trying to train a network to estimate the camera extrinsic parameters [R T] from a single image by minimizing the loss between a mask and the rendered silhouette of the object. R is represented in axis-angle form (x, y, z). However, I find that the output y tends to become positive as training proceeds. Is this related to the gradients? Can the range of y values represent the range of rotation angles around the y axis? I thought it would be between -pi and pi. I don't know much about rendering, so I wonder if you have any suggestions. Thank you.

pmh47 commented 4 years ago

As with a 'simple' angle in the plane, values wrap around so 0 and 2pi are equivalent, and there is nothing that means the model should 'prefer' the -pi...pi range. If you really want it to be in that range, use a tanh activation and multiply by pi.
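
To make the two points above concrete, here is a minimal sketch in plain Python (standard library only, so it runs without TensorFlow or DIRT). The helper names `bounded_angle` and `wrap_to_pi` are hypothetical and not part of DIRT; in a real model the same expression would presumably be applied to the network output with `tf.tanh`.

```python
import math

def bounded_angle(logit):
    """Squash an unconstrained value into (-pi, pi) using tanh."""
    return math.pi * math.tanh(logit)

def wrap_to_pi(theta):
    """Return the angle in [-pi, pi) that is equivalent to theta."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi

# An unbounded prediction such as 7.0 rad describes the same rotation
# as its wrapped value, so a rendering loss cannot tell the two apart.
raw = 7.0
wrapped = wrap_to_pi(raw)
same_rotation = (
    math.isclose(math.cos(raw), math.cos(wrapped), abs_tol=1e-9)
    and math.isclose(math.sin(raw), math.sin(wrapped), abs_tol=1e-9)
)
```

Here `wrap_to_pi` only re-labels an angle after the fact, whereas `bounded_angle` constrains what the model can output in the first place.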

In general, optimising for angles with gradient descent is difficult, as there are typically local minima to get trapped in if the 'current' prediction is significantly wrong. Directly optimising an angle-axis vector is even harder due to the singularity at the origin. I suggest you start by simplifying the problem, e.g. assume that only a y-rotation is needed, and parameterise by just a single angle, i.e. theta * [0, 1, 0] as your input to rodrigues, where theta is predicted by the neural network. If you have control of the ground-truth data, also start by training only on examples where the required y rotation is small, due to the issue of local minima I mentioned.
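
The single-angle parameterisation can be sketched as follows. This is a standalone illustration of the standard Rodrigues formula in plain Python, not DIRT's implementation: the `rodrigues` and `y_rotation` functions below are written for this example, assume the column-vector convention, and return a 3x3 matrix, all of which may differ from `dirt.matrices.rodrigues`. The explicit branch for a near-zero vector marks where the rotation axis becomes undefined at the origin.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(v, eps=1e-12):
    """3x3 rotation matrix for the axis-angle vector v (angle = |v|)."""
    theta = math.sqrt(sum(x * x for x in v))
    if theta < eps:
        # The axis v / |v| is undefined here; fall back to no rotation.
        return [row[:] for row in IDENTITY]
    kx, ky, kz = (x / theta for x in v)
    # Cross-product matrix of the unit axis.
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = matmul(K, K)
    s, c = math.sin(theta), math.cos(theta)
    # R = I + sin(theta) K + (1 - cos(theta)) K^2
    return [[IDENTITY[i][j] + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]

def y_rotation(theta):
    """Rotation about the y axis only: theta * [0, 1, 0] fed to rodrigues."""
    return rodrigues([0.0, theta, 0.0])

# A quarter turn about y; theta is the single value the network would predict.
R = y_rotation(math.pi / 2)
```

With only one free parameter, the network never has to choose an axis, so the ill-defined region around the zero vector is reduced to the single, harmless case theta = 0.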

I'll close this as it's not an issue with DIRT in itself.

ffiioonnaa commented 4 years ago

Thank you~ I'm still a little confused: if I want the y-rotation to range over 0 to 2pi, and the x-rotation and z-rotation to range over 0 to pi/3, how do I work out the ranges of x, y, z in the angle-axis representation? Are they still 0 to 2pi and 0 to pi/3?

pmh47 commented 4 years ago

Use different matrices for the different rotations in that case, and multiply/compose them together. So something like

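import numpy as np
import tensorflow as tf
import dirt.matrices

# x_logit, y_logit, z_logit: unconstrained network outputs
# Squash each logit into the desired angle range
x_rot = tf.sigmoid(x_logit) * np.pi / 3  # (0, pi/3)
y_rot = tf.sigmoid(y_logit) * 2 * np.pi  # (0, 2pi)
z_rot = tf.sigmoid(z_logit) * np.pi / 3  # (0, pi/3)
# One rotation matrix per axis, composed into a single transform
M = dirt.matrices.compose(
    dirt.matrices.rodrigues([0, 1, 0] * y_rot),
    dirt.matrices.rodrigues([1, 0, 0] * x_rot),
    dirt.matrices.rodrigues([0, 0, 1] * z_rot)
)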
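
The range scaling used in the snippet above can be checked in isolation. The following is a small sketch in plain Python rather than TensorFlow; `sigmoid` and `scaled_angle` are helper names invented for this example, and the logit values are arbitrary.

```python
import math

def sigmoid(x):
    """Logistic function mapping any real number into (0, 1).

    Written via tanh, which is mathematically identical to 1 / (1 + exp(-x))
    but does not overflow for large negative x.
    """
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def scaled_angle(logit, max_angle):
    """Map an unconstrained logit into the interval (0, max_angle)."""
    return sigmoid(logit) * max_angle

# Arbitrary example logits standing in for network outputs.
x_rot = scaled_angle(-2.0, math.pi / 3)
y_rot = scaled_angle(0.0, 2.0 * math.pi)
z_rot = scaled_angle(4.0, math.pi / 3)
```

A logit of 0 lands in the middle of the range (here y_rot equals pi), while the two ends of each range are only approached asymptotically, so gradients shrink as a prediction nears either bound.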

ffiioonnaa commented 4 years ago

Thanks~ So in this case (x_rot, y_rot, z_rot) represent Euler angles?

pmh47 commented 4 years ago

Yes
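
Because the three single-axis rotations are multiplied together, the order of composition is part of the Euler-angle convention. The sketch below uses plain Python with explicitly written axis rotations in the column-vector convention; that convention and the function names are assumptions of this example, and it does not claim to reproduce the order in which `dirt.matrices.compose` applies its arguments. It only shows that the same three angles give a different rotation when the order is swapped.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def euler_yxz(x_rot, y_rot, z_rot):
    """R = Ry Rx Rz; with column vectors the z rotation acts first."""
    return matmul(rot_y(y_rot), matmul(rot_x(x_rot), rot_z(z_rot)))

def euler_zxy(x_rot, y_rot, z_rot):
    """R = Rz Rx Ry; same angles, opposite order."""
    return matmul(rot_z(z_rot), matmul(rot_x(x_rot), rot_y(y_rot)))

# Arbitrary example angles within the ranges discussed in the thread.
a = euler_yxz(0.3, 1.2, 0.5)
b = euler_zxy(0.3, 1.2, 0.5)
max_diff = max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))
```

Here `max_diff` is clearly non-zero, so when comparing predicted angles against ground truth it matters that both use the same axis order.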