microsoft / Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

Training using Binary Coefficients instead of Floating Coefficients extracted from R-Net #180

Open AsiyaNaqvi opened 2 years ago

AsiyaNaqvi commented 2 years ago

Hi, thank you for sharing this amazing project.

I am trying to convert the extracted shape coefficients from R-Net into binary values and use these binary values to reconstruct the face shape. For this purpose I am using a Straight-Through Estimator (STE) with the following code:

import tensorflow as tf

@tf.custom_gradient
def quantize(x):
    # Forward pass: hard-threshold the coefficients to {0, 1}.
    x = tf.cast(tf.greater(x, 0, name="fc-binary"), tf.float32, name="fc-binaryfloat")
    def grad(dy):
        # Backward pass: squash the incoming gradient with tanh
        # (note: a canonical STE would return dy unchanged).
        return tf.tanh(dy)
    return x, grad

I am only converting the shape coefficients to binary, while the rest of the coefficients remain the same. I call this function in networks.py and pass the extracted net_id to quantize.
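For context, a minimal sketch of how this wiring might look, assuming the repo's 257-dim coefficient layout (80 identity, 64 expression, 80 texture, 3 rotation, 27 illumination, 3 translation); binarize_id_coeff is a hypothetical helper of mine, not a function from networks.py:

def binarize_id_coeff(coeff):
    # coeff: [batch, 257] R-Net output, split per the assumed layout above.
    net_id, rest = coeff[:, :80], coeff[:, 80:]
    # Binarize only the identity (shape) coefficients with the STE above;
    # all other coefficients pass through unchanged.
    return tf.concat([quantize(net_id), rest], axis=1)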

When I started training, the model seemed to be training fine in the initial iterations, as you can see in the picture:

[image: early-iteration reconstruction result]

But as I trained it further, the results got very bad, as you can see:

[image: later-iteration reconstruction result]

NOTE: This model is not training from scratch; I used pre-trained weights (trained on the CelebA dataset) to initialize the model and then trained it further on my custom dataset.

If you have any idea what might be causing this problem and can guide me, I would be really grateful.

YuDeng commented 2 years ago

I think you might need to adjust the regularization weight for the shape coefficients. You may check the distribution of the learned shape coefficients: if their values are too large compared to the initialized ones, you have to increase the regularization weight for them in order to get reasonable shapes.
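For example, a minimal TF 1.x sketch of logging these statistics (log_id_coeff_stats is a hypothetical helper; compare the logged values with the same statistics computed from the pretrained checkpoint):

import tensorflow as tf

def log_id_coeff_stats(id_coeff):
    # id_coeff: [batch, 80] identity coefficients predicted by R-Net.
    mean = tf.reduce_mean(id_coeff)
    std = tf.sqrt(tf.reduce_mean(tf.square(id_coeff - mean)))
    # These summaries make it easy to compare the running distribution
    # against the pretrained model's coefficients in TensorBoard.
    tf.summary.histogram('id_coeff', id_coeff)
    tf.summary.scalar('id_coeff_mean', mean)
    tf.summary.scalar('id_coeff_std', std)
    tf.summary.scalar('id_coeff_abs_max', tf.reduce_max(tf.abs(id_coeff)))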

AsiyaNaqvi commented 2 years ago

Thank you for the response.

I tried this by adding regularization on the id coefficients in the regulation loss function:

regulation_loss = w_shape * tf.nn.l2_loss(id_coeff) + w_ex * tf.nn.l2_loss(ex_coeff) + w_tex * tf.nn.l2_loss(tex_coeff)

I tried different values for w_shape, and I also tried to increase the w_id that is used for the perceptual loss, but the results still aren't getting any better.
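As a sanity check, here is a minimal sketch of how this loss could be instrumented (regulation_loss_with_logging and its default weight values are placeholders of mine, not the repo's code), with the unweighted id-coefficient norm logged so the effect of each w_shape setting is visible in TensorBoard:

import tensorflow as tf

def regulation_loss_with_logging(id_coeff, ex_coeff, tex_coeff,
                                 w_shape=1.0, w_ex=1.0, w_tex=1.0):
    # The default weights above are illustrative placeholders only.
    id_l2 = tf.nn.l2_loss(id_coeff)
    loss = (w_shape * id_l2
            + w_ex * tf.nn.l2_loss(ex_coeff)
            + w_tex * tf.nn.l2_loss(tex_coeff))
    # Log the unweighted id norm: if raising w_shape does not shrink it,
    # the regularizer is being dominated by the other loss terms.
    tf.summary.scalar('id_coeff_l2', id_l2)
    return loss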