Questions about training

microsoft / Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

MIT License

Questions about training #129

Closed hanssssssss closed 3 years ago

hanssssssss commented 3 years ago

Hi! I have some questions about training. Could you please tell me how large a dataset is needed to produce a reasonable model? I used 6000 faces with a batch size of 16, and I set training to run for 600 epochs. Is that a reasonable setting?

Sorry for bothering you. Thanks.

YuDeng commented 3 years ago

In our experiment, we trained the model with over 200k images. I think training the model with around 100k faces is reasonable. 6000 faces does not seem to be enough.

hanssssssss commented 3 years ago

Thanks for your reply. How about the number of epochs? I see you set train_maxiter = 200000; is that the number of epochs for training? I also have another problem. I tried to overfit a single image, but the gamma loss was always 0, even after I added a small number (1e-5). Do you have any idea why?

Sincerely appreciate your help.

yougusee commented 3 years ago

Excuse me, did you run into this problem during training? FailedPreconditionError (see above for traceback): Attempting to use uninitialized value InceptionResnetV1/Repeat/block35_2/Conv2d_1x1/biases

hanssssssss commented 3 years ago

I re-implemented this work in PyTorch. Sorry, I didn't run into a similar problem. Maybe you didn't download the pretrained FaceNet weights.

YuDeng commented 3 years ago

In our original experiment, we used around 260k images and trained the model with a batch size of 5 for 500k iterations. That is about 10 epochs.
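The iteration and epoch counts above are related by simple arithmetic: epochs = iterations × batch size / dataset size. A minimal sketch using the numbers quoted in this thread (the function names are illustrative and not part of the repository):

```python
def iterations_to_epochs(iterations, batch_size, dataset_size):
    """Number of passes over the dataset made by `iterations` steps."""
    return iterations * batch_size / dataset_size


def epochs_to_iterations(epochs, batch_size, dataset_size):
    """Number of optimizer steps needed for `epochs` full passes."""
    steps_per_epoch = -(-dataset_size // batch_size)  # ceiling division
    return epochs * steps_per_epoch


# Authors' setting: ~260k images, batch size 5, 500k iterations.
authors_epochs = iterations_to_epochs(500_000, 5, 260_000)

# Setting from the question: 6000 faces, batch size 16, 600 epochs.
question_iterations = epochs_to_iterations(600, 16, 6_000)

print(f"authors: {authors_epochs:.1f} epochs")
print(f"question: {question_iterations} iterations")
```

With these inputs the authors' run works out to a little under 10 epochs, which matches the "about 10 epochs" figure and suggests that train_maxiter is an iteration count rather than an epoch count.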

I have no idea why your gamma loss is always 0. Maybe you can save the gamma output during training to see what is going wrong.
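One way to follow this suggestion is to log the gamma (lighting) coefficients at every step and check whether they vary at all. The sketch below is plain Python and rests on two assumptions about the setup: gamma is a flat list of 27 values (3 color channels × 9 coefficients), and the gamma loss is the mean squared deviation of each coefficient from its mean across channels. `gamma_loss` and `log_gamma` are hypothetical helpers, not functions from the repository, and the sample values are made up.

```python
import csv
import io


def gamma_loss(gamma, channels=3, coeffs=9):
    """Mean squared deviation of each coefficient from its cross-channel mean (assumed form)."""
    rows = [gamma[c * coeffs:(c + 1) * coeffs] for c in range(channels)]
    total = 0.0
    for k in range(coeffs):
        mean_k = sum(row[k] for row in rows) / channels
        total += sum((row[k] - mean_k) ** 2 for row in rows)
    return total / (channels * coeffs)


def log_gamma(writer, step, gamma):
    """Write one CSV row: step, loss, min, max, then the raw gamma values."""
    writer.writerow([step, gamma_loss(gamma), min(gamma), max(gamma), *gamma])


# Two illustrative predictions (made-up values, not real network output).
identical_channels = [0.125 * k for k in range(9)] * 3  # same 9 values in every channel
differing_channels = [0.125 * i for i in range(27)]     # channels differ

buffer = io.StringIO()  # stands in for an open log file
writer = csv.writer(buffer)
log_gamma(writer, 0, identical_channels)
log_gamma(writer, 1, differing_channels)

loss_identical = gamma_loss(identical_channels)
loss_differing = gamma_loss(differing_channels)
print(buffer.getvalue())
```

Under the assumed loss form, a prediction whose three channels are identical (or all zero) gives a loss of zero no matter what small constant is added elsewhere, so inspecting the logged values can show whether the network output itself is the cause.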

hanssssssss commented 3 years ago

Thanks for your help! I will try to figure out what is happening with the gamma loss in my training.