help with training - Githubissues

y-x-c commented 3 years ago

Thanks for the awesome code! I am training my own model right now and have a few questions:

currently I am using 100k (out of around 1.8m) images from CelebAMask-HQ, ffhq and vggface to train the model. did you use the full set to train your model?
I didn't see large improvement for most losses anymore (160k steps trained, 4gpus x 12images/batch); is this normal? should I just continue training for more steps?
I also checked the validation results, and the reconstruction is not good.
I noticed shuffle for the training dataloader is not set to True, did you use the same setting?

Thanks!

usingcolor commented 3 years ago

Hi! You did very fast training!

Yes, I used full-set dataset. I don't know about IJB-C dataset. The distribution of dataset can influence to your model.
In the paper, they trained for 500K steps. I trained for over 500K. In my eye, your losses are getting down for attribute_loss but unstable for Rec and ID loss. In my case, the two losses are more stable and lower at the same steps.
shuffle option in training dataloader should be True. It is clearly my mistake while publishing.

y-x-c commented 3 years ago

Hi! You did very fast training!

Yes, I used full-set dataset. I don't know about IJB-C dataset. The distribution of dataset can influence to your model.

In the paper, they trained for 500K steps. I trained for over 500K. In my eye, your losses are getting down for attribute_loss but unstable for Rec and ID loss. In my case, the two losses are more stable and lower at the same steps.

shuffle option in training dataloader should be True. It is clearly my mistake while publishing.

Thanks for your reply.

I just corrected the description, I am using the same datasets (CelebAMask-HQ, ffhq and vggface) as well.
- So in your case, each step has 64 images; and let's say there are 1.5m images in those three datasets, so you trained for around 4 epochs (= 64 * 500000 / 1500000 / 5 ) in total?
- In my case, each step only has 48 images, so maybe that's why the two losses are higher at the same steps.
- I found the Rec loss is going much lower in the third epoch, and the results are much better than before. I will continue my current training and see what's going on.
Thanks for the clarification, I also changed to True during my training.

usingcolor commented 3 years ago

I trained with 32 batch size, it is the same as the paper. (Two V100 32G GPUs, 16 batch size for each)
Training GAN is very unstable. If your loss is going down, I think it works well.

Qiulin-W commented 3 years ago

Hi! You did very fast training!

Yes, I used full-set dataset. I don't know about IJB-C dataset. The distribution of dataset can influence to your model.

In the paper, they trained for 500K steps. I trained for over 500K. In my eye, your losses are getting down for attribute_loss but unstable for Rec and ID loss. In my case, the two losses are more stable and lower at the same steps.

shuffle option in training dataloader should be True. It is clearly my mistake while publishing.

Thanks for your reply.

I just corrected the description, I am using the same datasets (CelebAMask-HQ, ffhq and vggface) as well.

So in your case, each step has 64 images; and let's say there are 1.5m images in those three datasets, so you trained for around 4 epochs (= 64 * 500000 / 1500000 / 5 ) in total?

In my case, each step only has 48 images, so maybe that's why the two losses are higher at the same steps.

I found the Rec loss is going much lower in the third epoch, and the results are much better than before. I will continue my current training and see what's going on.

Thanks for the clarification, I also changed to True during my training.

Hi, did you change the coefficients of different loss terms? I found my training unstable with the coeffs provided by the author...

hanikh commented 3 years ago

I trained with 32 batch size, it is the same as the paper. (Two V100 32G GPUs, 16 batch size for each)

Training GAN is very unstable. If your loss is going down, I think it works well.

it means that you have used 'dp' instead of 'ddp'? since in 'ddp' mode the whole batch is not devided between GPUs.

payne4handsome commented 3 years ago

@y-x-c Hi, have you got the satisfying result? I trained just with FFHQ and CelebA-HQ datasets about 90 thousand images. The result is bad just like below.

princessmittens commented 3 years ago

By 4 epoch's you mean 26...

I am about two weeks into training at about the halfway mark. I noticed that some of the results on the colab show some image artifacts. Is that present in all final results? Did you manage to fix that with more training?

hanikh commented 3 years ago

By 4 epoch's you mean 26...

I am about two weeks into training at about the halfway mark. I noticed that some of the results on the colab show some image artifacts. Is that present in all final results? Did you manage to fix that with more training?

@princessmittens I am also working on this paper and I have some questions about this implementation. May I have your email address?

princessmittens commented 3 years ago

@usingcolor I am near the end of training and these are my results. I have trained on 8 gpu's with 32 gig ram and a 21 batch size/ per gpu. The results have been pretty bad so far. I tried my best to recreate the exact parameters with all 3 datasets (~1.3 million images after processing) and have trained for about 2-3 weeks. With my current batch size and according to the results, I'm at the 79% mark in reference to 500k.

@y-x-c -Have you been able to recreate better results? Is it worth continuing?

This has cost a lot of money/time Any input would be great.

@hanikh Not sure how much I can help you considering my results but my email is <>

Src: source Target: target target2 Results: testimg2 testimg

tamnguyenvan commented 3 years ago

Hello, anyone here got good result?

princessmittens commented 3 years ago

No-I have talked to @hanikh. I don't think anyone has been able to recreate the results as of yet.

antongonz commented 3 years ago

I have better results than this, princessmittens, can you leave your e-mail and I will reach out to you?

No-I have talked to @hanikh. I don't think anyone has been able to recreate the results as of yet.

tamnguyenvan commented 3 years ago

@cwalt2014 Can you please share the source code or pretrained weights? I will appreciate that. My email: tamvannguyen200795@gmail.com Thanks.

DeliaJIAMIN commented 3 years ago

@cwalt2014 Dear friend！Can you please share the source code? Thanks a million🙏🙏. My email: zhangjiajia827@gmail.com Thanks！🙏🙏🙏

princessmittens commented 3 years ago

@cwalt2014 my email is andreachristians@gmail.com

ZhiluDing commented 3 years ago

@cwalt2014 my email is dzl0418@gmail.com

Poloangelo commented 3 years ago

@cwalt2014 I would also love to know what changes you would suggest to get better results 🙏 my mail is paulchvn@gmail.com

Seanseattle commented 3 years ago

@cwalt2014 Could you please share the source code or pretrained weights? Thank you. My Email: seanzlxu@gmail.com

chinasilva commented 3 years ago

@cwalt2014 Thank you very much. My Email:476369545@qq.com

lefsiva commented 3 years ago

@cwalt2014 Thank you. My email is: lefsiva7@gmail.com

akafen commented 2 years ago

@cwalt2014 Thank you very much. My email is: 302926535@qq.com

tyrink commented 2 years ago

I have better results than this, princessmittens, can you leave your e-mail and I will reach out to you?

No-I have talked to @hanikh. I don't think anyone has been able to recreate the results as of yet.

Hi, @cwalt2014, could you please send me some of your results? I wonder what the possible results looks like. My email is 1085425753@qq.com. Any reply will be appreciated.

suzie26 commented 2 years ago

@cwalt2014 Can you share your code or pretrained weights?? Thank you soooo much!! My email is: suzieya26@gmail.com

Daisy-Zhang commented 2 years ago

@cwalt2014 Could you share your code or pretrained weights?? Thank you very much!! My email is: daisy.zdcc@gmail.com

usingcolor commented 2 years ago

@Daisy-Zhang @suzie26 @tyrink @akafen @lefsiva @chinasilva @Seanseattle @Poloangelo @ZhiluDing @princessmittens @DeliaJIAMIN @tamnguyenvan @cwalt2014 Check out HifiFace, our implementation of a more recent face-swapping model with the pre-trained model.

chuer-yu commented 2 years ago

@cwalt2014 hello, could you please share your pretrained weights? Thank you so much! My email is: chuer.yu1995@gmail.com

ywon0925 commented 2 years ago

@cwalt2014 could you please share your pre-trained weights? I would really appreciate it! My email is: 9788667@gmail.com

niuyuanc commented 1 year ago

@antonsanchez Could you please share the pretrained weights? Thank you so much 🙏🙏🙏 My email: niuyuanc@163.com Thanks🙏🙏🙏

galmizush commented 1 year ago

@cwalt2014 could you please share your pre-trained weights? I would really appreciate it! My email is: galmizush@gmail.com

Jaep0805 commented 1 year ago

@cwalt2014 could you please share your pre-trained weights? My email is: jaep0805@snu.ac.kr Thank you so much

maum-ai / faceshifter

help with training #6