Closed · Cwj1212 closed this issue 2 years ago
Thanks for your answer. I did a little experimentation on the CUB dataset with "sim = sim/temp, itd=5, itc=10", without modifying any other hyperparameters, and did get the results from your paper (training is not completely finished, but it is already close to the paper's numbers: FID=11.16 at 2000 kimg).
I am a beginner in deep learning, so my question may seem unprofessional. Thank you very much for answering it.
By the way, please update your final CUB results here after you finish training, so other people can know what the results will be under that setting. Thanks a lot.
I don't think you need to run 25000 kimg on CUB, because CUB has fewer than 9000 images.
I experimented with the setting "CUB dataset, sim = sim/temp, itd=5, itc=10, 2200 kimg" and got FID=10.89 (continued training does not lead to further improvement in FID), which is close to the 10.48 reported in the paper. If I later find a hyperparameter setting that gives better results, I'll report back in this issue. Thanks again for your work and your answers.
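For readers following the thread: the two settings being compared differ only in how the similarity matrix is scaled before the cross-entropy. A minimal sketch of the idea (hypothetical function and variable names, not the repo's exact code):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_feats, txt_feats, temp=0.5, exp_scaling=False):
    # Cosine similarity between L2-normalized image and text features.
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    sim = img @ txt.t()
    # The two variants discussed in this thread:
    sim = torch.exp(sim / temp) if exp_scaling else sim / temp
    # Each image is matched to its own caption; the other captions in the
    # batch serve as negatives.
    labels = torch.arange(sim.size(0))
    return F.cross_entropy(sim, labels)
```

With `exp_scaling=True` the logits are exponentiated before the softmax inside `cross_entropy`, which is why the two settings need different `itd`/`itc`/`temp` values.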
Which Inception model do you use: pre-trained on ImageNet, or fine-tuned on CUB?
An Inception model pre-trained on ImageNet is not usually used directly on CUB; AttnGAN, StackGAN, and DF-GAN use a fine-tuned one. Hence, I'm wondering whether the CUB result in the paper is directly comparable. Anyway, this work is great and impressive.
Yes, I think StackGAN, AttnGAN, DM-GAN, DF-GAN used the fine-tuned inception model to calculate the IS.
To calculate FID, DF-GAN and DM-GAN used the pre-trained model directly from torchvision.models.inception_v3. No FID is reported in StackGAN and AttnGAN. So I think FID results are OK.
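As a point of reference, once Inception features are extracted, FID is just the Fréchet distance between two Gaussians fitted to them. A minimal sketch (not the exact evaluation code used in the paper or in DF-GAN/DM-GAN):

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats1, feats2):
    """FID between two sets of Inception features, each of shape (N, D):
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2*sqrtm(C1 @ C2))."""
    mu1, mu2 = feats1.mean(axis=0), feats2.mean(axis=0)
    cov1 = np.cov(feats1, rowvar=False)
    cov2 = np.cov(feats2, rowvar=False)
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # numerical noise can add tiny imaginary parts
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```

This is why the choice of Inception model matters for FID as well: the Gaussians are fitted in whatever feature space the chosen model produces.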
As for the IS results, unfortunately, I don't have access to the original fine-tuned model used in StackGAN, and I think fine-tuning a model myself may not lead to fair results. Considering that other and future works may want to compare with our method, I chose to use the pre-trained Inception model, so everyone can quickly get results under a fair comparison.
But it will be interesting to see the IS results with the original fine-tuned inception model from StackGAN. I hope someone who has that fine-tuned model on their machine can test it later.
The fine-tuned model is available in DF-GAN and DM-GAN. Note that they used different Inception models on CUB and COCO. The fine-tuned model was first provided by StackGAN. In my experience, the IS of this work will not exceed 4.5 (tested with the fine-tuned Inception model). I highly recommend the authors update this metric in the paper.
DM-GAN and DF-GAN did not upload the fine-tuned Inception model themselves; they provided links, which basically lead to https://github.com/hanzhanggit/StackGAN-inception-model. However, the model is no longer available there. If you have the fine-tuned model, could you send it to me or upload it to Google Drive?
Can you explain and elaborate on "According to my experience, the IS of this work will not be more than 4.5"? I see that DF-GAN and DM-GAN have already obtained IS of 4.75 and 5.10 with the fine-tuned model. From your experience, what convinces you that the IS of this work will be worse than theirs and below 4.5? Thanks.
Sorry, I'm not familiar with text-to-image; I just guessed based on subjective judgement. After my experiments, the results accord well with the paper. Thank you for your great work and careful reply. I apologize for my crude comments. Best wishes!
It's OK, all discussions are welcome here :)
I have some doubts while reading your code. I know this part of the code comes from StyleGAN, but if you know about it, I hope you can answer them. The code uses DDP for distributed training; each batch is split into multiple rounds, with gradient accumulation across rounds.
- If batch size=64, gpu num=4, batch_gpu=8, then round=2. Therefore, when calculating the contrastive loss, if the GPUs within a round are gathered to obtain negative samples, the number of negatives is 32, not 64 (the full batch size). Is that so?
- For gradient accumulation, StyleGAN only puts the forward pass inside model.no_sync(), not backward(). Is such gradient accumulation effective? https://github.com/drboog/Lafite/blob/a79c66a407dd7996052b6c7c9d77a338380506b4/training/loss.py#L81-L82
https://github.com/drboog/Lafite/blob/a79c66a407dd7996052b6c7c9d77a338380506b4/training/loss.py#L268
I thought the all-reduce operations for gradient synchronization are carried out during backward(), but backward() is not inside model.no_sync(). Will unnecessary all-reduces still be prevented in this case?
These two doubts are not related to your paper itself but come from my shallow knowledge; thank you very much for answering them. As a beginner, I'm not sure I've expressed them clearly.
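To make the pattern concrete, here is a plain single-process sketch of the accumulation loop (the DDP no_sync() placement is noted in comments; this is my understanding of DDP's behavior, not the repo's exact code):

```python
import torch
import torch.nn.functional as F

def accumulated_step(model, optimizer, micro_batches):
    """One optimizer step over several micro-batches (gradient accumulation).

    With DDP, as far as I know the sync decision is recorded during forward():
    a forward pass run inside model.no_sync() marks its corresponding backward
    to skip the gradient all-reduce, even if backward() itself is called
    outside the context. That would make a forward-only no_sync() placement
    valid, as long as the last round's forward runs outside no_sync().
    """
    optimizer.zero_grad()
    for x, y in micro_batches:
        # DDP: every round except the last would run this forward pass
        # under `with model.no_sync():`.
        loss = F.mse_loss(model(x), y) / len(micro_batches)
        loss.backward()  # gradients sum into .grad across rounds
    optimizer.step()
```

Accumulating over micro-batches this way produces the same gradients as one backward over the full batch, which is the point of the round mechanism.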
In my implementation, I manually set "batch size = 16*gpus" (so round is 1), and the contrastive loss is computed per GPU, i.e. using 16 samples instead of 64. If you want to calculate the contrastive loss across GPUs using all 64 samples, you can add "--gather=True", but then you have to tune the related hyper-parameters (itd, itc, temp); see https://github.com/drboog/Lafite/blob/a79c66a407dd7996052b6c7c9d77a338380506b4/training/loss.py#L215 In your "batch size=64, gpu num=4, batch_gpu=8, round=2" example, the contrastive loss will be calculated on 8 samples with "--gather=False" and on 32 samples with "--gather=True".
I'm not sure about the second question.
Thank you very much for always answering my doubts promptly, even though some questions are not related to the paper. I really appreciate it!
You are welcome :)
First of all, thank you very much for your paper; it has been a huge help to me. The project you uploaded has also greatly helped my research. I want to ask you a few questions.
1. Are the results shown in the paper based on "sim = torch.exp(sim/temp), itd=10, itc=20"? And what is the result with "sim = sim/temp, itd=5, itc=10"? Under the "sim = sim/temp" setting, is "itd=5, itc=10" optimal?
2. I am using 4 Nvidia 1080 GPUs for training, and it takes me 15 days to run a 25000 kimg experiment. I would like to know your hardware and how long one training run takes.