Open senmaoy opened 1 year ago
Paper: DF-GAN, Dataset: CUB, Results: IS=4.5, FID=23.34, Contributor: Senmao Ye, GPU: the result was obtained with a 2080 Ti and nf=32 on CUB, Training Time: 48 hours, Comments: A simple and efficient baseline for text-to-image. The code is really friendly for beginners.
Paper: AttnGAN, Dataset: CUB, Results: IS=4.02, Contributor: Senmao Ye, GPU: the result was obtained with a Titan Pascal (12GB), Training Time: more than 2 days, Comments: I tried removing the attention model, which leads to similar IS results (I'm not sure about it). What's more, I'm curious about the attention maps. I hope someone can share their experience with them.
Paper: Generative text-to-image synthesis, Dataset: CUB, Results: IS=2.81, Contributor: Senmao Ye, GPU: the result was obtained with a Titan Pascal (12GB), Batch_size = 64, Training Time: more than 3 days, Comments: The code is very slow and training sometimes tends to collapse. I found that the text encoder is really big, with a CNN and an RNN.
@senmaoy, did you train AttnGAN yourself, or did you use a pretrained model?
Hey, I trained AttnGAN on my own.
@senmaoy, do you have the code?
Do you want it?
Did you successfully train the DAMSM and AttnGAN?
I just use the StackGAN backbone from AttnGAN.
@senmaoy, why didn't you train the original AttnGAN? Why did you remove the attention model? The attention model is an important feature of AttnGAN.
Hey, I was just wondering whether the attention blocks actually work.
Does it work?
I'm not sure, but the results are similar to the ones in the paper.
Hey @senmaoy, if the text in the text-to-image problem determines how the output image is generated, then what is the latent random variable z in the GAN used for?
Hey, it is used for stochastic sampling.
You mean that the same text generates a different result each time?
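To make the role of z concrete, here is a minimal toy sketch in pure Python. It is not the DF-GAN or AttnGAN code: the generator is a single fixed random linear layer with tanh, the conditioning is done by simple concatenation (one common choice, assumed here), and all names and sizes (`NOISE_DIM`, `TEXT_DIM`, `OUT_DIM`, `make_toy_generator`, `sample_z`) are hypothetical. It only illustrates that, with the text embedding held fixed, a different z gives a different output, while reusing the same z reproduces the same output.

```python
import math
import random

NOISE_DIM = 4  # size of z (hypothetical)
TEXT_DIM = 3   # size of the sentence embedding (hypothetical)
OUT_DIM = 5    # size of the toy "image" vector (hypothetical)


def make_toy_generator(seed=0):
    """Return a fixed random linear layer + tanh, standing in for a trained generator."""
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM + TEXT_DIM)]
        for _ in range(OUT_DIM)
    ]

    def generate(z, text_emb):
        # Condition on the text by concatenating z with the text embedding
        # (an assumption for this sketch, not the method of any specific paper).
        x = list(z) + list(text_emb)
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return generate


def sample_z(rng):
    """Draw one latent vector z from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]


generate = make_toy_generator()
text_emb = [0.5, -1.0, 0.25]  # stand-in for an encoded caption

noise_rng = random.Random(42)
z1 = sample_z(noise_rng)
z2 = sample_z(noise_rng)

img_a = generate(z1, text_emb)        # caption + first noise sample
img_b = generate(z2, text_emb)        # same caption, different noise
img_a_again = generate(z1, text_emb)  # same caption, same noise

print("different z, same text -> outputs differ:", img_a != img_b)
print("same z, same text -> outputs match:", img_a == img_a_again)
```

In a real model the text decides what is drawn (for example, the bird's species and colors), while z accounts for everything the caption leaves open, such as pose, background, or viewpoint.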
If you have any questions about my experience, just reply to this issue. I will upload more information.