thuanz123 / enhancing-transformers

An unofficial implementation of both ViT-VQGAN and RQ-VAE in PyTorch
MIT License
284 stars · 34 forks

stage2 transform #16

Closed · ghost closed this 1 year ago

ghost commented 1 year ago

When you ran this, how many steps did it take before the generated images looked good? So far I have run 20,000 steps and there are still no good images, so I wonder if there is a problem with the generation code?

thuanz123 commented 1 year ago

What is the size of your dataset? If I remember correctly, it takes around 10 epochs of training on ImageNet for the output to be good, so that is about 12M iterations.

thuanz123 commented 1 year ago

Oh sorry, I miscalculated a bit: 10 epochs on ImageNet with a batch size of 128 is about 100K iterations. But the dataset size also matters; it does not work well with a small dataset.
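
For reference, a quick back-of-the-envelope check of that number, assuming the standard ImageNet-1k train split of roughly 1.28M images (not an exact training log):

```python
# Rough sanity check of the iteration count above,
# assuming the ImageNet-1k train split of ~1.28M images.
imagenet_train_images = 1_281_167
batch_size = 128
epochs = 10

steps_per_epoch = imagenet_train_images // batch_size  # ~10,009 steps
total_steps = steps_per_epoch * epochs                  # ~100K iterations
print(steps_per_epoch, total_steps)
```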

ghost commented 1 year ago

The dataset is FFHQ. After 12,000 steps the outputs are still very strange, so I am worried that the model is wrong.

thuanz123 commented 1 year ago

Actually, only the stage 1 code is good. The stage 2 code is not: most of the time it produces bad results and only sometimes good ones, so I recommend using another codebase until I can find what's wrong.

ghost commented 1 year ago

Thank you. So when the dataset is small, it cannot generate good images? Also, for unconditional generation I set the condition as `conds = torch.zeros(images.shape[0], 1, device=device, dtype=torch.long)`, so its shape is (b, 1). Do you think that is correct?
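
A minimal sketch of that setup for context; the sampling call at the end is a hypothetical placeholder, not the actual enhancing-transformers API:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
batch = 8

# All-zeros condition tokens of shape (b, 1): every sample gets the same
# dummy "class" index, which is how the question above handles
# unconditional generation.
conds = torch.zeros(batch, 1, device=device, dtype=torch.long)

# Hypothetical stage 2 sampling call -- the real interface in this repo may
# differ; the point is only that the transformer prior receives a constant
# condition token instead of a real class label.
# samples = stage2_model.sample(conds=conds, temperature=1.0, top_k=100)
```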

ghost commented 1 year ago

Do you have any suggestions? Maybe the official VQGAN or RQ-Transformer code?

ghost commented 1 year ago

Thanks for your reply. Another question: how many GPUs did you use? 8× A100-80GB? The experiments take a long time. Were you able to reproduce the metrics from the papers for stage 1 and stage 2?

thuanz123 commented 1 year ago

  1. Yeah, you can do a simple thing like a zero latent code for unconditional generation
  2. If you want to re-use the provided pretrained weights then you must use the VQ-GAN code; if you want to use the RQ-VAE code then you must train a compatible stage 1 model
  3. Training ViT-VQGAN takes a really long time. In the paper they use something like 128 TPUs, which I certainly do not have access to, so unfortunately I cannot match their numbers

ghost commented 1 year ago

Thanks for your reply, it is very helpful. Yeah, I think the reason ViT-VQGAN takes such a long time is the transformer architecture; it really needs a lot of compute.