CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License

How to train the transformer after training the vqgan model? #232

Closed by RichardXue123 8 months ago

RichardXue123 commented 9 months ago

I followed the "Training on custom data" section and got a VQGAN model. How do I train the transformer using my own VQGAN model?

RichardXue123 commented 9 months ago

The transformer and GAN are trained at the same time.

Then how can I use the transformer and GAN to generate an image?

gdjmck commented 9 months ago

You could check out the Colab link in the README.

Ontheroad123 commented 9 months ago

The 'custom_vqgan.yaml' config loads the model 'taming.models.vqgan.VQModel', not 'Net2NetTransformer', so why would the transformer be trained at the same time?

RichardXue123 commented 8 months ago

So if I train on my custom data using this project's source code, I can only train the VQGAN network, without the transformer, right? And then I can only do image reconstruction, not image generation. Is that true?

Ontheroad123 commented 8 months ago

VQGAN has two stages: the first stage is the encoder and decoder, and the second stage is the transformer plus the decoder for generation. So if you only do image reconstruction, I think it's the same as VQ-VAE.
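To make the reconstruction vs. generation difference concrete, here is a toy sketch in plain Python. It is not code from this repository: the 4-entry `CODEBOOK`, `quantize`, `decode`, and `sample_indices` are all hypothetical stand-ins, and the random prior only marks the place where the second-stage transformer would sit.

```python
import random

# Toy codebook: each index maps to a 2-D "latent" vector (hypothetical values).
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def quantize(vec):
    """Return the index of the nearest codebook entry (squared L2 distance)."""
    return min(
        range(len(CODEBOOK)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, CODEBOOK[i])),
    )


def decode(indices):
    """Map code indices back to codebook vectors (stand-in for the decoder)."""
    return [CODEBOOK[i] for i in indices]


def sample_indices(length, rng):
    """Stand-in for the transformer: draw each index from a uniform toy prior.

    A real second stage would predict each index from the preceding ones.
    """
    return [rng.randrange(len(CODEBOOK)) for _ in range(length)]


# Reconstruction (stage one only): the indices come from an existing input.
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]
recon_indices = [quantize(v) for v in latents]
reconstruction = decode(recon_indices)

# Generation (stage two): there is no input, the indices come from the prior.
rng = random.Random(0)
generated = decode(sample_indices(3, rng))
```

Both paths end in the same `decode` call. The only difference is where the indices come from, which is why a first-stage model alone can reconstruct but cannot generate.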

RichardXue123 commented 8 months ago

I did it by writing a transformer YAML file based on my custom VQGAN model.
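For anyone wondering what such a file might roughly contain, here is a hedged sketch that builds the config as a Python dict and prints it as YAML-style text. It is not the file attached later in this thread. The target paths and key names follow the transformer configs shipped in the repository as best I recall them, so treat them as assumptions and compare against the repo's own configs. The checkpoint path, `N_EMBED`, and the GPT sizes are placeholders, and `to_yaml` is a hypothetical helper written only for this example.

```python
def to_yaml(node, indent=0):
    """Render a nested dict of scalars as simple block-style YAML lines."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Assumed codebook size of the custom VQGAN; must match its n_embed.
N_EMBED = 1024

config = {
    "model": {
        "base_learning_rate": 4.5e-06,
        # Assumption: second-stage model class used by the repo's configs.
        "target": "taming.models.cond_transformer.Net2NetTransformer",
        "params": {
            "transformer_config": {
                "target": "taming.modules.transformer.mingpt.GPT",
                "params": {
                    "vocab_size": N_EMBED,  # one token per codebook entry
                    "block_size": 256,      # placeholder: tokens per image
                    "n_layer": 12,          # placeholder sizes
                    "n_head": 8,
                    "n_embd": 512,
                },
            },
            "first_stage_config": {
                "target": "taming.models.vqgan.VQModel",
                "params": {
                    # Placeholder path to the trained VQGAN checkpoint.
                    "ckpt_path": "path/to/custom_vqgan/last.ckpt",
                    "n_embed": N_EMBED,
                    # Remaining params (embed_dim, ddconfig, lossconfig)
                    # would be copied from the VQGAN config used for training.
                },
            },
            # Assumption: marker string for unconditional training.
            "cond_stage_config": "__is_unconditional__",
        },
    },
}

yaml_lines = to_yaml(config)
print("\n".join(yaml_lines))
```

The main thing to keep consistent is that `vocab_size` of the transformer equals `n_embed` of the VQGAN, and that the first-stage parameters match the ones the checkpoint was trained with. A `data` section like the one in 'custom_vqgan.yaml' would also be needed.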

zxb-0 commented 7 months ago

@RichardXue123 Could I see your customized YAML file?

RichardXue123 commented 7 months ago

custom_transformer.txt

zxb-0 commented 7 months ago

@RichardXue123 Thank you very much.