Closed NicolasWinckler closed 3 months ago
Thank you for your interest in our work.
The training of the diffusion model is divided into two parts: the VAE is trained first, then the diffusion model. Each part takes about a week on a single V100, or roughly half that time on one A100. If you wish to train these two parts from scratch, you can modify the config
in train_wheat.py
and set resume_path to None
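For illustration, the resume behavior described above might look like the following. This is a minimal sketch only; the actual config structure and key names in train_wheat.py may differ.

```python
# Hypothetical sketch of the resume_path setting; the real config
# layout in train_wheat.py may differ.
config = dict(
    resume_path=None,  # None => train from scratch; a checkpoint path => resume
)

def resume_mode(config):
    """Decide whether training resumes from a checkpoint (assumed logic)."""
    if config["resume_path"] is None:
        return "training from scratch"
    return f"resuming from {config['resume_path']}"

print(resume_mode(config))  # training from scratch
```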
. To train the VAE, you should also run
pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
to install taming-transformers.
Please note that we have not tested the compatibility of training these two models in this repository.
The layout condition is inserted additively (by "+"), while cross-attention is used for the domain-feature condition.
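The two conditioning paths described above can be sketched as follows. This is an illustrative NumPy sketch, not the repository's actual code; all shapes and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes for illustration only (not taken from the repo).
B, T, D = 2, 16, 32                          # batch, tokens, feature dim
h = rng.standard_normal((B, T, D))           # denoiser hidden states
layout = rng.standard_normal((B, T, D))      # layout embedding, same shape as h
domain = rng.standard_normal((B, 8, D))      # domain features (e.g. MAE tokens)

# Layout condition inserted by "+": a plain element-wise addition.
h = h + layout

# Domain-feature condition via cross-attention: softmax(QK^T / sqrt(d)) V,
# with h as queries and the domain features as keys/values.
def cross_attention(q, k, v):
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # attention weights per query
    return w @ v

out = cross_attention(h, domain, domain)
print(out.shape)  # (2, 16, 32)
```

The additive path requires the condition to share the hidden-state shape, whereas cross-attention lets the condition have a different token count (here 8 vs 16).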
Thanks a lot for the clarification
Hi,
thanks for the paper and code. I have a few questions: do you have the code to train the diffusion model with MAE features? How long did it take to train this part, and on what hardware?
Regarding the paper, you explain that conditioning on layout works better with ControlNet than with cross-attention, yet Figure 2 (stage 2) shows an input going into cross-attention. What does that mean?
Thanks for your help