CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License

A question regarding the embed_dim parameter in the config file. #253

Open bazingayu opened 4 months ago

bazingayu commented 4 months ago

Hi,

Thank you for your excellent work.

I have a question about the embed_dim parameter. Every configuration in this repository sets embed_dim to 256, which appears to be the dimension of the latent vectors. However, in the latent diffusion repo, the first-stage model configurations always set embed_dim to 3.
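To make the comparison concrete, here is a rough sketch of how I understand the two settings. It assumes embed_dim is the channel count of the first-stage latent, so a square image of side S with spatial downsampling factor f yields a latent of shape (embed_dim, S/f, S/f). The helper `latent_shape`, the 256-pixel input, and the factors 16 and 4 are my own illustrative assumptions, not values read from either repo's code.

```python
def latent_shape(embed_dim, image_size, downsample_factor):
    """Return (channels, height, width) of the first-stage latent.

    Hypothetical helper for illustration only: it treats embed_dim as the
    number of latent channels and assumes a square input image.
    """
    if image_size % downsample_factor != 0:
        raise ValueError("image_size must be divisible by downsample_factor")
    side = image_size // downsample_factor
    return (embed_dim, side, side)


# Assumed example settings; only the embed_dim values come from the question.
taming_like = latent_shape(embed_dim=256, image_size=256, downsample_factor=16)
ldm_like = latent_shape(embed_dim=3, image_size=256, downsample_factor=4)

print(taming_like)  # (256, 16, 16)
print(ldm_like)     # (3, 64, 64)
```

If this reading is right, the two settings trade channel depth against spatial resolution: many channels on a small grid versus few channels on a larger one.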

Could you explain the reason for this? Are there any significant differences between these two repositories?

I look forward to your response.

Best regards,