Closed Patrickliuu closed 6 months ago
@Patrickliuu I've seen your generated images posted on teams and those looked very promising! Can you write down what the architecture was that you used there, so I can mention that for the pitch?
@GeroVanMi The model is taken from the example you shared: https://huggingface.co/docs/diffusers/main/en/tutorials/basic_training#train-a-diffusion-model
UNet architecture:

```python
from diffusers import UNet2DModel

model = UNet2DModel(
    sample_size=config.image_size,  # the target image resolution
    in_channels=3,  # the number of input channels, 3 for RGB images
    out_channels=3,  # the number of output channels
    layers_per_block=2,  # how many ResNet layers to use per UNet block
    block_out_channels=(128, 128, 256, 256, 512, 512),  # the number of output channels for each UNet block
    down_block_types=(
        "DownBlock2D",  # a regular ResNet downsampling block
        "DownBlock2D",
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",  # a ResNet downsampling block with spatial self-attention
        "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D",  # a regular ResNet upsampling block
        "AttnUpBlock2D",  # a ResNet upsampling block with spatial self-attention
        "UpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
)
```
- [ ] Look into actual Hugging Face models, i.e. Stable Diffusion: https://huggingface.co/docs/diffusers/main/en/tutorials/basic_training#train-a-diffusion-model or https://www.kaggle.com/code/digvijayyadav/pokemon-generator-gans
It's definitely going to be a GAN!