kumatheworld opened this issue 1 year ago
I tried the following tiny model from diffusers, but the optimization step still takes ~1 s per batch of 64 images on my Apple M1 machine, so you might want to define a lightweight architecture of your own.
from diffusers import UNet2DModel

# A small UNet for 64x64 single-channel images: one layer per block and
# two resnet down/up blocks with 32 channels each.
model = UNet2DModel(
    sample_size=64,
    in_channels=1,
    out_channels=1,
    layers_per_block=1,
    block_out_channels=(32, 32),
    down_block_types=("ResnetDownsampleBlock2D", "ResnetDownsampleBlock2D"),
    up_block_types=("ResnetUpsampleBlock2D", "ResnetUpsampleBlock2D"),
)
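In case it helps, here is a rough sketch of what one training (optimization) step could look like with this model and diffusers' DDPMScheduler. The random tensor is only a placeholder for a real batch of grayscale 64x64 images, and the learning rate and timestep count are arbitrary choices, not recommendations.

import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler

scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Placeholder batch of 64 grayscale 64x64 images in [-1, 1]; swap in real data.
clean_images = torch.randn(64, 1, 64, 64)

# Sample noise and random timesteps, then forward-diffuse the batch.
noise = torch.randn_like(clean_images)
timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (clean_images.shape[0],))
noisy_images = scheduler.add_noise(clean_images, noise, timesteps)

# Predict the added noise and take one optimizer step on the MSE loss.
noise_pred = model(noisy_images, timesteps).sample
loss = F.mse_loss(noise_pred, noise)
loss.backward()
optimizer.step()
optimizer.zero_grad()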
Since the current VAEs and GANs don't look satisfactory, you might want to try diffusion models; Hugging Face's diffusers would be a good starting point.
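Once such a model is trained, sampling could look roughly like the sketch below, assuming the model and scheduler from the snippets above; the batch size and number of inference steps here are arbitrary.

from diffusers import DDPMPipeline

# Wrap the trained model and its scheduler into a pipeline and draw a few samples.
pipeline = DDPMPipeline(unet=model, scheduler=scheduler)
images = pipeline(batch_size=4, num_inference_steps=1000).images  # list of PIL images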