lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Fmap size 16 works, whereas Fmap size of 8 doesn't work #402

Open · snoop2head opened this issue 2 years ago

snoop2head commented 2 years ago

https://github.com/lucidrains/DALLE-pytorch/blob/4a7958dc0c3313dee91dda5603741139ee5483e3/dalle_pytorch/vae.py#L178

lucidrains commented 2 years ago

why wouldn't it work? what's the full error you are seeing?

snoop2head commented 2 years ago

Overview

Personal Speculation

It seems like num_layers and the fmap size being defined in two different classes (VQGanVAE and DALLE) caused the problem.

In vae.py, the fmap size and num_layers are defined as follows:

https://github.com/lucidrains/DALLE-pytorch/blob/4a7958dc0c3313dee91dda5603741139ee5483e3/dalle_pytorch/vae.py#L177-L178
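For reference, a minimal sketch of how I understand the VQGanVAE side (fmap_size and f below are my own illustrative names and values, not necessarily the exact config fields used at the linked lines; only image_size and num_layers correspond to attributes named there):

```python
from math import log2

# VQGanVAE side (vae.py): num_layers is inferred from the downsampling
# factor f of the loaded checkpoint, and image_size is fixed at 256.
image_size = 256               # input resolution assumed by the wrapper
fmap_size = 8                  # final feature-map size of my custom VQGAN (illustrative)
f = image_size / fmap_size     # downsampling factor: 256 / 8 = 32
num_layers = int(log2(f))      # -> 5 for fmap 8, 4 for fmap 16
```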

Whereas in dalle_pytorch.py, image_fmap_size is computed as follows:

https://github.com/lucidrains/DALLE-pytorch/blob/1ad3ab89898a58c4de83c1e82c830f302bbee07c/dalle_pytorch/dalle_pytorch.py#L337
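And a sketch of the DALLE side, which re-derives the feature-map size from the VAE's image_size and num_layers (the vae_* values below are just plugged in for illustration):

```python
# DALLE side (dalle_pytorch.py): the feature-map size is recomputed from
# the VAE's image_size and num_layers rather than read off the checkpoint.
vae_image_size = 256       # value exposed by the VAE wrapper
vae_num_layers = 5         # as derived above for an 8x8 feature map
image_fmap_size = vae_image_size // (2 ** vae_num_layers)   # 256 // 32 = 8
image_seq_len = image_fmap_size ** 2                        # 64 image tokens
```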

Code-wise, the two values should coincide with one another, but the error log shows that they don't.
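Quick arithmetic check of the speculation above (illustrative only, not a confirmed root cause):

```python
# If both classes agree on num_layers, the sizes line up:
#   num_layers = 4 -> 256 // 2**4 = 16 (fmap 16, 16 * 16 = 256 image tokens)
#   num_layers = 5 -> 256 // 2**5 = 8  (fmap  8,  8 *  8 =  64 image tokens)
# But if the VAE ends up with num_layers = 4 while the checkpoint actually
# emits 8x8 feature maps, DALLE expects 256 image tokens and the VAE
# produces 64, which would show up as a shape / sequence-length error.
expected_tokens = (256 // 2 ** 4) ** 2   # 256, what DALLE builds its sequence for
actual_tokens = 8 ** 2                   # 64, what the 8x8 VQGAN actually emits
assert expected_tokens != actual_tokens
```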

I will attach the log and code here for reference.