lucidrains / imagen-pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
MIT License

text-image dataset #223

Open BIG-PIE-MILK-COW opened 2 years ago

BIG-PIE-MILK-COW commented 2 years ago

Is there a text-image dataset that is not too large (fewer than one hundred thousand pairs)?

kyle-cx91 commented 2 years ago

I'd like to ask the same question.

I was going to use LAION-400M for training experiments, but it was too large; I only have an NVIDIA GeForce RTX 3060. Then I found Flickr30k. The images in Flickr30k have arbitrary sizes, but imagen-pytorch seems to require square images that all share the same size.

@lucidrains looking forward to your reply

kyle-cx91 commented 2 years ago

@lucidrains Can I simply use this to resize the image? Wouldn't that be too crude?

                               transform=torchvision.transforms.Compose([
                                   # Resize takes one size argument, so (height, width) must be a tuple
                                   torchvision.transforms.Resize((64, 64))
                               ]))


@BIG-PIE-MILK-COW maybe you can take a look at Flickr30k. So little data probably won't give good results, but starting with it may not be a bad idea. It looks like we've all just started training this model, and I'd love to share training experience with you.

BIG-PIE-MILK-COW commented 2 years ago

I tried training with laion-art, which is a subset of LAION-5B, but didn't get good results.

TheFusion21 commented 2 years ago

How many steps did you train for? What do your Unets look like?

BIG-PIE-MILK-COW commented 2 years ago

I trained for 200,000 steps and used only one Unet. Here is my Unet:

    unet = Unet(
        dim=32,
        cond_dim=512,
        dim_mults=(1, 2, 4, 8),
        num_resnet_blocks=3,
        layer_attns=(False, True, True, True),
        layer_cross_attns=(False, True, True, True)
    )

lucidrains commented 2 years ago

you can't expect miracles being frugal on data

zhangnan-hust commented 1 year ago

How should the dataset be loaded?