deepglugs / dalle


Not an issue - question on training set #1

Open · johndpope opened this issue 3 years ago

johndpope commented 3 years ago

did you see this? 78k image-caption pairs (3.8 GB zipped) https://github.com/lucidrains/DALLE-pytorch/issues/7

deepglugs commented 3 years ago

I did not see that; thanks for pointing it out. Besides my own dataset, I've been training on gwern's anime figures with tags (as you know): https://www.gwern.net/Crops. I'm only training on 27k images, and results haven't been great. DALL-E is really slow to train, and I've only been training when I have spare cycles. The loss keeps going down, though, so we'll see.

[sample generations captioned blonde_hair and black_hair]
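Those tags come from the Danbooru metadata; in the figures dataset each image is paired with a text file of tags that serves as its caption. A minimal sketch of reading that layout (the side-by-side .jpg/.txt naming is an assumption here, not the project's actual loader):

    from pathlib import Path
    from PIL import Image

    def iter_image_tag_pairs(root):
        """Yield (PIL image, tag list) pairs from a directory where each
        image foo.jpg has a sibling foo.txt of comma-separated tags.
        The exact file layout is an assumption about the dataset."""
        for img_path in sorted(Path(root).glob("*.jpg")):
            tag_path = img_path.with_suffix(".txt")
            if not tag_path.exists():
                continue  # skip images with no tag file
            tags = [t.strip() for t in tag_path.read_text().split(",") if t.strip()]
            yield Image.open(img_path).convert("RGB"), tags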

johndpope commented 3 years ago

I have some spare GPU cycles. If you can spoon-feed me the work, I'll run it on my box.

deepglugs commented 3 years ago

I've uploaded the models I'm using here: https://mega.nz/folder/XF1CzR4T#E4Mc9IXMZRNLqW4yGkcToA

To train the danbooru2019 model, I use the following:

    python3 dalle.py --source ~/nas2/ai/datasets/danbooru2019-figures/classifier/ --vocab curated_512.vocab \
        --vae vae_dan2019_cb2048_256px.pt --dalle dalle_an2019_cb2048_256px_2.pt \
        --train_dalle --batch_size=1 --vgg_loss --samples_out samples/dan2019_figures/dalle \
        --vae_layers=3 --size 256 --codebook_dims=2048 --epochs=1 --depth 20

The dalle model is huge; it's uploading now. After that's done, I'll upload the dataset. It's a fairly small subset of the larger figures dataset, so you may want to run rsync and pull down a larger set. I think I have a script that makes tag/label files from the image set, but I'll have to look around.
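For reference, a hypothetical reconstruction of that kind of script, assuming the Danbooru metadata dump (one JSON object per line, each with an "id" and a list of tag dicts) and images named by post id; none of these names come from the repo:

    import json
    from pathlib import Path

    def write_tag_files(metadata_path, image_dir):
        # Assumed layout: images named <post id>.jpg; metadata is JSON
        # lines, one post per line, each with an "id" and a "tags" list.
        image_dir = Path(image_dir)
        post_ids = {p.stem for p in image_dir.glob("*.jpg")}
        with open(metadata_path, encoding="utf-8") as meta:
            for line in meta:
                post = json.loads(line)
                if str(post["id"]) not in post_ids:
                    continue  # no image for this post in the subset
                tags = ", ".join(t["name"] for t in post.get("tags", []))
                (image_dir / f"{post['id']}.txt").write_text(tags)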

Also note that training a dalle model this large requires at least 24 GB of GPU memory. I haven't been able to train anything over depth 20 because of the memory limit. Apparently depth 20 is still rather small and may not be able to converge.
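As a quick sanity check before launching a long run, you can confirm the card actually has that much VRAM (a standard PyTorch call, not anything from this repo):

    import torch

    # Report total memory on the default CUDA device; a depth-20 run at
    # batch size 1 reportedly wants ~24 GB.
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 2**30:.1f} GiB")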

deepglugs commented 3 years ago

My dataset is here: https://mega.nz/file/WMkiCYIR#Kuw4Etoj_VD1kPrizJo4IpMM5nq5ll-5nzkZMqGghhk