dzryk / antarctic-captions

MIT License

Possible to Train non-COCO? #2

Open afiaka87 opened 3 years ago

afiaka87 commented 3 years ago

I see this only supports the COCO dataset. Any plans to support something a little more generic like an ImageTextFolder or WebDataset?

I helped write this implementation (and there's also support for webdataset in that repo):

https://github.com/lucidrains/DALLE-pytorch/blob/main/dalle_pytorch/loader.py

afiaka87 commented 3 years ago

This would work with COCO btw - which stores images and text files in this fashion.
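
For illustration, the paired-file layout described above (an image and a caption text file sharing the same file name stem) could be read with something like the sketch below. It is not the DALLE-pytorch loader itself: the helper names `pair_image_text_files` and `sample_caption`, the extension set, and the one-caption-per-line assumption are all hypothetical, and it uses only the standard library.

```python
import random
from pathlib import Path

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def pair_image_text_files(folder):
    """Map each file stem that has both an image and a .txt file
    to its (image_path, text_path) pair. Hypothetical helper."""
    root = Path(folder)
    texts = {p.stem: p for p in root.rglob("*.txt")}
    images = {
        p.stem: p
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }
    # Keep only stems present on both sides, so unpaired files are skipped.
    shared = sorted(texts.keys() & images.keys())
    return {stem: (images[stem], texts[stem]) for stem in shared}


def sample_caption(text_path, rng=random):
    """Return one non-empty line from a caption file.
    Assumes one caption per line (an assumption, not a documented format)."""
    lines = Path(text_path).read_text(encoding="utf-8").splitlines()
    captions = [line.strip() for line in lines if line.strip()]
    if not captions:
        raise ValueError(f"no captions found in {text_path}")
    return rng.choice(captions)
```

A dataset class would then iterate over the returned pairs, open each image, and call `sample_caption` on the matching text file. Files without a partner are silently dropped here, which may or may not be the desired behaviour for a given dataset.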

dzryk commented 3 years ago

Definitely. I'll look into this, probably next week. Thanks!

afiaka87 commented 3 years ago

> Definitely. I'll look into this, probably next week. Thanks!

No - thank you!

dzryk commented 3 years ago

TextImageDataset is now the default. I have only tested on COCO so far. If you try other datasets, please let me know if you encounter any errors. I'll look into supporting WebDataset later on.

dzryk commented 3 years ago

I've begun working on webdataset support here: https://github.com/dzryk/cliptalk

That project will be able to handle more generic image->text settings and uses GPT models instead of BART.

I'll add webdataset here for completeness once I've confirmed things work over there.