lucidrains / deep-daze

Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
MIT License
4.37k stars, 327 forks

Fix normalization, VRAM usage, and new story feature #58

NotNANtoN closed 3 years ago

NotNANtoN commented 3 years ago

Please give it a try, especially the create_story feature. If you turn on save_progress, you can make nice movies. I set it up so that it optimizes on three words for one episode, then adds three more words for each following episode, and so on (the oldest words are dropped once the CLIP context length is reached).
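
To make the windowing concrete, here is a rough sketch of the schedule described above. It is not the code from this PR: `story_windows`, `words_per_episode`, and `max_words` are hypothetical names, and the cap is counted in whitespace-separated words as a stand-in for the CLIP context length, which is really measured in tokens.

```python
def story_windows(text, words_per_episode=3, max_words=9):
    """Return one prompt per episode for a growing, sliding window of words.

    Hypothetical helper for illustration only. `max_words` approximates the
    CLIP context limit with a plain word count (an assumption; the real limit
    is in tokens).
    """
    words = text.split()
    windows = []
    # Each episode extends the window end by `words_per_episode` words.
    for end in range(words_per_episode, len(words) + words_per_episode, words_per_episode):
        end = min(end, len(words))
        # Once the window would exceed the cap, drop the oldest words.
        start = max(0, end - max_words)
        windows.append(" ".join(words[start:end]))
    return windows


if __name__ == "__main__":
    for episode, prompt in enumerate(story_windows("a b c d e f g", 3, 6)):
        print(episode, prompt)
```

Each returned string would be the text target for one episode; with a cap of six words, the third episode in the example already drops the first word.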

I'm generating some dream stories at the moment. I could update the README too and put a story in there, along with explanations of the new img feature.

Also, now that the VRAM issue is fixed, I can run a 44-layer net with a batch size of 96 at a resolution of 256 on my 8GB RTX 2060 Super. For a resolution of 512 I have not yet found a good setup.

lucidrains commented 3 years ago

Looks great! Merging! :D