Closed afiaka87 closed 3 years ago
Off topic: It's tough to predict what SIREN + CLIP will "latch onto", but faces are definitely one of those things. It's very good at taking the existing face in an image and converting it to the face in your description, at least as long as the face is popular enough. Here's Worf from Star Trek on a photo of Mac's mom from the show It's Always Sunny in Philadelphia:
Input:
SIREN representation:
After 100 iterations of training on "worf":
Early output (phrase: "worf"):
@lucidrains thanks!
We discussed this elsewhere, but just to be rigorous:
As it stands, I think priming only works with about 16-20 layers. With other layer counts, the loss gets stuck around 0.08. I found it can escape that plateau if I lower the learning rate.
What would really be nice is finding good learning rates for specific layer counts. In the meantime, I just made the learning rate tweakable from the Imagine interface and the CLI. Here's the code:
https://github.com/lucidrains/deep-daze/pull/38
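For anyone who'd rather not tune it by hand, here is a rough sketch of how the "stuck around 0.08, lower the learning rate" step could be automated. This is not what the PR does (the PR only exposes the learning rate as a setting). The function name `lower_lr_on_plateau` and its parameters `window`, `tol`, `factor`, and `min_lr` are all made up for illustration, and the default values are guesses, not tuned numbers.

```python
def lower_lr_on_plateau(losses, lr, window=20, tol=1e-3, factor=0.5, min_lr=1e-7):
    """Return a (possibly reduced) learning rate based on recent loss history.

    Hypothetical helper, not part of deep-daze. It compares the mean loss of
    the last `window` steps with the mean of the `window` steps before that.
    If the improvement is smaller than `tol`, the loss is treated as stuck
    and the learning rate is multiplied by `factor` (never below `min_lr`).
    """
    if len(losses) < 2 * window:
        # Not enough history yet to judge whether training has stalled.
        return lr

    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window

    if previous - recent < tol:
        # Loss has flattened out: shrink the step size.
        return max(lr * factor, min_lr)
    return lr


if __name__ == "__main__":
    # Simulated history where the loss sits at 0.08, as described above.
    stuck_history = [0.08] * 40
    print(lower_lr_on_plateau(stuck_history, lr=1e-5))
```

In a real training loop you would call this once per iteration with the running loss list and write the returned value back into the optimizer. If you're using PyTorch directly, `torch.optim.lr_scheduler.ReduceLROnPlateau` does the same kind of thing and is probably the better choice.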