NotNANtoN closed this 3 years ago
I played around with the following image:
I used it as the "img" input to the Imagine model to generate this (does not look amazing, but it works):
I used the create_img_encoding and create_text_encoding functions from Imagine to get the encodings for the hot-dog image and the sentence "Yellow", then took the average of the two. I fed this averaged encoding into Imagine to generate this:
And this with "Pink":
And with something more abstract, "Love is the answer!":
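The averaging step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `blend_encodings` is a hypothetical helper, not part of Imagine, and the short lists of floats stand in for the tensors that create_img_encoding and create_text_encoding would return. The exact signatures of those two functions are not shown in this thread, so they are not called here.

```python
from typing import List, Optional, Sequence


def blend_encodings(
    encodings: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the (optionally weighted) element-wise mean of several encodings.

    Hypothetical helper for illustration; real CLIP encodings would be tensors.
    """
    if not encodings:
        raise ValueError("need at least one encoding")
    dim = len(encodings[0])
    if any(len(e) != dim for e in encodings):
        raise ValueError("all encodings must have the same length")
    if weights is None:
        # Equal weights give the plain average used in the experiment above.
        weights = [1.0] * len(encodings)
    if len(weights) != len(encodings):
        raise ValueError("need exactly one weight per encoding")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [
        sum(w * e[i] for w, e in zip(weights, encodings)) / total
        for i in range(dim)
    ]


# Stand-in values; in practice these would come from the image and the text.
img_encoding = [0.25, 0.75, -0.5, 0.0]
text_encoding = [0.75, 0.25, 0.5, 1.0]

# Plain average of the image and text encodings.
mixed = blend_encodings([img_encoding, text_encoding])

# Lean more heavily on the image than on the text.
mostly_image = blend_encodings([img_encoding, text_encoding], weights=[3.0, 1.0])
```

Passing unequal weights is one way to control how strongly the text pulls the result away from the image. Whether the blended encoding should be re-normalized before use depends on how the model consumes it, which this sketch does not assume.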
Edit: I stand corrected. This is pretty cool!
@NotNANtoN I'm not sure if you're aware of this, but I believe this feature is already implemented. That said, saving and manipulating CLIP embeddings is cool stuff!
Oh, I see you've added quite a few more knobs to turn than the original implementation. Apologies.
@NotNANtoN looks great! thank you for the contribution :)
Added:
Please feel free to tell me what to change, or to adapt it as you like. I just thought other people would appreciate these features too.