Open b4zz4 opened 1 year ago
I have a use case I’m trying to think how to use SD for. Fashion. Take a model in one photo and an article of clothing in another. Can we merge them together in a nice way with a prompt and say inpainting guidance?
Seth
On Wed, Dec 28, 2022 at 7:49 PM Bazza @.***> wrote:
Hi. Can you think of a way to infer between two images?
I thought subtracting the latency between two images, but that doesn't seem to work because it doesn't infer the error (noise prediction).
— Reply to this email directly, view it on GitHub https://github.com/bes-dev/stable_diffusion.openvino/issues/112, or unsubscribe https://github.com/notifications/unsubscribe-auth/A3Y67FK2ZBMCFPCVO373L73WPTUY5ANCNFSM6AAAAAATLWK224 . You are receiving this because you are subscribed to this thread.Message ID: @.***>
I think you can use an image VAE as a tokenizer (in addition to the prompt) but I don't really know how to calculate this error. This would at least produce an image the reference style.
maybe the functionality we both want is to be able to do img2img but with more than 1 input image in addition to the prompt and the model. I'm just a hobbyist user I have a technical background but haven't written code in 15 years so can't really create that myself.
On Sat, Dec 31, 2022 at 8:26 AM Bazza @.***> wrote:
I think you can use an image VAE as a tokenizer (in addition to the prompt) but I don't really know how to calculate this error. This would at least produce an image the reference style.
— Reply to this email directly, view it on GitHub https://github.com/bes-dev/stable_diffusion.openvino/issues/112#issuecomment-1368222419, or unsubscribe https://github.com/notifications/unsubscribe-auth/A3Y67FPBBEZ7UIFAXAWMOG3WQA7B5ANCNFSM6AAAAAATLWK224 . You are receiving this because you commented.Message ID: @.***>
That already tries and creates a literal mix between the images. I think if you find the space of latency it would be more appropriate, not so literal but a space where you find what the prompt indicates
Hi. Can you think of a way to infer between two images?
I thought subtracting the latency between two images, but that doesn't seem to work because it doesn't infer the error (noise prediction).