Extraltodeus / multi-subject-render


multi-subject-render

Generate multiple complex subjects all at once!

Made as a script for the AUTOMATIC1111/stable-diffusion-webui repository.

00165-603508287-DDIM-64-7 5-ac07d41f-20221122154627

Miaouuuuuuuuu!

Jump to examples!

💥 Installation 💥

Copy the URL of this repository into the Extensions tab:

image

OR copy this repository into your extensions folder:

image

You might need to restart the whole UI. Maybe twice.

The look

image

OK I know that's a big screenshot

How the hell does this work?

First it creates your background image, then your foreground subjects. It then runs a depth analysis on the subjects, cuts out their backgrounds, pastes them onto your background, and finishes with an img2img pass for a smooth blend!

It will cut around that lady with scissors made of code.

image
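To make the cut-and-paste part of the steps described above more concrete, here is a minimal sketch in plain Python. It is not the script's actual code: the function names, the nested-list "images" and the threshold value are all hypothetical, and it assumes a depth map where larger values mean "closer to the camera". The final img2img blend is not shown.

```python
# Minimal sketch: cut a subject out by depth and paste it onto a background.
# All names are hypothetical. The real script works on webui images and a
# MiDaS depth map; here images are plain nested lists.

def depth_mask(depth, threshold):
    """Return a 0/1 mask keeping pixels whose normalized depth is >= threshold.

    Assumption: larger depth values mean "closer to the camera".
    """
    lo = min(min(row) for row in depth)
    hi = max(max(row) for row in depth)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a flat depth map
    return [[1 if (d - lo) / span >= threshold else 0 for d in row]
            for row in depth]


def paste_subject(background, subject, mask, left, top):
    """Copy the masked subject pixels onto a copy of the background."""
    out = [row[:] for row in background]
    for y, (subject_row, mask_row) in enumerate(zip(subject, mask)):
        for x, (pixel, keep) in enumerate(zip(subject_row, mask_row)):
            by, bx = top + y, left + x
            # Skip far pixels and anything falling outside the background.
            if keep and 0 <= by < len(out) and 0 <= bx < len(out[0]):
                out[by][bx] = pixel
    return out


background = [["b"] * 4 for _ in range(3)]
subject = [["s1", "s2"], ["s3", "s4"]]
depth = [[0.9, 0.1], [0.8, 0.2]]  # left column is near, right column is far

mask = depth_mask(depth, threshold=0.5)
composite = paste_subject(background, subject, mask, left=1, top=1)
```

Only the near pixels (`s1` and `s3`) land on the background. Raising the threshold cuts tighter around the subject, lowering it keeps more of the subject's original surroundings.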

Explanations of the different UI elements

I will only explain the not-so-obvious things because I have spent enough time making that thing already.

For my example I decided to generate a bowling alley at 512x512 pixels:

00158-2629831387-Euler a-22-7 5-ac07d41f-1233221312123132

image

Note: if you do that, you will need as many lines as foreground images generated.

For my example I made three penguins:

image

image

image

00162-2629838387-Euler a-92-7 5-ac07d41f-20221124054727

They are not really bowling because you need fingers for that. They're just here for trouble.

image

The scary miscellaneous options:

image

Tips and tricks:

Known issues

Credits

Thanks to thygate for letting me blatantly copy-paste some of his functions for the depth analysis integration in the webui.

This repository runs with MiDaS.

@article{Ranftl2022,
    author  = {Ren\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
    title   = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer},
    journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
    year    = {2022},
    volume  = {44},
    number  = {3}
}
@article{Ranftl2021,
    author  = {Ren\'{e} Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
    title   = {Vision Transformers for Dense Prediction},
    journal = {ICCV},
    year    = {2021}
}

A few more examples

An attempt at recreating the "Distracted boyfriend" meme, without influencing the directions in which they are looking. 100% txt2img.

00241-2439212203-Euler a-100-7 5-ac07d41f-20221124151538 00287-2439212203-Euler a-100-7 5-ac07d41f-20221124151832 00123-60606195-DDIM-74-7 5-ac07d41f-20221124144302 00133-1894928239-DDIM-74-7 5-ac07d41f-20221124144525

I messed up the order on the last one.

00129-603508287-DDIM-64-7 5-ac07d41f-20221122153921

Aren't they cute without oxygen?

00051-3908280031-DPM++ 2M-74-7 5-ac07d41f-20221122145842

Of course you can make a harem just for yourself.

00165-603508287-DDIM-64-7 5-ac07d41f-20221122154627

MOAR KITTENS

Now a few more groups of "superheroes" from the same batch as the first image here, except maybe for the portraits.

00290-1347027509-DDIM-69-7 5-579c005f-20221123193425

Wrong settings examples

00145-2998285171-DDIM-92-7 5-ac07d41f-20221124054225

This is what a too-low denoising strength on the final blend looks like. Yuk!

00254-1268283421-Euler a-68-7 5-ac07d41f-20221124060832

Same issue here. Looks like a funny kid collage. Grandma will love it because you typed your prompts with love and she knows it.