DominikDoom / a1111-sd-webui-tagcomplete

Booru style tag autocompletion for AUTOMATIC1111's Stable Diffusion web UI
MIT License
2.55k stars, 306 forks

XL embedding previews. #284

Symbiomatrix closed 5 months ago

Symbiomatrix commented 5 months ago

Hey dom, found this odd little piece of hardcoding in tagautocomplete.js; I think it might be in need of an update.

async function updateSelectionStyle(textArea, newIndex, oldIndex) {
    [...]
    let previewTypes = ["v1 Embedding", "v2 Embedding", "Hypernetwork", "Lora", "Lyco"];
    [...]
    switch (selectedType) {
        case "v1 Embedding":
        case "v2 Embedding":
            shorthandType = "embed";
            break;

DominikDoom commented 5 months ago

Yeah, this old method was pretty crappy. Never noticed since I'm still mainly using SD 1.5 models, lol.
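
For illustration only, here is a minimal sketch of one way the type-to-shorthand mapping could avoid listing each embedding version by hand. It is not the fix that went into the extension. The helper name `shorthandForType`, the assumed "XL Embedding" label, and every shorthand value other than "embed" are hypothetical.

```javascript
// Hypothetical lookup for non-embedding types. Only "embed" appears in the
// snippet quoted in this issue; these values are placeholders.
const SHORTHAND_BY_TYPE = new Map([
    ["Hypernetwork", "hyper"],
    ["Lora", "lora"],
    ["Lyco", "lyco"],
]);

// Maps a result's type label to the shorthand used for preview lookups.
function shorthandForType(selectedType) {
    // Accept any "<version> Embedding" label rather than only v1 / v2, so a
    // new label (assumed here to look like "XL Embedding") needs no code change.
    if (/^\S+ Embedding$/.test(selectedType)) {
        return "embed";
    }
    // null means "no preview is available for this type".
    return SHORTHAND_BY_TYPE.get(selectedType) ?? null;
}
```

With this shape, the list of preview types and the switch no longer need to be kept in sync when a new embedding version shows up.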

Symbiomatrix commented 4 months ago

Cheers.

On a side note, I think XL's made a lot of major strides since its introduction about 8 months ago, you may want to check it out again (provided you've the hardware for it). Embeddings are still few and far between, unlike 1.5's digitising the entirety of existence pretty much by now, but the general checkpoint level of control is much more diverse than before. After a while, I feel like going back is akin to my experience revisiting games from the 00s: the quality difference suddenly pops out in a staggering way, noticing many flaws which imagination or memory must've smoothed over.

DominikDoom commented 4 months ago

Any recommendations for SDXL models? I tried a few from Civitai over the months and was always rather disappointed by how well they follow the prompt compared to 1.x models. Or maybe I just need to adjust my default prompts to work better with SDXL; admittedly, I haven't experimented much on that front.

Symbiomatrix commented 4 months ago

> Any recommendations for SDXL models?

Well, it's been taken in quite a few directions, so that depends on what you're aiming for. Here's a quick rundown (mind you, I'm far from the biggest authority on XL developments).

Animagine 3.1 appears to be the most popular anime-related finetune. Standard booru-style prompting, seems to do a good job; most of the recent animated loras in the XL section will have been trained on / for it.

Pony V6 XL has been the most shocking development in the XL ecosystem: it's a major overhaul / retraining of XL on ~2.6m booru images, extensively tagged and rated. Due to a training error, each gen needs to start with the incantation score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up, but after that it has extremely accurate tag adherence, I believe much more consistent and varied than anything a 1.5 model offered, even leaving aside the ~x2 native resolution. It's such a major development that a "pony" base model category was dedicated to it on Civitai within a week of its publication, and creators have been publishing models for it (artists, concepts, characters mainly) on a daily basis, in an amount comparable to 1.5 in its heyday. They say that by now virtually all (asian) creators have shifted from 1.5 to pony, and that seems close to the truth.

Autismmix is one of the major pony variants; supposedly it's intended to improve the style consistency on pony (but I wouldn't know).

There have been some attempts to capitalise on pony's accuracy for realism (that same disappointment to which you refer still plagues many a model), and they've seen moderate success; the most popular one is zovya's everclear. It's more semirealistic, but it has preserved much of pony's accuracy. Full realism is a WIP so far.

In terms of classical XL models, juggernaut (generalist) and leosam's helloworld (artistic) are the most frequently updated and popular, plus realvis (realism). I tested a few of them, and whilst there has been a noticeable improvement in output quality over the early models, I think they're still much more rigid than 1.5 ("generic" is how I'd describe it) and definitely more difficult to prompt. I did eventually find one I liked personally, jibmix.

For speed / video (frame) generation via AnimateDiff, "turbo" and "lightning" models have been developed. To my understanding, it's some kind of magic lora that reduces the number of steps you need to 1-10 and the cfg to 1-4, so their gen time is comparable to 1.5, with the drawback of being slightly less detailed and stable (hence the guidance of e.g. controlnet is required, which makes it ideal for vids). It's usually merged in as an alternative version of the major models (e.g. juggernaut has lightning, pony has turbo). Lightning is the newer variety; it takes a few extra steps but is much more accurate than turbo, so it's the better tradeoff.
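
As a small aside on the Pony score tags mentioned above, a prompt preset could prepend them automatically. The sketch below only illustrates the idea; `withPonyPrefix` is a hypothetical helper and not part of the web UI or of this extension.

```javascript
// Score tags as quoted in the rundown above.
const PONY_SCORE_PREFIX =
    "score_9, score_8_up, score_7_up, score_6_up, score_5_up, score_4_up";

// Hypothetical helper: prepend the score tags unless the prompt already
// starts with them (checked loosely via the leading "score_9" tag).
function withPonyPrefix(prompt) {
    const trimmed = prompt.trim();
    if (trimmed.startsWith("score_9")) {
        return trimmed;
    }
    return trimmed.length > 0
        ? `${PONY_SCORE_PREFIX}, ${trimmed}`
        : PONY_SCORE_PREFIX;
}
```

For example, `withPonyPrefix("1girl, solo")` yields the score tags followed by `, 1girl, solo`, while a prompt that already begins with `score_9` is returned unchanged.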