Open mikestreed opened 11 months ago
Oh also, ability to paste generation data copied from CivitAI giving the exact settings ready to render.
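To illustrate what pasting generation data could involve, here is a minimal sketch of a parser. It assumes the copied text follows the common Automatic1111-style layout (prompt lines, an optional "Negative prompt:" line, then a final "Steps: ..." line of comma-separated settings). The function name `parse_generation_data` and the sample text are hypothetical, and this is not how DiffusionBee or CivitAI actually handles it.

```python
import re


def parse_generation_data(text):
    """Split A1111-style generation text into prompt, negative prompt, settings."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    settings = {}
    # Assumption: the last line starts with "Steps:" and holds "Key: value"
    # pairs separated by ", ". Values that contain commas are not handled.
    if lines and re.match(r"^Steps:\s", lines[-1]):
        for pair in lines.pop().split(", "):
            key, _, value = pair.partition(": ")
            settings[key.strip()] = value.strip()

    prompt_parts, negative_parts = [], []
    in_negative = False
    for ln in lines:
        if ln.startswith("Negative prompt:"):
            in_negative = True
            ln = ln[len("Negative prompt:"):].strip()
        if ln:
            (negative_parts if in_negative else prompt_parts).append(ln)

    return {
        "prompt": " ".join(prompt_parts),
        "negative_prompt": " ".join(negative_parts),
        "settings": settings,
    }


# Hypothetical sample of copied generation data.
EXAMPLE = """a portrait photo of an astronaut, detailed
Negative prompt: blurry, low quality
Steps: 25, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1234, Size: 512x768"""

if __name__ == "__main__":
    print(parse_generation_data(EXAMPLE))
```

The resulting dictionary could then pre-fill the prompt boxes and the sampler, steps, CFG, seed and resolution fields.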
Thanks for the suggestions :)
Agree very strongly with the OP on pretty much all of these requests. Adding my voice to the "Missing many popular samplers. DPM++ 2M Karras is by far the most popular sampler overall so should probably add at least that one." crowd. Draw Things is another free Stable Diffusion Mac app that runs locally and has supported these features for a while now, but it is not as user-friendly as Diffusion Bee.
Adding my voice to the "Missing many popular samplers. DPM++ 2M Karras..."
FYI "Karras" is now showing as an option in the newest Beta version from earlier this week (which also now works with SDXL base). It's unclear which Karras variant has been implemented, but I would assume it's DPM++ 2M Karras since that's what I asked for. TBF I don't think the sampler makes THAT much difference, after testing.
I noticed that "Karras" is there as well, and I've been trying it out. I strongly disagree that it doesn't make much difference. I did a comparison between DDIM, lmsd, k_euler, and "Karras" using the same prompt and the same model, and there is a huge difference between some of them.
99% agree. My addition:
Sort the model menu and the model library by name, date added, or something like that.
I updated this list a bit, now that some things are fixed, and I've added a few new points too. I also reordered the remaining items by what IMO will be the biggest priority for (advanced) users. (I'm sure non-advanced users are very happy and don't need any of these tweaks.) Thanks again!
Super glad to see this app is still being updated. Was kinda worried it had been abandoned since July - it's the best-working / least buggy + fastest SD UI on my current system!
No point in me making a separate 'issue' for each of these, since they're not really issues but just ideas for how the app can be improved in future. Just bringing them all to your attention for possible eventual inclusion when you get a chance. I've been meaning to make this list since I began using the app around 9 months ago.
WISHLIST - most important to least important
Add the new IP Adapter, including faceID features. This is one of the most powerful and soon-to-be critical tools in all of stable diffusion, very important for character consistency etc.
Add ControlNet Reference Only and Canny edge detector, and/or allow users to add new ControlNet models if desired.
Queue management. Ability to see the queue and amend jobs that haven't started yet (maybe just as a text file?). Also, most importantly, the ability to cancel the current job in the queue without cancelling the whole queue.
'Advanced Options' should hide the 'styles' menu entirely and should expand+combine the 'diffusion, seed, misc' menus, since most times an advanced user will need all of them. Optimise the layout a little to require less scrolling, wasting less screen real estate in the advanced mode setup panel. Also some sliders could have easier to use ranges. eg CFG should be rounded to 1 decimal place and could be log scale (so halfway is about 3.0). Also prompt and negative prompt boxes should grow if the prompt is longer than the box (which they usually are for an advanced user).
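The log-scale CFG slider idea can be sketched as below. The range of 0.5 to 18 is an assumption chosen so that the slider midpoint lands on 3.0 (the geometric mean of the two bounds); the function names are hypothetical and not part of DiffusionBee.

```python
import math

# Assumed slider bounds: sqrt(0.5 * 18) = 3.0, so the midpoint maps to CFG 3.0.
CFG_MIN = 0.5
CFG_MAX = 18.0


def slider_to_cfg(position, lo=CFG_MIN, hi=CFG_MAX):
    """Map a slider position in [0, 1] to a CFG value on a log scale,
    rounded to 1 decimal place."""
    position = min(max(position, 0.0), 1.0)  # clamp out-of-range input
    return round(lo * (hi / lo) ** position, 1)


def cfg_to_slider(cfg, lo=CFG_MIN, hi=CFG_MAX):
    """Inverse mapping: where the slider handle sits for a given CFG value."""
    cfg = min(max(cfg, lo), hi)
    return math.log(cfg / lo) / math.log(hi / lo)


if __name__ == "__main__":
    for pos in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(pos, slider_to_cfg(pos))
```

With this mapping the low CFG values (where small changes matter most, eg for LCM models) get half of the slider travel.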
Bring back batch size. I understand that with a more powerful MacBook, the generations don't necessarily get faster, BUT at least if you can run multiple in parallel then it's effectively the same result.
Allow users to make generation presets that can recall easily, ie with LCM (fast) models I wanna always use Euler-A, 6 steps, CFG 1.5 but with most other models I wanna always use DDIM, 25 steps, CFG 8 (as a start point).
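A rough sketch of what such presets might look like as data, using the two setups mentioned above. The `Preset` class, the preset names, and the rule that picks a preset from the model name are all hypothetical, just to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    sampler: str
    steps: int
    cfg: float


# The two starting points described in the request above.
PRESETS = {
    "lcm_fast": Preset(sampler="Euler-A", steps=6, cfg=1.5),
    "default": Preset(sampler="DDIM", steps=25, cfg=8.0),
}


def preset_for_model(model_name):
    """Hypothetical rule: models with 'lcm' in the name get the fast preset."""
    key = "lcm_fast" if "lcm" in model_name.lower() else "default"
    return PRESETS[key]


if __name__ == "__main__":
    print(preset_for_model("dreamshaper_LCM"))
    print(preset_for_model("some_other_model"))
```

In practice the user would probably pick the preset from a menu rather than have it guessed from the model name, but the stored data would look much the same.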
A little button next to 'resolution' which swaps the horizontal and vertical dimensions. I usually only use 768x512 or 512x768 and switch between these all the time, which is tedious to do since it takes a bunch of clicks currently.
Incorporate the newest video modes and combine them all on one "video" tab (adding the existing Deforum and interpolator there too): SVD (Stable Video Diffusion), AnimateDiff, Animate Anyone, and also Stable-Fast (which I believe speeds up generations somehow).
Add generation info to image file metadata, so it's findable later even after cleaning the history. ComfyUI lets you drag+drop an image to recall the exact generation settings... this would be VERY useful.
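For PNG output, one way to do this is a text chunk inside the file itself. The sketch below uses only the Python standard library to insert a `tEXt` chunk after the `IHDR` chunk and read it back. Storing the settings as JSON under the keyword "parameters" is an assumption for illustration, and the function names are hypothetical; it is not necessarily the layout that ComfyUI or any other UI uses.

```python
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: 4-byte length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png_bytes):
    """Yield (type, data, end_offset) for every chunk in a PNG byte string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + data + CRC
        yield chunk_type, data, pos


def embed_settings(png_bytes, settings, keyword="parameters"):
    """Return a copy of the PNG with a tEXt chunk inserted right after IHDR."""
    # json.dumps escapes non-ASCII by default, so Latin-1 encoding is safe.
    text = keyword.encode("latin-1") + b"\x00" + json.dumps(settings).encode("latin-1")
    for chunk_type, _, end in iter_chunks(png_bytes):
        if chunk_type == b"IHDR":
            return png_bytes[:end] + make_chunk(b"tEXt", text) + png_bytes[end:]
    raise ValueError("IHDR chunk not found")


def read_settings(png_bytes, keyword="parameters"):
    """Return the embedded settings dict, or None if no matching chunk exists."""
    for chunk_type, data, _ in iter_chunks(png_bytes):
        if chunk_type == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key.decode("latin-1") == keyword:
                return json.loads(value.decode("latin-1"))
    return None
```

Dragging an image back into the app would then just call something like `read_settings` on the file contents and fill in the generation form.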
history: add to 'actions' menu an option to send prompt/setup to text2img (don't auto queue the prompt, it'll be a start point to edit before queuing/running it)
history: add a "show in finder" option on the actions menu (IMO "save image" is kinda useless since the image is already saved on the hard drive, people just need a way to get to it easily).
Also add some of the newer SDXL Turbo (one step) and Lightning checkpoints. Personally I still can't use regular XL coz my computer is way too slow when doing 15-20+ steps on 1024x1024. Also consider adding the Segmind SSD-1B SDXL checkpoint, which apparently has been made smaller and optimised to run faster than regular SDXL.
Add hi-res fix (like in Auto1111) and face fix options, like face detailer for ComfyUI, which finds faces then inpaints them in an upscaled way for better quality, since small faces in the frame are usually messed-up looking. (Contained as one of the modules here: https://github.com/ltdrdata/ComfyUI-Impact-Pack)
Option to increment the seed by 1 instead of by 1234 (your pseudo-random method)
Previews of the generations as they happen, eg every 5 steps (as in Automatic1111). Then allow user to 'skip' (ie abandon) that image and move to the next (eg if the composition is not good, there's no reason to finish the remaining steps before moving to the next image)
Add a request to "relocate" a moved file instead of just "model not found" error, thereby letting users move huge files off their main hard drive without having to manually edit the JSON.
VAE support... allow adding a VAE to models that don't have one baked in.
Allow custom inpainting models (from CivitAI etc)
Directly use safetensors instead of converting to TDICT? All other programs seem to be able to use the standard safetensors file without converting, and many people use more than one UI, which means we're using double the disk space storing the same checkpoint/model twice in 2 formats... and at 2 or 4 GB each with 20+ models, this is a ton of wasted space if it's possible to work straight from the safetensor file. I'm presuming there's a very good reason why you haven't done this already though, so assuming it's not possible. All other UIs seem to run much slower than DiffusionBee on my 8GB MacBook Air, so maybe that's the reason.
Embeddings / Textual inversions support would also be great (basically like simpler much smaller loras)
More/better upscale models, or ability to add them from elsewhere, and option to have it automatically apply to every render. TBH after trying these the diff is minimal.
TRAINING TAB ... maybe you can check this out before implementing the training tab, to allow training on less than 8GB of RAM: https://www.reddit.com/r/StableDiffusion/comments/17bksq5/simple_loradreambooth_trainer_trains_sd15_loras/
Thanks for all the hard work on this awesome app! Lemme know if you have any questions or if any of these aren't clear.
Updated the list, moving the fixed ones down here:
DONE-- Full lora support - not only lora-merging tools but the ability to use them normally from any prompt, as intended, inc ability to use any number of them together (eg a user might want to use LCM lora + Adetailer + an art-style lora + a pose lora + a character lora + an outfit lora). This is a key feature of the most popular UIs like AUTO1111 and ComfyUI, and DiffusionBee is still quite limited without it. Typically they'd just be placed in a finder folder and called when needed.
DONE AFAIK (I read that this is done, haven't needed to try it since Loras were implemented): Lora tools fix - some loras fail to merge, giving: "Error: down_blocks 0 downsamplers 0" (eg the LCM lora for SD1.5, which makes generation much faster) OR "Error: module 'numpy' has no attribute 'bfloat16' " (happens with lots of other loras). [BTW my OS is Monterey, lemme know if that's the issue, though I'd rather not update OS coz it will likely break some other apps I need].
DONE: history: don't go back to top of the page when clicking away to a diff tab and coming back
DONE (at least added Karras): Missing many popular samplers. DPM++ 2M Karras is by far the most popular sampler overall so should probably add at least that one.
DONE I THINK: Ability to import TDICT files (in case you wanna remove models and re-add them later without reconverting... or eg when updating DiffusionBee versions). I manually add them / change names / reorder them in the JSON file but that's quite tedious.