invoke-ai / InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0
23.22k stars 2.4k forks

Suggestion for Web UI Height/Width #113

Closed arothmanmusic closed 2 years ago

arothmanmusic commented 2 years ago

It would be a nice tweak if the /stable-diffusion/static/dream_web/index.html had dropdowns rather than inputs for the Height / Width, with options 64, 128, 192, 256, 320 etc. (maybe up to 1280?) so there's no chance of accidentally using an invalid ratio and getting an error.

bakkot commented 2 years ago

I think checking the values client-side would be sufficient - that way you don't have to wait for the submission to happen. I am not a fan of dropdowns myself.

arothmanmusic commented 2 years ago

Oh, yeah. It could totally be done with a JS validation as well.

I know my way around a computer just fine, but math isn’t my strong suit… I have to keep pulling out the calculator to remind myself which numbers are going to work. :)

lstein commented 2 years ago

@tesseractCat, is this something you'd be willing to implement? The drop-down menus sound like a good UI element. Even better would be some way of detecting the user's VRAM and forbidding combinations of HxW that would likely exhaust memory.
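As a rough sketch of that idea: filter the candidate sizes against a free-VRAM figure. The cost model below (a flat MiB-per-megapixel constant) is a made-up placeholder for illustration, not a measured Stable Diffusion figure; real usage depends on model, precision, and sampler.

```python
def allowed_sizes(free_mib, options=range(64, 1281, 64), mib_per_megapixel=4000):
    """Return (width, height) pairs whose estimated VRAM cost fits in free_mib.

    mib_per_megapixel is a hypothetical calibration constant, not a
    measured value for Stable Diffusion.
    """
    pairs = []
    for w in options:
        for h in options:
            estimated_mib = (w * h / 1_000_000) * mib_per_megapixel
            if estimated_mib <= free_mib:
                pairs.append((w, h))
    return pairs
```

The dropdowns could then be populated only with pairs that pass this check for the detected card.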

arothmanmusic commented 2 years ago

@lstein I am a web developer myself, so I would be happy to put together a prototype, although I am brand new to GitHub and don’t know what the proper way to submit the change would be.

Thank you so much for all of your work on this project!

lstein commented 2 years ago

@arothmanmusic That would be fantastic! Making a pull request is pretty easy. There's a great guide here that will tell you all you need to know: https://opensource.com/article/19/7/create-pull-request-github

arothmanmusic commented 2 years ago

@lstein Awesome! I think I'll just automatically round anything that's not a multiple of 64 to the closest valid option to avoid the errors.
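That rounding could be sketched like this; the 64–1280 bounds are an assumption taken from the range floated earlier in the thread:

```python
def snap_to_64(value, lo=64, hi=1280):
    """Round a requested dimension to the nearest multiple of 64,
    clamped to the supported range."""
    snapped = 64 * round(value / 64)
    return max(lo, min(hi, snapped))
```

The same one-liner works client-side in JS (`64 * Math.round(value / 64)`), which would avoid the round-trip bakkot mentioned.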

lstein commented 2 years ago

It looks like @tesseractcat just sent me a PR that turns the text fields into dropdowns. @arothmanmusic, please feel free to contribute any improvements you can think of. One obvious improvement would be a caption for each image that captures the prompt and arguments.

lstein commented 2 years ago

The command-line script does this too and people don’t seem to mind.


loopyd commented 2 years ago

Direct answer

First, a direct answer to "how do I get the amount of free GPU VRAM?", before the deep dive into the restrictions of the UI framework we've chosen.

Just as a breadcrumb toward conditionally populating the dropdown:

How to get available GPU video memory (Nvidia, Linux)

nvidia-smi -i 0 --query-gpu=memory.free --format=csv | tail -n1 | cut -d' ' -f1

Example output (MiB):

11264

There are a lot of command-line utilities that can do this, and the check needs to happen at control construction time. They are OS-dependent, so you'll need a simple execution fork that checks the host OS and runs the appropriate tool with subprocess in Python, capturing its stdout into a variable. glxinfo is also fairly foolproof.
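A minimal sketch of that execution fork, covering only the Nvidia path described above (the function name is invented; it degrades to None instead of guessing when nvidia-smi is absent):

```python
import shutil
import subprocess

def free_vram_mib(gpu_index=0):
    """Best-effort free-VRAM query via nvidia-smi.

    Returns free memory in MiB as an int, or None when nvidia-smi is not
    on PATH or fails. Other OSes/tools (glxinfo, etc.) would need their
    own branches of this fork.
    """
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.run(
            ["nvidia-smi", "-i", str(gpu_index),
             "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=5,
        ).stdout.strip()
        return int(out.splitlines()[0])
    except (subprocess.SubprocessError, ValueError, IndexError):
        return None
```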

I'm sure this generates a lot of ideas as to how one can conditionally disable certain dropdown items. 🤔

Here's where it goes off the deep end. You can stop reading here if you just wanted the direct answer.

In the current repository state of many of these projects, you can only do it when you construct the UI, not at runtime. Let's go spelunking as to why.


Sugar time

Enjoy the high-fructose.

Overriding gradio control constructors is easy with a decorator; hence this experiment to create an entire overhaul of gradio syntax by putting it behind decorators: https://gist.github.com/loopyd/dfbb31c3fc406f4c429df40e66cfb737

Doing it the sugary way makes it easy to construct a control with a decorator, which lets you write your own custom width/height dropdown with conditional menu items.
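The linked gist isn't reproduced here, but the shape of the trick is roughly this toy (all names are invented for illustration, and a dict stands in for the control; a real version would return something like gr.Dropdown(**kwargs)):

```python
def control(**defaults):
    """Decorator factory: wraps a builder so shared defaults (and e.g.
    VRAM-conditional choices) are injected at construction time."""
    def decorate(builder):
        def construct(**overrides):
            kwargs = {**defaults, **overrides}  # overrides win over defaults
            return builder(**kwargs)
        return construct
    return decorate

@control(choices=[64, 128, 192, 256], value=256, label="Width")
def width_dropdown(**kwargs):
    # Stub: in the real thing this would construct a gradio control.
    return kwargs

dd = width_dropdown(value=128)  # per-call override of the decorated default
```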

Also, because the decorator gives you control of the constructors, you can do looney things like what I did and inject argparse functionality into the constructor decorator factory, so experiment code can run as a standalone script instead of your UI file becoming one gigantic blob of duplicate code (thus image2image.py, text2image.py, uwu.py, etc.). If you want to pass anything back to gradio, you'll need to set a flag within your modules that makes them output dict-blob payloads (progress updates, status, and such) on stdout, where your UI is watching the subprocess thread. That way, the same scripts you run from the command line emit dict blobs that a data-bound UI element can react to.
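A minimal sketch of that dict-blob stdout protocol, assuming plain JSON lines (the function and field names are invented for the example):

```python
import json

def emit_progress(step, total, status):
    """Worker side: serialize one progress 'dict blob' as a single JSON
    line, suitable for printing to stdout."""
    return json.dumps({"step": step, "total": total, "status": status})

def parse_progress(line):
    """UI side: what the thread watching the subprocess's stdout would do
    with each line before updating a data-bound control."""
    return json.loads(line)

msg = parse_progress(emit_progress(3, 50, "sampling"))
```

One-JSON-object-per-line keeps the parent's read loop trivial: read a line, decode it, update the bound control.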

You can see how decorators make custom UI functionality easier to add and let you modularize code, but they also introduce a ton of boilerplate and additional complexity, because what the SD world really needs is a UI library.

Gradio was built with simple experiments in mind, not the massive chop-shop operation we've been running to glue pieces of many different ML projects together on top of SD, then copy-paste and glue all of it together on the UI thread.

Enter the library: its purpose is generics, code reflection, and making this work for all these projects, putting each ML segment into its own code file where it can be worked on and maintained independently of the UI framework while still being runnable from the command line on its own.

Fixing it: Doing it at runtime

Here's where shit gets complicated.

I am working on server threads, because in order to modify interface blocks at runtime in gradio, you need it running on a server thread. A static interface will not let you inject the event hook needed to modify the dropdown, but gradio Blocks spawned on a server thread will.

Please keep in mind that dynamically modifying controls will introduce asyncio as a runtime dependency. Here's another breadcrumb: the hllk fork has this working. I have been chop-shopping that into my own test unit. It sort of works, but it's taken me a few days to beautify it behind decorators.

Also keep in mind that data-bound controls don't exist in interfaces, because interfaces don't allow events bound to controls; gradio Blocks running on an asyncio server thread do. So that entire infrastructure has to be written by somebody before you can do something like "add x or y to the UI when the underlying machine is in state z". Data-bound controls force a developer to completely rewrite their UI script, something hllk realized and has had to do to make it "user friendly" and reactive.
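The "data-bound control" infrastructure being described is essentially an observer pattern. A minimal sketch in plain Python (no gradio, all names invented):

```python
class Observable:
    """Minimal data binding: interested parties subscribe a callback and
    are notified whenever the underlying state changes."""
    def __init__(self, value):
        self._value = value
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set(self, value):
        self._value = value
        for callback in self._subscribers:
            callback(value)

seen = []
free_vram = Observable(8192)
free_vram.subscribe(seen.append)  # a dropdown would re-filter its choices here
free_vram.set(2048)               # e.g. another job just grabbed VRAM
```

In the gradio case, the `set` side is what has to be wired through the asyncio server thread, which is the rewrite being described above.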

Current state of all SD repos that use gradio UI on interfaces

Decorators for specific gr_control declarations should get you what you need, at least at construction time; you can derive from my boilerplate as a starting point. It's a gist example, not a repo, as it's a headache to do. Past construction time, a complete refactor of the UI code is required to put things into gradio Blocks running on a server thread instead of gradio interfaces.

This is a very complicated feature request. It shouldn't be, and there definitely should be people working on a UI framework to make it easier. My PoC demonstrates this, but it doesn't solve it.

Conclusion

Appreciate your deep divers and your library developers; we are trying to make it easier to design these frontends. Currently everything we see is spaghetti: it's a mess and, we'd bet, hard to maintain. A UI library could fix this problem of repeated spaghetti for all existing projects.

Thus the feature request specified here is probably an unnecessarily complicated one, depending on your roadmap and on whether this should become a feature at all. Doing it at construction time belongs in my gradio library, since my experiments are a deep dive into extending gradio itself to create better UIs and pythonically modularize the workflow, saving developers the spaghetti headache, but it probably doesn't belong here at all. Gradio definitely wasn't made to do what we've been doing to it.

I've only provided a PoC, under DBAD, as I don't care for the amount of time it's going to take to fix the spaghetti and fix gradio by extending it with a UI framework so it can accept bigger experiments. Something needs to be said about "add x or y feature" requests like this one: from all I've seen, they've been bogging down collabs and locking UI threads.

The maintainer's options

Your options are: