AnonymousCervine / depth-image-io-for-SDWebui

An extension to allow managing custom depth inputs to Stable Diffusion depth2img models for the stable-diffusion-webui repo.
72 stars, 6 forks

Api support possible? #8

Open arjanMeerten opened 1 year ago

arjanMeerten commented 1 year ago

Is it possible to use this script with the API? The API does have script support, but does this script need to be made compatible? If not, what arguments do I use in "script_args" to make it work?

AnonymousCervine commented 1 year ago

I honestly have no idea!

However, looking around, I can say a few things. First, although you may have already, take a look at this snippet from a certain python api client's README

The code references the run() function of the X/Y plot script (which nowadays is XYZ, I think, but that's beside the point). The arguments fed to run() in a Script subclass (including the one in this repo) are based on the Gradio UI elements it returns from its ui() function.

Without being too precise, we can reasonably assume that SD bases the API on the Gradio components in question, delivering their values in the same fixed order as a script's ui() and run() functions. (I've also looked at the X/Y plot script before, and it doesn't have explicit API handling, which supports this tentative conclusion.) So this extension likely works fine with the API, with an interface auto-generated from the UI definition. It looks like it's probably order-based: script_args is an array whose elements are treated as arguments, in order.

Incidentally, run() in the current commit of this repo looks like:

def run(self, p, input_depth_img, show_depth, batch_img_input, batch_depth_input, batch_many_to_many):

You can safely ignore self (obviously) and p (the data structure holding the rest of the render request info); the arguments the API would require/recognize start from the third. What remains is an image, a boolean value, two lists of files, and one more flag (batch_many_to_many). If the implementation is decent, it'll likely accept one (but not necessarily both) of A) None/null or B) an empty array for the image and batches to indicate they're not specified. In the Python API client, for instance, I'd most likely expect you to need to pass an empty array for the batches when not using them.
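To make the positional mapping concrete, here is a minimal sketch, assuming the API really does hand script_args to run() in order after self and p (which is still a guess at this point). The helper build_script_args is hypothetical and not part of this extension or the webui; only the parameter names come from the signature above.

```python
import inspect


# Stand-in with the same parameter list as the extension's run();
# the body is a placeholder, only the signature matters here.
def run(self, p, input_depth_img, show_depth, batch_img_input,
        batch_depth_input, batch_many_to_many):
    pass


def build_script_args(**values):
    """Hypothetical helper: order keyword values like run()'s parameters after self and p."""
    names = list(inspect.signature(run).parameters)[2:]
    unknown = set(values) - set(names)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    # Anything not supplied becomes None (null in JSON). Whether the
    # extension wants null or [] for unused batches is an open question.
    return [values.get(name) for name in names]


script_args = build_script_args(
    input_depth_img="YOUR_BASE64_IMAGE",  # placeholder, not real image data
    show_depth=False,
)
print(script_args)
```

Building the list from the signature rather than by hand means that if the signature changes, a stale keyword fails loudly instead of silently shifting every later argument by one position.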

A warning at this point: it is very possible, even very likely, that this signature will change in the (near-ish) future. There are already identified issues with the current structure of things, and I'm not comfortable promising backwards compatibility for the API at this time. By the same token, let me know if you do make something dependent on the API for this, so I know someone is using it and can take that into consideration when deciding whether supporting backwards compatibility is worth it.

Anyway, moving on.

Next, there's the question of what data-types are used for communicating to the API. Floats, ints, and strings are likely straightforward; images may or may not be. If in doubt, look to an img2img API example to try to figure out what an acceptable format for transmitting images is; for the python client linked above, it looks like it probably takes PIL images.
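One common way to ship an image inside a JSON body is as a base64 string. The sketch below shows only that encoding step, using the standard library. That the API wants exactly this (and not, say, a data-URL prefix) is an assumption to check against an img2img example; the function names are made up for illustration.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Return raw image bytes as an ASCII base64 string suitable for JSON."""
    return base64.b64encode(data).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file from disk and base64-encode its contents unchanged."""
    with open(path, "rb") as f:
        return encode_image_bytes(f.read())
```

Usage would be something like encode_image_file("depth.png"), where depth.png is a hypothetical path to your depth map; the resulting string would go wherever the API expects the image.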

(And if you want to use the file-batching part of this plugin you'll likely have to dig a couple inches deeper. Although I'm not sure why you'd want to use batching from the plugin when you already have programmatic access through the API.)

Finally: As a reminder I haven't used the API before. Much of the above is highly-educated guesses from glancing at guides and source-code; YMMV.

Good luck with whatever you're doing!

sphuff commented 1 year ago

@arjanMeerten you can do it with the following call to /sdapi/v1/txt2img:

    {
      "prompt": "",
      "script_name": "Custom Depth Images (input/output)",
      "script_args": ["YOUR_BASE64_IMAGE", false, null, null, null]
    }

The args correspond to the signature @AnonymousCervine mentioned.
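For anyone scripting this, here is a rough sketch of sending that body from Python using only the standard library. The base URL (a local webui on port 7860 with the API enabled) is an assumption, the function names are made up for illustration, and the shape of the response is not shown in this thread, so send() just returns the parsed JSON as-is.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumption: local webui with the API enabled


def build_payload(depth_image_b64, prompt=""):
    """Build the request body from the call above; the args are positional."""
    return {
        "prompt": prompt,
        "script_name": "Custom Depth Images (input/output)",
        # Order follows run(): depth image, show_depth, then the three batch-related args.
        "script_args": [depth_image_b64, False, None, None, None],
    }


def build_request(payload, base_url=BASE_URL):
    """Wrap the payload in a JSON POST request to /sdapi/v1/txt2img."""
    return urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload, base_url=BASE_URL):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(payload, base_url)) as resp:
        return json.load(resp)


payload = build_payload("YOUR_BASE64_IMAGE")  # placeholder, not real image data
print(json.dumps(payload, indent=2))
```

Nothing here touches the network until send(payload) is called, so the payload can be inspected first. Python's False and None serialize to the false and null seen in the JSON above.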

arjanMeerten commented 1 year ago

Thanks! I will try it out