Passing control back and forth between the backend and frontend

lakmeer commented 5 months ago

Hi Chris,

Thanks for this repo, it's been extremely helpful. I'm trying to do something particularly twisted, and I hoped you could lend some advice.

I would like to define a node that can execute arbitrary JS on the client and return the output values in a POST, as part of the graph flow. Yes, I realise this is wildly at odds with Comfy's execution model 😅. I have been proceeding by trying to pick apart your ImageChooser node, but my inexperience with Python is holding me back from understanding what's going on here.

So far I have a custom node working, and I'm pretty comfortable with how all those features work. I also have an API client that can define and submit a working prompt graph, retrieve outputs, capture errors, and subscribe to WS events emitted by the server.

I am using

from aiohttp import web
from server import PromptServer

routes = PromptServer.instance.routes

@routes.post('/proxy_reply')
async def proxy_reply(request):
    post = await request.post()  # form fields posted back by the frontend
    return web.json_response({"status": "ok"})

and

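class ProxyNode:
  # classdefs etc omitted
  # id and prompt are presumably hidden inputs (UNIQUE_ID and PROMPT)
  def run(self, id, prompt, in0, in1):
    me = prompt[id]  # this node's own entry in the submitted prompt graph
    # Push an event to the frontend over the websocket
    PromptServer.instance.send_sync("proxy", {
      "id":  id,
      "node": me['_meta']['title'],
      "in0": in0,
      "in1": in1,
    })
    # For now, just pass the inputs straight through
    return (in0, in1)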

to send messages back and forth, which is working. At the moment the node just forwards its input values.

After that I get lost 😅 How does ImageChooser pause and resume the graph execution? There's a lot going on here - it looks like you interrupt the whole prompt and then somehow hack the prompt graph so that the next pass runs from the right node. Would you be able to sketch out how you settled on this strategy, and how it works?

Additionally, since my nodes don't have to wait an arbitrary time for the user to make a selection, I wondered whether Python has some kind of blocking network request that can just jam the interpreter for a handful of ms while the frontend replies?

Thanks :)

lakmeer commented 5 months ago

Ok I worked it out, haha. Isn't it always the way, as soon as you're done explaining it, a light bulb goes on?

Everything was much clearer after removing all the image-specific stuff and stripping it back. MessageHolder does all the work; I didn't realise Python could work that way. I will submit a PR for the docs that breaks it down in case anyone else is interested in a minimal example.
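For anyone following along, the general shape of that pattern can be sketched roughly like this. This is a minimal sketch under assumptions, not the actual ImageChooser code: only the name MessageHolder comes from this thread, while the method names add_message and wait_for_message, the polling period, and the timeout are made up for illustration. It also assumes the node's run method executes on a different thread from the web server, so blocking there does not stop the POST handler from running.

```python
import threading
import time


class MessageHolder:
    """Class-level mailbox shared by the route handler and the node (hypothetical API)."""

    messages = {}
    lock = threading.Lock()

    @classmethod
    def add_message(cls, node_id, message):
        # Would be called from the POST handler when the frontend replies.
        with cls.lock:
            cls.messages[str(node_id)] = message

    @classmethod
    def wait_for_message(cls, node_id, period=0.05, timeout=5.0):
        # Would be called from the node's run(); blocks until a reply arrives
        # or the (assumed) timeout expires.
        key = str(node_id)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with cls.lock:
                if key in cls.messages:
                    return cls.messages.pop(key)
            time.sleep(period)
        raise TimeoutError(f"no reply for node {key}")


if __name__ == "__main__":
    # Simulate the frontend replying 100 ms later from another thread.
    timer = threading.Timer(0.1, MessageHolder.add_message, args=("7", {"out0": 42}))
    timer.start()
    print(MessageHolder.wait_for_message("7"))
```

In a node, the idea would be to call send_sync as before, then block in wait_for_message(id) and build the return tuple from the reply, with the /proxy_reply handler calling add_message. The timeout is a design choice so that a missing reply raises an error instead of hanging the queue.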

chrisgoringe commented 5 months ago

Glad you got there - I was just thinking about the best way to explain it!

Yeah, once you take the image stuff out, the messaging is pretty simple.

lakmeer commented 5 months ago

Thanks Chris, I've just submitted some docs that explain it how I would have needed it explained to me, hope it helps.

My next issue is how to change the return types dynamically haha, I can't find any info on that - is it even possible?

chrisgoringe commented 5 months ago

Python return type? Or the comfy node output?

lakmeer commented 5 months ago

The comfy node. I am hoping to create a 'generic' ProxyNode that can support a variety of behaviours on the front end, so that I don't have to specify many different types.

For example, say I want a function on the front end that intercepts your text prompts and replaces some strings, and I also want a function that rounds your image dimensions to the nearest power of two. I would need a different ProxyNode for each so that one can return (STRING) and the other (INT, INT). For inputs I can use ANY, but I don't think I can have flexible RETURN_TYPES this way since it's a class attribute.

So far my other options are:

Specify a proxy for every function like PromptSwapperProxy and ImageDimensionsToNearestPO2Proxy.
Specify a proxy for different combinations of inputs, like StringToStringProxy and IntIntToIntIntProxy
Use metaprogramming to generate every permutation of inputs and outputs as its own class. This seems horrible (and it is 😂), but my context is that the user of the front-end won't be looking at the Comfy LiteGraph interface at all, so their context menu won't drown in thousands of auto-generated ProxyNodes. Comfy startup time might be awful tho, I haven't tested this lol.

I am curious how you would approach this problem.

chrisgoringe commented 5 months ago

Metaprogramming isn't so bad, I've done a few factory classes that way - Python is pretty good at it.
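As a rough illustration of the factory idea, something like the following could generate one node class per output signature. It is only a sketch under assumptions: make_proxy_class, SIGNATURES, the class names, and the "proxy" category are hypothetical, the inputs simply mirror the output types, and run just forwards its inputs in place of whatever the frontend would compute.

```python
def make_proxy_class(name, return_types):
    """Build a node class whose RETURN_TYPES is fixed when the class is created."""
    return_types = tuple(return_types)

    def input_types(cls):
        # One input per output, with the same type (an assumption for this sketch).
        return {"required": {f"in{i}": (t,) for i, t in enumerate(return_types)}}

    def run(self, **inputs):
        # Placeholder behaviour: forward the inputs in declaration order.
        return tuple(inputs[f"in{i}"] for i in range(len(return_types)))

    # type(name, bases, namespace) creates a new class at runtime.
    return type(name, (), {
        "INPUT_TYPES": classmethod(input_types),
        "RETURN_TYPES": return_types,
        "FUNCTION": "run",
        "CATEGORY": "proxy",
        "run": run,
    })


# Hypothetical signatures; each entry becomes its own node class.
SIGNATURES = {
    "StringProxy": ("STRING",),
    "IntIntProxy": ("INT", "INT"),
}

NODE_CLASS_MAPPINGS = {
    name: make_proxy_class(name, types) for name, types in SIGNATURES.items()
}
```

Because each class is just an entry in a dictionary, the number of generated nodes is controlled entirely by the SIGNATURES table, so it can be kept to the combinations that are actually needed.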

But you can definitely dynamically modify the outputs of a node. Primitive nodes do it (see extensions/core/widgetInputs.js around line 600), and it looks like it's just this.outputs[n].type = "INT" (or whatever).

If you work it out, I'd love to see the code! And happy to see if I can help if you get partway :)

lakmeer commented 5 months ago

Oh sweet, thanks for the pointer :) I'll check out primitive nodes. Thanks Chris! Awesome repo. I'll come back to you if it works.

chrisgoringe / Comfy-Custom-Node-How-To

Passing control back and forth between the backend and frontend #17