Chaoses-Ib / ComfyScript

A Python frontend and library for ComfyUI
https://discord.gg/arqJbtEg7w
MIT License

How to get value from the output of a node and perform math or other operations #29

Open ambocclusion opened 9 months ago

ambocclusion commented 9 months ago

Hello! I have this code here

async def extend_audio(params: AudioWorkflow):
    async with Workflow(wait=True) as wf:
        model, model_sr = MusicgenLoader()
        audio, sr, duration = LoadAudio(params.snd_filename)
        audio = ConvertAudio(audio, sr, model_sr, 1)
        audio = ClipAudio(audio, duration - 10.0, duration, model_sr) # I would like to perform this math
        raw_audio = MusicgenGenerate(model, audio, 4, duration + params.duration, params.cfg, params.top_k, params.top_p, params.temperature, params.seed or random.randint(0, 2**32 - 1))
        audio = ClipAudio(audio, 0.0, duration - 10.0, model_sr)
        audio = ConcatAudio(audio, raw_audio)
        spectrogram_image = SpectrogramImage(audio, 1024, 256, 1024, 0.4)
        spectrogram_image = ImageResize(spectrogram_image, ImageResize.mode.resize, True, ImageResize.resampling.lanczos, 2, 512, 128)
        video = CombineImageWithAudio(spectrogram_image, audio, model_sr, CombineImageWithAudio.file_format.webm, "final_output")
        await wf.queue()._wait()
    results = await video._wait()
    return await get_data(results)

And I would like to perform math on the duration returned by LoadAudio. If I try to use it raw, it just throws the error TypeError: unsupported operand type(s) for -: 'Float' and 'float'. Is it possible to get the result here?
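The error is consistent with how Python resolves binary operators: in virtual mode, duration is a placeholder for a node output rather than a number, so if its type defines no arithmetic methods, duration - 10.0 has nothing to dispatch to. A minimal sketch, assuming a hypothetical stand-in class named Float (not ComfyScript's actual class), reproduces the same message:

```python
class Float:
    """Hypothetical stand-in for a node-output placeholder; it holds no number."""

    def __init__(self, node_id: str, slot: int):
        self.node_id = node_id  # which node will produce the value
        self.slot = slot        # which output slot of that node


duration = Float("LoadAudio", 2)

try:
    duration - 10.0  # neither Float.__sub__ nor float.__rsub__ can handle this
except TypeError as exc:
    message = str(exc)

print(message)
```

The placeholder only records where the value will come from, which is why the subtraction fails before the workflow has run.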

Chaoses-Ib commented 9 months ago

Currently you cannot do this directly. This is the main limitation of virtual mode:

The main limitation of virtual mode is that one cannot get the output of nodes from Python before running the full workflow. However, this limitation can be mitigated by expanding a workflow dynamically and running it multiple times. See select and process for an example. (If https://github.com/comfyanonymous/ComfyUI/pull/2666 is someday merged into ComfyUI, this limitation can be solved.)

There are several workarounds:

Chaoses-Ib commented 9 months ago

By the way, _wait() is for internal use only. You can await video or await wf.queue() directly, without calling _wait().

And wf.queue() is not necessary since Workflow will queue the workflow when exiting the context. You should either set Workflow(queue=False) or avoid calling wf.queue() manually.
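To illustrate the point about queueing, here is a toy model of a context manager that queues itself on exit. ToyWorkflow, its times_queued counter, and the run helper are all hypothetical stand-ins written for this sketch; they only imitate the behavior described above and are not ComfyScript's implementation.

```python
import asyncio


class ToyWorkflow:
    """Toy stand-in, not ComfyScript's Workflow: queues itself on exit unless queue=False."""

    def __init__(self, queue: bool = True):
        self.auto_queue = queue
        self.times_queued = 0

    async def queue(self) -> None:
        self.times_queued += 1

    async def __aenter__(self) -> "ToyWorkflow":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Queue automatically on a clean exit, mirroring the behavior described above.
        if self.auto_queue and exc_type is None:
            await self.queue()


async def run(auto_queue: bool, manual: bool) -> int:
    async with ToyWorkflow(queue=auto_queue) as wf:
        if manual:
            await wf.queue()
    return wf.times_queued


both = asyncio.run(run(auto_queue=True, manual=True))          # manual call plus auto-queue
auto_only = asyncio.run(run(auto_queue=True, manual=False))    # rely on the context exit
manual_only = asyncio.run(run(auto_queue=False, manual=True))  # queue=False, queue by hand
print(both, auto_only, manual_only)
```

In this toy model, combining a manual queue() call with the default auto-queue submits the workflow twice, while either of the two suggested options submits it once.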

ambocclusion commented 9 months ago

Thanks! Yeah I kinda got desperate to get anything from the workflow and just threw everything at it.

Chaoses-Ib commented 9 months ago

I haven't written much about node results in the docs because there are many refactors not done yet. There will be some built-in interop support for basic types in the future, so that one can await image or await duration without manually calling output nodes like SaveImage/PreviewImage and ShowAnyToJSONCrystools. Probably also await image.to_pil() for easy type conversion. This also helps to unify virtual mode and real mode.

And Result._output may be removed in the future. Result may be replaced with a wrapped dict, so that one can write result['ui']['text']/result['text'] instead of result._output['text'].

ambocclusion commented 9 months ago

Thinking about this some more, I wonder if some simple built-in eval utility nodes would be out of scope for ComfyScript? I may look into writing a solution for this. I just wonder whether you already saw this as a possible enhancement.

Chaoses-Ib commented 9 months ago

Being able to mix Python code with nodes is one of the goals of ComfyScript. If ComfyUI is running in the same process as ComfyScript, this goal can be achieved by directly executing nodes as normal Python functions, which is what real mode does. If ComfyUI is running remotely, either node values or user code need to be passed between the client and the server.

Passing values around is more flexible in that the client can do anything with the value, like calling external libraries or showing a GUI to let the user input a new value. However, it has the disadvantage that passing big values (e.g. models) would be slow. Passing code is more limited than passing values, but can be faster in that case. Both approaches are useful in certain cases, and both will be added in the future.

The most common "pass code around" solution is using eval nodes. There are already many eval nodes made by the community, but most of them are bloated, including many other nodes unrelated to eval and not very useful with ComfyScript. So I didn't select one as built-in. If you are going to make one that isn't bloated, I'm very willing to add it as built-in.

And the "pass value around" solution I'm going to implement is to add a node to apply callbacks to values. For example:

audio, sr, duration = LoadAudio(params.snd_filename)
audio = ConvertAudio(audio, sr, model_sr, 1)
audio = ClipAudio(audio, Apply(duration, lambda x: x - 10.0), duration, model_sr)
# or
audio = ClipAudio(audio, duration.apply(lambda x: x - 10.0), duration, model_sr)

The only requirement of Apply is that the value must support interop, i.e. it can be sent to the client and then back to the server. Or, if ComfyUI is running in the same process, it's also possible to capture the variable directly.
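The idea behind applying callbacks to values can be sketched with a small deferred-value class. This is only an illustration under stated assumptions: Deferred, its apply and resolve methods, and the outputs dict are hypothetical names for this sketch, and it says nothing about how the planned Apply node will actually be implemented.

```python
from typing import Any, Callable, Optional


class Deferred:
    """Hypothetical placeholder for a node output whose value exists only at run time."""

    def __init__(self, name: str, source: Optional["Deferred"] = None,
                 fn: Optional[Callable[[Any], Any]] = None):
        self.name = name
        self.source = source  # upstream placeholder, if this one is derived
        self.fn = fn          # callback to apply to the upstream value

    def apply(self, fn: Callable[[Any], Any]) -> "Deferred":
        # Record the callback instead of calling it: no value exists yet.
        return Deferred(f"Apply({self.name})", source=self, fn=fn)

    def resolve(self, outputs: dict) -> Any:
        # Called once real outputs are known (here: a dict keyed by name).
        if self.source is None:
            return outputs[self.name]
        return self.fn(self.source.resolve(outputs))


duration = Deferred("duration")
clip_start = duration.apply(lambda x: x - 10.0)

# Pretend the workflow ran and LoadAudio reported a 42.5 second clip.
start_seconds = clip_start.resolve({"duration": 42.5})
print(start_seconds)
```

The callback runs only once a concrete value is available, which is why the value has to be something that can travel to wherever the callback lives.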

Also note that currently none of these solutions can change the workflow structure at runtime, i.e. there is no real control flow in the workflow (unless you apply some hacks to ComfyUI). As mentioned above, I'm waiting for https://github.com/comfyanonymous/ComfyUI/pull/2666 to be merged. It allows a workflow to be dynamic, i.e. nodes can return sub-workflows to be executed at runtime. This allows the client to change the workflow structure at runtime, and doesn't have the inefficiency problem of the "run multiple times" solution.
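For comparison, the "run multiple times" approach can be pictured as two passes: a first run that only fetches the value, then a second workflow built with that value as a plain Python float. The sketch below uses made-up helper functions and made-up data (run_probe_workflow, build_main_workflow, and the 42.5 second duration are all hypothetical) and does not call ComfyScript at all.

```python
def run_probe_workflow(filename: str) -> float:
    """Toy first pass: pretend to load the file and report its duration in seconds."""
    fake_durations = {"song.wav": 42.5}  # made-up data for the example
    return fake_durations[filename]


def build_main_workflow(duration: float, extra: float) -> list[tuple]:
    """Toy second pass: duration is now a plain float, so arithmetic works."""
    return [
        ("ClipAudio", duration - 10.0, duration),
        ("MusicgenGenerate", duration + extra),
        ("ClipAudio", 0.0, duration - 10.0),
    ]


duration = run_probe_workflow("song.wav")
plan = build_main_workflow(duration, extra=30.0)
for step in plan:
    print(step)
```

The cost of this pattern is that the audio has to be loaded in both passes, which is the inefficiency that dynamic workflows would avoid.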