Acly / krita-ai-diffusion

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.
https://www.interstice.cloud
GNU General Public License v3.0

[REQUEST/SUGGESTION] Any workflow output in Krita #153

Closed: LuluViBritannia closed this issue 9 months ago

LuluViBritannia commented 11 months ago

Hey! It's me again. Last time I asked for Clip Skip in the workflow this plugin uses, and a gigachad made it. I'm glad! But of course, after that node, I discovered there are plenty of other nodes that we can't use in Krita. And then I realized... if we eventually ask for support for every node available in ComfyUI, why not just use Krita as the output?

The dev did a brilliant job of finding a great workflow, but one workflow will never be enough for the infinite use cases of AI generation.

So here is my suggestion: give the user the ability to plug their own workflow into Krita. This is how I imagine it working:

It's kind of a recursive system (the custom output comes into the Krita plugin, and the Krita plugin affects the custom workflow), but overall there are just two components to add: the ability to use any workflow as input, and the ability to affect that workflow with all the stuff already present in the Krita plugin.

I don't expect this change to happen within a week, of course. Maybe it's complicated to implement. But it would be perfect for this plugin to become a bridge between a user's workflow and Krita. In fact, I'd say it will be mandatory at some point.

Right now this plugin is excellent for basic generation, and it does have some advanced stuff like ControlNet and Inpainting. But there will always be some even more advanced, custom stuff the developer hasn't implemented.

Muskelmagier commented 11 months ago

Yeah, I have a similar request, because I personally use several models that use v-prediction mode, which normally needs activation.

In ComfyUI this is done by connecting the checkpoint loader to the ModelSamplingDiscrete node, selecting v_prediction, and turning zsnr on. When using A1111, it's done simply with a config .yaml file sharing the same name as the checkpoint.
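For illustration, here is a minimal sketch of that wiring using the plugin's ComfyWorkflow helper from workflow.py (the w.add signature follows the maintainer's example further down in this thread; the import path and checkpoint name are assumptions):

from ai_diffusion.comfyworkflow import ComfyWorkflow  # import path assumed

w = ComfyWorkflow()
# CheckpointLoaderSimple has 3 outputs: model, clip, vae
model, clip, vae = w.add("CheckpointLoaderSimple", 3, ckpt_name="my_vpred_model.safetensors")
# "sampling" and "zsnr" are the widget names ModelSamplingDiscrete exposes in ComfyUI
model = w.add("ModelSamplingDiscrete", 1, model=model, sampling="v_prediction", zsnr=True)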

papersplease commented 11 months ago

I second this request. ComfyUI is great for prototyping and automation, but it lacks a competent end-user interface for running your workflows. Krita could be the one.

Acly commented 11 months ago

This is something I considered from the beginning, but moved away from more and more as the plugin became more complex. I suspect this is not as easy or as useful as you imagine.

There are currently 5 "basic" workflows.

Although you might expect them to be similar, and they certainly share components, they deviate quite significantly - inpaint especially. So first off, to replace the workflows consistently you'd have to craft several workflows, not just one.

Those workflows aren't really workflows in the sense of a Comfy graph; they're more like templates for workflows. The majority of nodes are generated dynamically depending on the input checkpoint, LoRAs, selected control layers, sampler, image resolution, mask resolution, batch size, live mode, etc. Modifying any of these inputs in Krita will not just change parameters in the graph, it will generate a different graph.
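To make that concrete, here is an illustrative sketch (not the plugin's actual code) of why changing an input changes the set of nodes rather than just their values:

def build(w, checkpoint, loras, live):
    model, clip, vae = w.add("CheckpointLoaderSimple", 3, ckpt_name=checkpoint)
    for lora in loras:  # zero or more LoraLoader nodes, depending on user input
        model, clip = w.add("LoraLoader", 2, model=model, clip=clip,
                            lora_name=lora.name, strength_model=lora.strength,
                            strength_clip=lora.strength)
    if live:
        # live mode builds a different sampler section entirely
        ...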

Connecting the output is easy because there is just one, but even there you have to make sure the resolution matches exactly what is expected, or it won't fit (or crashes). There are corner cases (minimum sizes, resolutions that aren't a multiple of 8) to consider.
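As a tiny example of the kind of rule a custom workflow would have to agree on with the plugin (the exact rounding used internally may differ; this is just the common multiple-of-8 constraint for SD latents):

def round_to_multiple(x, m=8):
    # SD VAEs encode latents at 1/8 resolution, so pixel extents
    # must be padded or cropped to a multiple of 8
    return max(m, ((x + m - 1) // m) * m)

assert round_to_multiple(513) == 520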

Connecting inputs is tricky, because how many there are and how they're used depends on various factors. For inpaint there are between 1-3 images, 1-2 masks, and up to 5 resolutions and offsets that are used for various upscales, downscales and crops. This comes from the fact that pre-computing certain things is easier to do on the plugin side, while other processes need to run as part of the Comfy workflow. And that's not counting an arbitrary number of additional ControlNets, etc.

To be fair, for pure txt2img generation, it's probably quite simple. But here I also see the least benefit, since you can generate images in any external tool and drag them into Krita for editing without much hassle.

At this point you could consider carefully designed extension points in the pipeline, where custom node sub-graphs can be substituted in to modify results and do additional computations. But you're moving even further away from directly using your workflow as you designed it in ComfyUI. And such a system is a lot of work to implement and maintain, and places many restrictions on future development of the internal workflows. Or you'd have your custom workflows/extensions break every other update, which would also be frustrating.

Conversely, keep in mind this is open source: the plugin's workflows are all in workflow.py and can be edited. It's not a fancy UI, but it's also not terribly difficult if you have good knowledge of ComfyUI. I know there is a barrier to messing with code, and it would need some kind of guide or introduction. It just feels like a more sustainable solution, where private extensions can develop into features that are accessible to everybody using a painting app - not just SD veterans.

Cloudwalk9 commented 11 months ago

Perhaps an "advanced" or "expert" mode of the plugin where everything except basic image input and output is disabled, and it entirely defers to Comfy?

At the very least, having a sophisticated digital painting application seamlessly inlined as an output sink and input source for ComfyUI, as a glorified inpaint-mask editor (among other things), would go a long way. A node to receive the image in Krita, and a node to output it back to Comfy. Perhaps a little widget on Krita's side to pick which layers are sent as the image and which layers are sent as the mask.

I don't mean to make light of the effort that was required to make this plugin, but reducing the problem to a matter of Comfy and Krita speaking a common language (raw bitmaps, as the case may be) doesn't make this idea seem that insurmountable. It would set a basic foundation.

The process of tweaking generation settings while iterating in an advanced digital painting app is then reduced to an alt-tab rather than a copy and paste, and not even that if you have Krita on a graphical drawing tablet on the side and ComfyUI on your main monitor.

In the spirit of open source, I, and others, will certainly try hacking at the code and seeing if something maintainable could be done. We're a dedicated bunch and I'm sure we'll figure something out.

...

Tangential: part of me wishes Blender's 2D painting functionality were extended or polished beyond Grease Pencil and texture editing so it could also be used for digital painting, and we could just throw it all in a Blender plugin. 4.0 has a fully functional Python 3.10 interpreter with all of its bells and whistles.

On the other hand, and perhaps most forward-thinking, I wish we had a standardized pipeline for image editing across various apps in general, just like we have MIDI, as well as software like JACK or PipeWire for audio (on Linux at least), with filter effects, VSTs, and physical electronic instruments all placed inline in a (mostly) harmonious network, where the common languages spoken are PCM and MIDI.

Acly commented 11 months ago

Something like that is more realistic. Krita has "file layers" which watch image files; since Comfy loads from and stores to files by default, it almost works already. Maybe you'd want ComfyUI to watch the files too. A plugin would streamline it a bit.
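As a rough sketch of the "watch the files" direction (not part of the plugin; the paths are placeholders, and /prompt is ComfyUI's standard queue endpoint), a script could poll the image Krita exports and re-queue a stored API-format workflow whenever it changes:

import json, os, time, urllib.request

workflow = json.load(open("my_workflow.json"))  # hand-exported API-format JSON
watched = "krita_export.png"  # file a Krita "file layer" setup writes to

last = 0.0
while True:
    mtime = os.path.getmtime(watched)
    if mtime > last:
        last = mtime
        req = urllib.request.Request(
            "http://127.0.0.1:8188/prompt",
            data=json.dumps({"prompt": workflow}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
    time.sleep(1.0)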

It could also be implemented here, and maybe Checkpoint/LoRA and ControlNet placeholders could work without too much effort. Just keep in mind you'd still have to do a LOT of legwork in your actual Comfy workflow.

Cloudwalk9 commented 11 months ago

@Acly We can convert node widgets to inputs. Perhaps the plugin could provide functionality similar to the "Primitive" node, allowing everything to be remapped to node widgets in custom workflows?

LuluViBritannia commented 11 months ago

At the very least, having a sophisticated digital painting application seamlessly inlined as an output sink and input source for ComfyUI

THIS, exactly! I'll add that this plugin is already an output sink and input source anyway. But since we can't choose what's under the hood, we have to either ride that car or go somewhere else. As the dev said, we need to modify the code related to the workflow itself.

I found this online:

https://github.com/pydn/ComfyUI-to-Python-Extension

It converts a workflow into Python code. What if we had something like that, which rewrites any workflow into code compatible with the plugin? Even if it has to be done outside Krita, it's no big deal. As long as we can safely "translate" any workflow into compatible code that we can drop into the plugin folder, that would go a long way.

@Acly: Do you think you could give us a short tutorial showing how you got from your ComfyUI workflow to the code files compatible with the plugin? I did read workflow.py, and tried to rewrite it, but failed miserably when I tried to add the Clip Skip myself, lol. I understand that the plugin isn't a single workflow, of course. You also make a good point about the software working dynamically. But despite this complexity, it still comes down to inputs going through a pipeline to create an output. I am positive that any Comfy workflow can be used instead of the one under the hood.

Acly commented 11 months ago

In ComfyUI, activate "Enable Dev mode Options" in the settings; then you can click the "Save (API Format)" button to get a JSON file. Transforming this JSON into the plugin's ComfyWorkflow works like this, for example:

...
  "4": {
    "inputs": {
      "ckpt_name": "dreamshaper_8.safetensors"
    },
    "class_type": "CheckpointLoaderSimple"
  },
  "5": {
    "inputs": {
      "width": 512,
      "height": 512,
      "batch_size": 1
    },
    "class_type": "EmptyLatentImage"
  },
  "6": {
    "inputs": {
      "text": "beautiful scenery nature glass bottle landscape, , purple galaxy bottle,",
      "clip": [
        "4",
        1
      ]
    },
    "class_type": "CLIPTextEncode"
  },
...

becomes

w = ComfyWorkflow()
# ...
model, clip, vae = w.add("CheckpointLoaderSimple", 3, ckpt_name="dreamshaper_8.safetensors")
latent = w.add("EmptyLatentImage", 1, width=512, height=512, batch_size=1)
positive = w.add("CLIPTextEncode", 1, clip=clip, text="beautiful scenery nature glass bottle landscape, , purple galaxy bottle,")
# ...

Every node becomes a function call. This can be automated, but I use hand-crafted code with some helper functions to make it concise, readable and easier to maintain.
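Since it "can be automated": a hedged sketch of what a mechanical JSON-to-ComfyWorkflow replay could look like (the output_counts table is an assumption; real code could query ComfyUI's /object_info endpoint for it):

import json

def replay(w, path, output_counts):
    # output_counts maps class_type -> number of outputs,
    # e.g. {"CheckpointLoaderSimple": 3, "CLIPTextEncode": 1}
    with open(path) as f:
        remaining = json.load(f)
    outputs = {}  # JSON node id -> tuple of plugin node outputs
    while remaining:  # naive topological pass; connections are ["id", index] pairs
        for node_id, node in list(remaining.items()):
            refs = [v for v in node["inputs"].values() if isinstance(v, list)]
            if any(r[0] in remaining for r in refs):
                continue  # a dependency hasn't been added yet
            inputs = {k: (outputs[v[0]][v[1]] if isinstance(v, list) else v)
                      for k, v in node["inputs"].items()}
            result = w.add(node["class_type"], output_counts[node["class_type"]], **inputs)
            outputs[node_id] = result if isinstance(result, tuple) else (result,)
            del remaining[node_id]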

Grant-CP commented 10 months ago

Here's my imagining of a way this might work. The user sets up a workflow in ComfyUI that looks a bit like this: [workflow screenshot]

In this workflow, there are a bunch of Load Image (Base64) nodes, which are the entry points. During execution of a custom workflow via this extension, the input of each Load Image (Base64) node is replaced by the image contents of the layer with the same name. It would be up to the user to make sure the names match up.

It would be up to the user to handle different image sizes and the like in their workflow, or to just set up their krita workspace correctly to feed in the correct information.

The user would store their workflows in JSON files like so: [screenshot] Krita would just be responsible for fetching the workflow, encoding the images, doing a find/replace on the contents of the base64 image nodes, sending the workflow over, then consuming the result.
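A minimal sketch of that find/replace step, assuming the Load Image (Base64) node is ETN_LoadImageBase64 from the comfyui-tooling-nodes package and that layer matching goes through the node's title in the API-format JSON (all names here are assumptions):

import base64, json

def inject_layers(workflow_json, layers):
    # layers maps a Krita layer name -> that layer's pixels as PNG bytes
    graph = json.loads(workflow_json)
    for node in graph.values():
        if node["class_type"] == "ETN_LoadImageBase64":
            title = node.get("_meta", {}).get("title", "")
            if title in layers:
                node["inputs"]["image"] = base64.b64encode(layers[title]).decode()
    return json.dumps(graph)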

I think a first pass for custom input here would only allow a single layer per base64 image node, but a further implementation would allow the user to specify layer groups instead and have the content of the whole layer group be flattened as input. I've taken some inspiration from this other Krita plugin, which does this for the purpose of saving a bunch of files: https://github.com/GDQuest/krita-batch-exporter. A nice YouTube explanation of the features is here (9 min): https://www.youtube.com/watch?v=jJE5iqE8Q7c. It also has some other nice features, like the ability to label particular layers for compressed JPEG export vs PNG vs WebP, etc. I've dug into their code a bit, but I don't know enough about Qt stuff in Python to guess how hard the other logic would be to adapt.

I've tried over the last couple of days to get into the source code of your API calls to see if I could test this out myself, but I personally had some trouble. In particular, I was struggling to end up in a place where I neatly had all of the images to feed into a workflow find/replace function. I feel like the scope of this idea is rather simple and small, and I'm happy to write code for it if you can point me in the right direction.

lrq3000 commented 10 months ago

Apparently this other ComfyUI-based SD plugin for Krita supports custom workflows: https://github.com/JasonS09/comfy_sd_krita_plugin

Maybe both projects' teams could discuss whether a merge is possible, to avoid reinventing the wheel?

Grant-CP commented 10 months ago

That project does kind of support custom workflows, but there are a few issues. The first is that their custom workflows still depend on very specific nodes being placed at specific points. The second is that they have entry points from Krita only for the current mask and the entire current composition (along with the prompt and a few other parameters).

Acly’s extension already has support for putting separate layers into separate base64 image decode nodes, which is pretty crucial if you want a custom workflow that involves any attention masking or controlnets or anything like that.

I think the questions around custom workflows are a little more design-oriented for how they fit into this extension specifically (easy to use, intuitive, best practices don't have to be input manually). For example, the Photoshop version also supports custom workflows, and it accomplishes that by basically recreating the entire JSON of the workflow in a giant set of dropdown menus within the docker for the ComfyUI extension. That solution works, but leaves a lot to be desired.

I appreciate you linking the repo! Maybe you’ll be interested in providing feedback once I finalize my proposal for how custom workflows will live in this extension.

-Grant


Acly commented 10 months ago

A fairly flexible option would be to mark nodes in the workflow (via name or some extension) which become "parameters" in the plugin UI. I think there are web-based comfy frontends which work a bit like that.

For starters, if you want to do an initial test, you could reuse one of the existing workflow entry points. Let's say you do 100% generate without selection in the UI. This will end up in workflow.generate (workflow.py), which receives all the parameters as input and returns a Comfy prompt. You can load a prompt JSON there and return it as the root of a ComfyWorkflow object, and it should execute your JSON. Then you can try to modify it based on the inputs (e.g. Conditioning holds images of all the control layers).
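A minimal sketch of that initial test, assuming from the description above that ComfyWorkflow exposes its node graph as a root attribute (the file path and the simplified parameter list are placeholders):

import json

def generate(comfy, style, input_extent, cond, settings):  # parameters simplified
    w = ComfyWorkflow()
    with open("my_workflow.json") as f:  # hand-exported API-format JSON
        w.root = json.load(f)  # "root" attribute assumed from the description above
    # ...then modify the graph based on the inputs, e.g. control layer images in cond
    return w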

Grant-CP commented 10 months ago

Thanks for the help. I was struggling a bit to find exactly where I should attack the task of injecting a custom workflow. When you are working on changes to the Krita extension, do you just put in a bunch of log statements? Is there a pyKrita extension terminal window somewhere?

As for UI within Krita, how hard is it to generate docker UI dynamically? For instance, if I arrived at a place in the Python code where I had a dictionary of parameters that I wanted to tweak, like {"hi_res_scale": 1.8, "masked_conditioning_strength_1": 1.00}, but those parameters were only surfaced after I selected a specific workflow.json file, is it reasonable to create two input fields within the docker, one for each parameter?

For the UI part, I'm just asking about reasonableness for now, as opposed to exactly how. I'm asking mostly because some changes to the docker/extension that I've tried so far seem to want me to restart Krita.


Acly commented 10 months ago

Is there a pyKrita extension terminal window somewhere?

You can start Krita from a console and just use Python's print (on Windows, use krita.com, not krita.exe).

Generating UI for workflows is no problem, just work.
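For what it's worth, a minimal sketch of the dynamic docker UI asked about above, using PyQt5 as Krita's Python bindings do (the parameter dict is the hypothetical one from the question):

from PyQt5.QtWidgets import QDoubleSpinBox, QFormLayout, QWidget

def build_param_widget(params):
    # params, e.g. {"hi_res_scale": 1.8, "masked_conditioning_strength_1": 1.00},
    # becomes one labeled spin box per entry, created at runtime
    widget = QWidget()
    layout = QFormLayout(widget)
    for name, value in params.items():
        box = QDoubleSpinBox()
        box.setValue(value)
        layout.addRow(name, box)
    return widget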

Phen-Ro commented 9 months ago

Would it be possible to make something like a "from Krita for inpainting" node, loaded with all the relevant parameters that this plugin sends in as its outputs, and then a "send this back to Krita" node to get the image back? Then users with custom workflows would have the responsibility of connecting them as appropriate. There could be many from-Krita nodes, one for each variation that the plugin needs.

Or are those from-Krita parameters too dynamic?

Acly commented 9 months ago

It's possible, but you will have to deal with arrays of structured data (an arbitrary number of LoRAs, ControlNets, IP-Adapters). At least base ComfyUI isn't well equipped to handle that.
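To illustrate, this is roughly the shape of what the plugin generates today (a sketch, not the actual code): one ControlNetApply per control layer, chained to arbitrary length, which a fixed "from Krita" node would have to expose as a list-valued output that base ComfyUI doesn't have:

w = ComfyWorkflow()
# ... checkpoint loading and prompt encoding as in the earlier example ...
for control in controls:  # list of arbitrary length coming from Krita
    control_net = w.add("ControlNetLoader", 1, control_net_name=control.model)
    positive = w.add("ControlNetApply", 1, conditioning=positive,
                     control_net=control_net, image=control.image,
                     strength=control.strength)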