MoonMoon82 opened 1 year ago
@BlenderNeko Maybe you have an idea how this "reference_only" preprocessor could work in ComfyUI?
Reference only is way more involved, as it is technically not a controlnet and would require changes to the unet code. There has been some talk and thought about implementing it in comfy, but so far the consensus has been to wait at least a bit for the reference_only implementation in the cnet repo to stabilize, or to find some source that clearly explains what they are doing and why.
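To illustrate the kind of unet change involved: a controlnet is a separate network that adds residuals, whereas reference only has to rewrite the unet's own self-attention, which ComfyUI exposes as attn1 patches on a cloned ModelPatcher. A minimal sketch of that hook (assuming `model` is the MODEL output of a checkpoint loader; the patch body here is just a placeholder):

def add_attn1_hook(model):
    patched = model.clone()  # patch a clone so the original model stays untouched

    def attn1_patch(q, k, v, extra_options):
        # identity for now; a real implementation would inject the reference's
        # tokens into the keys/values so generated samples can attend to them
        return q, k, v

    patched.set_model_attn1_patch(attn1_patch)
    return patched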
It's likely that we'd see an implementation of this before any kind of reference only support, simply because it's easier to implement. Perhaps that could partly fill a similar role.
+1 for the request to have controlnet reference.
I was trying to make emotion cards for SillyTavern. In ComfyUI that should have been a doddle: set up a workflow that generates an image, then uses that image to create 26 more, each with a different emotion but the same person. Without reference only, though, that's just not doable.
Guess it's not going to be implemented. https://desuarchive.org/g/thread/94223958/#94225957
Here's a simple node for it; if it works fine I'll put it somewhere more visible. Download and save reference_only.py to your custom_nodes folder: https://gist.github.com/comfyanonymous/343e5675f9a2c8281fde0c440df2e2c6
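The core of it is such an attn1 patch on a cloned model. Roughly, as a simplified sketch (the gist is the authoritative version): the reference latent is run through the unet in the same batch as the latents being generated, and the patch lets every generated sample attend to the reference sample's queries as extra keys/values.

def make_reference_model(model, batch):
    # batch = number of reference + generated samples run through the unet together
    patched = model.clone()

    def reference_apply(q, k, v, extra_options):
        k = k.clone().repeat(1, 2, 1)  # double the key sequence to make room for injected tokens
        for o in range(0, q.shape[0], batch):
            for x in range(1, batch):
                # sample x also attends to the reference sample's queries (batch index o)
                k[x + o, q.shape[1]:] = q[o, :]
        return q, k, k  # the injected half reuses the keys as values

    patched.set_model_attn1_patch(reference_apply)
    return patched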
To use it, copy the JSON below and Ctrl-V it into the ComfyUI window to load the workflow:
{
"last_node_id": 15,
"last_link_id": 37,
"nodes": [
{
"id": 8,
"type": "VAEDecode",
"pos": [
1209,
188
],
"size": {
"0": 210,
"1": 46
},
"flags": {},
"order": 8,
"mode": 0,
"inputs": [
{
"name": "samples",
"type": "LATENT",
"link": 7
},
{
"name": "vae",
"type": "VAE",
"link": 8
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
9
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAEDecode"
}
},
{
"id": 6,
"type": "CLIPTextEncode",
"pos": [
233,
117
],
"size": {
"0": 422.84503173828125,
"1": 164.31304931640625
},
"flags": {},
"order": 2,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 3
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
4
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"crude drawing of girl"
]
},
{
"id": 7,
"type": "CLIPTextEncode",
"pos": [
237,
370
],
"size": {
"0": 425.27801513671875,
"1": 180.6060791015625
},
"flags": {},
"order": 3,
"mode": 0,
"inputs": [
{
"name": "clip",
"type": "CLIP",
"link": 5
}
],
"outputs": [
{
"name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
6
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "CLIPTextEncode"
},
"widgets_values": [
"text, watermark"
]
},
{
"id": 3,
"type": "KSampler",
"pos": [
863,
186
],
"size": {
"0": 315,
"1": 262
},
"flags": {},
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 37
},
{
"name": "positive",
"type": "CONDITIONING",
"link": 4
},
{
"name": "negative",
"type": "CONDITIONING",
"link": 6
},
{
"name": "latent_image",
"type": "LATENT",
"link": 34
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
7
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "KSampler"
},
"widgets_values": [
719286772344905,
"fixed",
20,
8,
"euler",
"normal",
1
]
},
{
"id": 9,
"type": "SaveImage",
"pos": [
1548,
180
],
"size": [
1454.6668601568254,
548.2885143635223
],
"flags": {},
"order": 9,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 9
}
],
"properties": {},
"widgets_values": [
"refer/ComfyUI"
]
},
{
"id": 4,
"type": "CheckpointLoaderSimple",
"pos": [
-563,
510
],
"size": {
"0": 315,
"1": 98
},
"flags": {},
"order": 0,
"mode": 0,
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
32
],
"slot_index": 0
},
{
"name": "CLIP",
"type": "CLIP",
"links": [
3,
5
],
"slot_index": 1
},
{
"name": "VAE",
"type": "VAE",
"links": [
8,
20
],
"slot_index": 2
}
],
"properties": {
"Node name for S&R": "CheckpointLoaderSimple"
},
"widgets_values": [
"sd_xl_1.0.safetensors"
]
},
{
"id": 14,
"type": "ImageScale",
"pos": [
-129,
763
],
"size": {
"0": 315,
"1": 130
},
"flags": {},
"order": 4,
"mode": 0,
"inputs": [
{
"name": "image",
"type": "IMAGE",
"link": 19
}
],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
18
],
"shape": 3,
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "ImageScale"
},
"widgets_values": [
"nearest-exact",
768,
768,
"center"
]
},
{
"id": 13,
"type": "LoadImage",
"pos": [
-483,
777
],
"size": {
"0": 315,
"1": 314
},
"flags": {},
"order": 1,
"mode": 0,
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
19
],
"shape": 3,
"slot_index": 0
},
{
"name": "MASK",
"type": "MASK",
"links": null,
"shape": 3
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"example.png",
"image"
]
},
{
"id": 15,
"type": "ReferenceOnlySimple",
"pos": [
515,
675
],
"size": {
"0": 315,
"1": 78
},
"flags": {},
"order": 6,
"mode": 0,
"inputs": [
{
"name": "model",
"type": "MODEL",
"link": 32,
"slot_index": 0
},
{
"name": "reference",
"type": "LATENT",
"link": 35
}
],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
37
],
"shape": 3,
"slot_index": 0
},
{
"name": "LATENT",
"type": "LATENT",
"links": [
34
],
"shape": 3,
"slot_index": 1
}
],
"properties": {
"Node name for S&R": "ReferenceOnlySimple"
},
"widgets_values": [
2
]
},
{
"id": 12,
"type": "VAEEncode",
"pos": [
248,
732
],
"size": {
"0": 210,
"1": 46
},
"flags": {},
"order": 5,
"mode": 0,
"inputs": [
{
"name": "pixels",
"type": "IMAGE",
"link": 18,
"slot_index": 0
},
{
"name": "vae",
"type": "VAE",
"link": 20,
"slot_index": 1
}
],
"outputs": [
{
"name": "LATENT",
"type": "LATENT",
"links": [
35
],
"shape": 3,
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAEEncode"
}
}
],
"links": [
[
3,
4,
1,
6,
0,
"CLIP"
],
[
4,
6,
0,
3,
1,
"CONDITIONING"
],
[
5,
4,
1,
7,
0,
"CLIP"
],
[
6,
7,
0,
3,
2,
"CONDITIONING"
],
[
7,
3,
0,
8,
0,
"LATENT"
],
[
8,
4,
2,
8,
1,
"VAE"
],
[
9,
8,
0,
9,
0,
"IMAGE"
],
[
18,
14,
0,
12,
0,
"IMAGE"
],
[
19,
13,
0,
14,
0,
"IMAGE"
],
[
20,
4,
2,
12,
1,
"VAE"
],
[
32,
4,
0,
15,
0,
"MODEL"
],
[
34,
15,
1,
3,
3,
"LATENT"
],
[
35,
12,
0,
15,
1,
"LATENT"
],
[
37,
15,
0,
3,
0,
"MODEL"
]
],
"groups": [],
"config": {},
"extra": {},
"version": 0.4
}
I put it in this repo: https://github.com/comfyanonymous/ComfyUI_experiments
Hello, I'm trying to install reference_only but I get this error:
"title":"ComfyUI_experiments/reference_only.py at master · comfyanonymous/ComfyUI_experiments","locale":"en"} NameError: name 'true' is not defined
Cannot import C:\Users\GABSU\comfyui\ComfyUI\custom_nodes\reference_only.py module for custom nodes: name 'true' is not defined
Any idea why this is happening? I've added reference_only.py to the custom_nodes folder. Thank you so much for your work :)
Hey @comfyanonymous, the installation and the workflow work well for me, but the results are pretty bad. Could you share a workflow example that works well for you? I'm far from the results shown in the video.
Wow! It works perfectly for me. Is it possible to use two reference inputs? To process video frames, for example.
@addddd2 I opened a feature request some weeks ago to get something like img2img with the reference_only node: https://github.com/comfyanonymous/ComfyUI_experiments/issues/5
I already tried it on my own, but I guess this kind of img2img does not work that way: the results looked more like the input image than the reference image. Maybe @comfyanonymous could say something more about it...
Sorry for this code, I did the best I could; I hope the author does it properly.
It takes as input two reference latents plus one latent intended for img2img:
--- reference_only.py	2023-07-26 22:24:24.000000000 +0300
+++ reference_only3.py	2023-08-25 00:51:27.233217800 +0300
@@ -1,10 +1,12 @@
 import torch
 
-class ReferenceOnlySimple:
+class ReferenceOnlySimple3:
     @classmethod
     def INPUT_TYPES(s):
         return {"required": { "model": ("MODEL",),
                               "reference": ("LATENT",),
+                              "reference2": ("LATENT",),
+                              "input": ("LATENT",),
                               "batch_size": ("INT", {"default": 1, "min": 1, "max": 64})
                               }}
 
@@ -13,28 +15,31 @@
 
     CATEGORY = "custom_node_experiments"
 
-    def reference_only(self, model, reference, batch_size):
+    def reference_only(self, model, reference, reference2, input, batch_size):
         model_reference = model.clone()
         size_latent = list(reference["samples"].shape)
         size_latent[0] = batch_size
-        latent = {}
-        latent["samples"] = torch.zeros(size_latent)
+        latent = input
 
-        batch = latent["samples"].shape[0] + reference["samples"].shape[0]
+        batch = latent["samples"].shape[0] + reference["samples"].shape[0] + reference2["samples"].shape[0]
+
+
         def reference_apply(q, k, v, extra_options):
             k = k.clone().repeat(1, 2, 1)
             offset = 0
             if q.shape[0] > batch:
                 offset = batch
 
+
+            re = extra_options["transformer_index"] % 2
             for o in range(0, q.shape[0], batch):
                 for x in range(1, batch):
-                    k[x + o, q.shape[1]:] = q[o,:]
+                    k[x + o, q.shape[1]:] = q[o + re,:]
 
             return q, k, k
 
         model_reference.set_model_attn1_patch(reference_apply)
-        out_latent = torch.cat((reference["samples"], latent["samples"]))
+        out_latent = torch.cat((reference["samples"], reference2["samples"], latent["samples"]))
         if "noise_mask" in latent:
             mask = latent["noise_mask"]
         else:
@@ -47,8 +52,8 @@
             mask = mask.repeat(latent["samples"].shape[0], 1, 1)
 
         out_mask = torch.zeros((1,mask.shape[1],mask.shape[2]), dtype=torch.float32, device="cpu")
-        return (model_reference, {"samples": out_latent, "noise_mask": torch.cat((out_mask, mask))})
+        return (model_reference, {"samples": out_latent, "noise_mask": torch.cat((out_mask,out_mask, mask))})
 
 NODE_CLASS_MAPPINGS = {
-    "ReferenceOnlySimple": ReferenceOnlySimple,
+    "ReferenceOnlySimple3": ReferenceOnlySimple3,
 }
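To spell out what the patch does (as far as I can tell): the combined batch is laid out as [reference, reference2, img2img latent], and re = extra_options["transformer_index"] % 2 makes the injection alternate between the two references from one transformer block to the next, roughly:

# hypothetical block indices, just to show the alternation
for transformer_index in range(4):
    re = transformer_index % 2
    source = "reference" if re == 0 else "reference2"
    print(f"transformer block {transformer_index} injects queries from {source}")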
Can you show your workflow? Somehow it is not working very well for me.
@ntdviet the workflows are here: https://github.com/comfyanonymous/ComfyUI_experiments/issues/5
I made a custom node that supports reference only and reference only + AdaIN, and it can also adjust the style strength.
It's a diffusers-based custom node, so it is used differently from Comfy's KSampler-based one.
https://github.com/Jannchie/ComfyUI-J https://civitai.com/models/361265/comfyui-j-diffusers-based-pipeline-nodes
Looks epic, can't wait for LoRA support.
This is the best reference_only for ComfyUI; now waiting for SDXL support.
There is a new ControlNet feature called "reference_only" which seems to be a preprocessor that works without any controlnet model. Please add this feature to the controlnet nodes.
Kind regards
https://www.youtube.com/watch?v=tBwmbTwMxfQ