ltdrdata / ComfyUI-Impact-Pack

Custom nodes pack for ComfyUI. This custom node pack helps to conveniently enhance images through Detector, Detailer, Upscaler, Pipe, and more.
GNU General Public License v3.0

Trying to run face detailer on macOS results in assertion failure: 'total bytes of NDArray > 2**32' #376

Closed: kevingre closed this issue 9 months ago

kevingre commented 11 months ago

Using the workflow ComfyUI-Impact-Pack/workflow/detailer-for-animatediff.png results in the following error on the Mac if using an increased number of frames (e.g. 50):

In the file core.py on lines 70/71:

    refined_latent = \
        nodes.KSampler().sample(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise)[0]

sample gets called and results in the following assertion failure in MPS:

/AppleInternal/Library/BuildRoots/d615290d-668b-11ee-9734-0697ca55970a/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:761: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'

Any ideas on a work-around for this? Or the cause?

Thanks in advance!
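
For context, the assertion means Metal refused to allocate a single NDArray larger than 2**32 bytes (4 GiB). The latent batch itself is usually small; what crosses that line is typically some intermediate tensor created during sampling. A rough back-of-envelope sketch (the attention shape below is hypothetical, not measured from this workflow):

    import torch

    MPS_NDARRAY_LIMIT = 2 ** 32  # 4 GiB cap per single allocation on MPS

    def tensor_bytes(shape, dtype=torch.float32):
        # Bytes a dense tensor of this shape/dtype would occupy.
        n = 1
        for dim in shape:
            n *= dim
        return n * torch.tensor([], dtype=dtype).element_size()

    # Hypothetical intermediate: an attention map over a 96x64 latent (6144 tokens),
    # batched across 50 frames and 8 heads.
    attn = (50 * 8, 6144, 6144)
    print(f"{tensor_bytes(attn) / 2**30:.1f} GiB")    # ~56.2 GiB
    print(tensor_bytes(attn) > MPS_NDARRAY_LIMIT)     # True -> would trip the assertion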

kevingre commented 11 months ago

Not sure how the "sample" method works, but when I use 64 frames, I don't get the error, while at 50 frames I do. I haven't confirmed it, but it may only work with frame counts divisible by 8.

ltdrdata commented 11 months ago

> Not sure how the "sample" method works, but when I use 64 frames, I don't get the error, while at 50 frames I do. I haven't confirmed it, but it may only work with frame counts divisible by 8.

Check to see if any issues arise when you put it directly into KSampler without the Detailer. It looks like you're probably using AD, so this seems like an issue that should go to the AD repository.

kevingre commented 11 months ago

It works without issues when the Detailer flow is disabled: with just AD and no Detailer or SEGS paste, any number of frames is fine, as long as the Detailer isn't in the workflow. My first guess is that it has something to do with the workflow gathering up all the faces from each frame and putting them into one sampler to "detail" them in a consistent manner (if I'm understanding how this whole thing works - lol), i.e. maybe something to do with how it generates that image to detail. If I should be closing this and directing it to another repo, please let me know :)
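
Roughly, a hypothetical sketch of that idea (my own reading, not the actual SEGSDetailerForAnimateDiff code) would be: crop the same detected region out of every frame, encode the crops as one latent batch, refine them in a single sampling call so the face stays consistent, and paste the results back:

    import torch

    def detail_faces_across_frames(frames, crop_region, vae, sample_fn):
        """Hypothetical sketch: refine the same face region in every frame with one sampling call.

        frames: tensor [F, H, W, C] of the animation frames
        crop_region: (x1, y1, x2, y2) of the detected face, shared by all frames
        vae: object with encode()/decode() like ComfyUI's VAE wrapper
        sample_fn: callable that takes a latent dict and returns a refined latent dict
        """
        x1, y1, x2, y2 = crop_region
        # Crop the face region from every frame so the batch dimension == frame count.
        crops = frames[:, y1:y2, x1:x2, :]               # [F, h, w, C]
        latent = {"samples": vae.encode(crops)}          # one latent batch for all frames
        refined = sample_fn(latent)                      # single KSampler pass over F face crops
        refined_crops = vae.decode(refined["samples"])   # [F, h, w, C]
        out = frames.clone()
        out[:, y1:y2, x1:x2, :] = refined_crops          # paste the refined faces back
        return out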

ltdrdata commented 11 months ago

Add debug code like this, and show me the log (including the full stack trace).

        print(f"[DBG1] seed:{seed}, steps:{steps}, cfg:{cfg}, sampler_name:{sampler_name}, scheduler:{scheduler}, denoise:{denoise}")
        print(f"[DBG2] latent_image:{latent_image.shape}")
        refined_latent = \ nodes.KSampler().sample(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise)[0]
kevingre commented 11 months ago

Here's the result of the run. It looks like an error is thrown while trying to print the second debug line:

got prompt model_type EPS adm 0 Using split attention in VAE Working with z of shape (1, 4, 32, 32) = 4096 dimensions. Using split attention in VAE missing {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'} left over keys: dict_keys(['embedding_manager.embedder.transformer.text_model.embeddings.position_embedding.weight', 'embedding_manager.embedder.transformer.text_model.embeddings.position_ids', 'embedding_manager.embedder.transformer.text_model.embeddings.token_embedding.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.0.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.1.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.layer_norm1.bias', 
'embedding_manager.embedder.transformer.text_model.encoder.layers.10.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.10.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.11.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.mlp.fc2.weight', 
'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.2.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.3.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.v_proj.bias', 
'embedding_manager.embedder.transformer.text_model.encoder.layers.4.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.5.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.6.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.mlp.fc1.weight', 
'embedding_manager.embedder.transformer.text_model.encoder.layers.7.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.7.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.q_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.8.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.layer_norm1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.layer_norm1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.layer_norm2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.layer_norm2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.mlp.fc1.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.mlp.fc1.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.mlp.fc2.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.mlp.fc2.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.k_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.k_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.out_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.out_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.q_proj.bias', 
'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.q_proj.weight', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.v_proj.bias', 'embedding_manager.embedder.transformer.text_model.encoder.layers.9.self_attn.v_proj.weight', 'embedding_manager.embedder.transformer.text_model.final_layer_norm.bias', 'embedding_manager.embedder.transformer.text_model.final_layer_norm.weight', 'model_ema.decay', 'model_ema.num_updates', 'cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids'])
[AnimateDiffEvo] - INFO - Loading motion module mm_sd_v15_v2.ckpt
Requested to load SD1ClipModel
Loading 1 new model
[AnimateDiffEvo] - INFO - Sliding context window activated - latents passed in (50) greater than context_length 16.
[AnimateDiffEvo] - INFO - Using motion module mm_sd_v15_v2.ckpt version v2.
Requested to load BaseModel
Requested to load AnimateDiffModel
Loading 2 new models
100%|██████████| 32/32 [37:03<00:00, 69.49s/it]
Global Step: 840001
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
Leftover VAE keys ['model_ema.decay', 'model_ema.num_updates']
Requested to load AutoencoderKL
Loading 1 new model
Loads SAM model: /Users/kevin/ComfyUI/models/sams/sam_vit_b_01ec64.pth (device:AUTO)

0: 640x448 1 face, 121.6ms Speed: 1.4ms preprocess, 121.6ms inference, 0.6ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 115.7ms Speed: 1.3ms preprocess, 115.7ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 117.2ms Speed: 1.1ms preprocess, 117.2ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 114.4ms Speed: 1.0ms preprocess, 114.4ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.0ms Speed: 1.0ms preprocess, 112.0ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 116.1ms Speed: 1.6ms preprocess, 116.1ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 106.6ms Speed: 1.1ms preprocess, 106.6ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.8ms Speed: 1.0ms preprocess, 109.8ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 107.9ms Speed: 1.2ms preprocess, 107.9ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 111.0ms Speed: 1.0ms preprocess, 111.0ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 106.8ms Speed: 1.1ms preprocess, 106.8ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 104.6ms Speed: 1.0ms preprocess, 104.6ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 108.1ms Speed: 1.1ms preprocess, 108.1ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 114.4ms Speed: 1.3ms preprocess, 114.4ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.0ms Speed: 1.1ms preprocess, 109.0ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.3ms Speed: 1.1ms preprocess, 112.3ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 110.3ms Speed: 1.0ms preprocess, 110.3ms inference, 0.7ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 110.6ms Speed: 1.0ms preprocess, 110.6ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 111.5ms Speed: 1.0ms preprocess, 111.5ms inference, 1.7ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 114.8ms Speed: 1.0ms preprocess, 114.8ms inference, 0.7ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 116.6ms Speed: 1.2ms preprocess, 116.6ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 118.0ms Speed: 1.2ms preprocess, 118.0ms inference, 0.6ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.3ms Speed: 1.1ms preprocess, 112.3ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 113.6ms Speed: 1.1ms preprocess, 113.6ms inference, 0.7ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.5ms Speed: 1.0ms preprocess, 112.5ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 106.2ms Speed: 1.0ms preprocess, 106.2ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.1ms Speed: 1.0ms preprocess, 109.1ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 119.1ms Speed: 1.0ms preprocess, 119.1ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.3ms Speed: 1.0ms preprocess, 109.3ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.2ms Speed: 1.0ms preprocess, 109.2ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 102.2ms Speed: 1.0ms preprocess, 102.2ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 107.3ms Speed: 1.0ms preprocess, 107.3ms inference, 0.6ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.0ms Speed: 1.0ms preprocess, 109.0ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 108.5ms Speed: 1.0ms preprocess, 108.5ms inference, 1.5ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 103.2ms Speed: 1.3ms preprocess, 103.2ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 104.2ms Speed: 1.1ms preprocess, 104.2ms inference, 1.2ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 103.5ms Speed: 1.0ms preprocess, 103.5ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 108.0ms Speed: 1.3ms preprocess, 108.0ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.7ms Speed: 1.0ms preprocess, 112.7ms inference, 0.7ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 109.1ms Speed: 1.0ms preprocess, 109.1ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 111.2ms Speed: 1.1ms preprocess, 111.2ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 108.0ms Speed: 1.1ms preprocess, 108.0ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 107.3ms Speed: 1.1ms preprocess, 107.3ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 101.5ms Speed: 1.1ms preprocess, 101.5ms inference, 0.8ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 112.3ms Speed: 1.1ms preprocess, 112.3ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 110.1ms Speed: 1.1ms preprocess, 110.1ms inference, 0.8ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 111.0ms Speed: 1.0ms preprocess, 111.0ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 104.3ms Speed: 1.3ms preprocess, 104.3ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 106.4ms Speed: 1.0ms preprocess, 106.4ms inference, 1.1ms postprocess per image at shape (1, 3, 640, 448) semd to mps

0: 640x448 1 face, 106.9ms Speed: 1.1ms preprocess, 106.9ms inference, 0.4ms postprocess per image at shape (1, 3, 640, 448) semd to mps

# of Detected SEGS: 1

Detailer: segment upscale for ((371, 425)) | crop region (512, 768) x 1.0008322961447127 -> (512, 768)
[DBG1] seed:7, steps:32, cfg:9.0, sampler_name:dpmpp_2m, scheduler:karras, denoise:0.4
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
  File "/Users/kevin/ComfyUI/execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/kevin/ComfyUI/execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/kevin/ComfyUI/execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/kevin/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/segs_nodes.py", line 198, in doit
    segs = SEGSDetailerForAnimateDiff.do_detail(image_frames, segs, guide_size, guide_size_for, max_size, seed, steps, cfg, sampler_name,
  File "/Users/kevin/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/segs_nodes.py", line 179, in do_detail
    enhanced_image_tensor = core.enhance_detail_for_animatediff(cropped_image_frames, model, clip, vae, guide_size, guide_size_for, max_size,
  File "/Users/kevin/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/core.py", line 385, in enhance_detail_for_animatediff
    refined_latent = ksampler_wrapper(model, seed, steps, cfg, sampler_name, scheduler, positive, negative,
  File "/Users/kevin/ComfyUI/custom_nodes/ComfyUI-Impact-Pack/modules/impact/core.py", line 72, in ksampler_wrapper
    print(f"[DBG2] latent_image:{latent_image.shape}")
AttributeError: 'dict' object has no attribute 'shape'

Prompt executed in 2305.66 seconds

ltdrdata commented 11 months ago

Oh... my bad... DBG2 should be this:

print(f"[DBG2] latent_image:{latent_image['samples'].shape}")

kevingre commented 11 months ago

Here's the same run with the new debug statements:

[DBG1] seed:7, steps:32, cfg:9.0, sampler_name:dpmpp_2m, scheduler:karras, denoise:0.4
[DBG2] latent_image:torch.Size([50, 4, 96, 64])
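
For what it's worth, that latent is tiny compared with the MPS limit, so whatever tensor trips the assertion is presumably created further down inside sampling (an attention buffer or similar), not the latent batch itself. A quick check, assuming float32:

    # Reported latent: 50 frames x 4 channels x 96 x 64, float32 (4 bytes per element)
    latent_bytes = 50 * 4 * 96 * 64 * 4
    print(latent_bytes)           # 4915200 bytes, roughly 4.7 MiB
    print(latent_bytes > 2**32)   # False: far below the 4 GiB per-allocation cap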

kevingre commented 11 months ago

I'm doing a 50-frame animation, sometimes trying more, sometimes less. After doing a lot of testing, it seems to work SOME of the time, depending on how big the detected face is. E.g. sometimes it succeeds with frame counts as high as 64 and sometimes it fails with counts as low as 32. Here's the workflow: Txt2Vid_FaceDetailed_workflow.json

ltdrdata commented 11 months ago

> I'm doing a 50-frame animation, sometimes trying more, sometimes less. After doing a lot of testing, it seems to work SOME of the time, depending on how big the detected face is. E.g. sometimes it succeeds with frame counts as high as 64 and sometimes it fails with counts as low as 32. Here's the workflow: Txt2Vid_FaceDetailed_workflow.json

Create an Empty Latent with a batch size of 50 at 512x768 and try feeding it into KSampler directly, without the Detailer, using the same setup: seed:7, steps:32, cfg:9.0, sampler_name:dpmpp_2m, scheduler:karras, denoise:0.4.

My guess is that it is a resolution issue.
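
If it's easier to script that check than to wire nodes, something along these lines should reproduce it (a sketch against ComfyUI's built-in nodes, run from inside the ComfyUI environment; the checkpoint filename and prompts are placeholders):

    import nodes

    # Placeholder checkpoint under models/checkpoints/; swap in whatever you normally use.
    model, clip, vae = nodes.CheckpointLoaderSimple().load_checkpoint("v1-5-pruned-emaonly.ckpt")
    positive = nodes.CLIPTextEncode().encode(clip, "a portrait photo of a person")[0]
    negative = nodes.CLIPTextEncode().encode(clip, "low quality")[0]

    # 50-frame empty latent at 512x768, the same batch size / resolution the Detailer was feeding KSampler.
    latent = nodes.EmptyLatentImage().generate(width=512, height=768, batch_size=50)[0]

    refined = nodes.KSampler().sample(
        model, seed=7, steps=32, cfg=9.0,
        sampler_name="dpmpp_2m", scheduler="karras",
        positive=positive, negative=negative,
        latent_image=latent, denoise=0.4,
    )[0]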