kijai / ComfyUI-LivePortraitKJ

ComfyUI nodes for LivePortrait
MIT License

Develop Branch: Face index error when no face detected and index error when retargeting beyond index 0 #68

Open bigandtallrecords opened 1 month ago

bigandtallrecords commented 1 month ago

Hello,

I'm getting this error when setting up two LivePortraitCropper nodes, one with index set to 0 and the other to 1. Obviously it throws an error when a second face isn't detected, but I'd love a workaround that lets it keep running even when a face isn't found, maybe a "skip if not detected" option. That way, when going through many frames across many scenes where the number of characters changes often, it wouldn't stop every time the index loses track. I'm not sure if there's currently a way to do this with a conditional, e.g. if no extra faces are detected, always return 0, so that it could continue. I've attached some screenshots of my setup trying to get two cropper nodes to work simultaneously with only one face being detected (two faces appear later in the frame count).
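A minimal sketch of the "always return 0 if the extra face isn't detected" idea, assuming the cropper has access to the list of detected faces; `pick_face` is a hypothetical helper, not part of the repo:

```python
# Hypothetical fallback helper: if the requested face index isn't
# available, fall back to face 0; if no face is detected at all,
# return None so the caller can skip the frame instead of crashing.
def pick_face(detected_faces, face_index):
    if not detected_faces:
        return None                        # no face: skip this frame
    if face_index >= len(detected_faces):
        return detected_faces[0]           # index lost: revert to 0
    return detected_faces[face_index]
```

With a policy like this, a frame where the second character has left the scene would silently reuse face 0 rather than raising `list index out of range`.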

So the issue is that as soon as eyes_retargetting and lips_retargetting are enabled, any index above 0 throws this error:

Error occurred when executing LivePortraitCropper:

list index out of range

```
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\Users\admin\pinokio\api\comfyui.git\app\custom_nodes\ComfyUI-LivePortraitKJ-develop\nodes.py", line 414, in process
    driving_crop_dict = self.cropper.crop_single_image(driving_images_np[i], dsize, scale, vy_ratio, vx_ratio, face_index, rotate)
File "C:\Users\admin\pinokio\api\comfyui.git\app\custom_nodes\ComfyUI-LivePortraitKJ-develop\liveportrait\utils\cropper.py", line 59, in crop_single_image
    src_face = src_face[face_index] # choose the index if multiple faces detected
```

and if the opt_driving_image isn't connected, then this error:

Error occurred when executing LivePortraitCropper:

list index out of range

```
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\Users\admin\pinokio\api\comfyui.git\app\execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\Users\admin\pinokio\api\comfyui.git\app\custom_nodes\ComfyUI-LivePortraitKJ-develop\nodes.py", line 408, in process
    crop_info = self.cropper.crop_single_image(source_image_np[i], dsize, scale, vy_ratio, vx_ratio, face_index, rotate)
File "C:\Users\admin\pinokio\api\comfyui.git\app\custom_nodes\ComfyUI-LivePortraitKJ-develop\liveportrait\utils\cropper.py", line 59, in crop_single_image
    src_face = src_face[face_index] # choose the index if multiple faces detected
```

So, as you can see from my test setup with multiple faces, multi-face detection itself is working, but as soon as it tries to retarget those faces the index breaks, or it requires the separate opt_driving_image input, which throws the index-out-of-range error.

I'm trying to change many faces in a single pass, but if I had to guess, there's a conflict when multiple cropper nodes are present: as soon as any of them tries to retarget, everything breaks. The index works with retargeting off, as you can see from the screenshots, but retargeting only works correctly with the index if there's a single cropper in the scene. Hope that's not too confusing.

Thanks for the help!

(Screenshots attached: Screenshot 2024-07-14 at 11 28 50 AM, 11 28 58 AM, 11 28 43 AM, 10 20 46 AM, 10 28 21 AM, 10 28 40 AM)
jerrydavos commented 1 month ago

Hello,

Yes, I have a similar problem. I was going to make the same "suggestion" thread, but then saw this one.


So, I am testing it on video-to-video lip swap. It works as expected when the faces in both the source and reference videos are visible for 100% of the duration. But if even a single frame has a hand covering the face, or the face turned away, it throws an error.

Maybe we can skip these "no_face_detected" frames without throwing an error and continue with the process? @kijai

kijai commented 1 month ago

If it's just one frame, sure, skipping or just re-using the previous one should work well enough. I'm not sure how to go about interpolating the keypoints; that could be an option as well, though.
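Re-using the previous frame's result could be sketched like this (a hedged sketch, assuming per-frame crop info is collected in a list with `None` wherever detection failed):

```python
# Fill detection gaps by carrying the last valid frame's crop forward.
# Frames before the first successful detection stay None and should be
# skipped by the caller.
def fill_missing(crops):
    last = None
    out = []
    for crop in crops:
        if crop is None:
            out.append(last)  # reuse previous result (may still be None)
        else:
            last = crop
            out.append(crop)
    return out
```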

jerrydavos commented 1 month ago

I have some idea for this...

For Face Detection:

1) From what I've observed while making face workflows, the bbox face YOLO models have the best detection success, say a 95% detection rate, even if the face is partially covered by some object, far or near.

2) MediaPipe Face has, say, a 60-70 percent success rate; it fails when faces are far away. But it has the option to segment elements of the face like eyes, mouth, etc.

So what I did is combine them: use the face detection from 1, crop the faces by their detection masks so we have close-ups, then apply MediaPipe Face to the crops and segment the lips and eyes.

By combining 1 and 2, the overall success rate of detecting and applying increased to about 80-90 percent. Example: https://www.reddit.com/r/StableDiffusion/comments/1b355tu/make_better_dialogues_with_this_new_lip_sync/ But the above is limited to videos that keep the same face and movement, unlike LivePortrait.
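The two-stage combination described above might be sketched like this; both detector callables (`detect_bboxes`, `segment_parts`) are hypothetical stand-ins for a YOLO-style bbox model and a MediaPipe-style segmenter, not real APIs:

```python
# Stage 1: a robust bbox detector locates faces in the full frame.
# Stage 2: a part segmenter runs on each close-up crop, where its
# hit rate is much better than on the full frame.
def two_stage_detect(frame, detect_bboxes, segment_parts):
    results = []
    for (x1, y1, x2, y2) in detect_bboxes(frame):
        crop = [row[x1:x2] for row in frame[y1:y2]]  # close-up of one face
        results.append({
            "bbox": (x1, y1, x2, y2),
            "parts": segment_parts(crop),  # e.g. eyes/lips masks, or None
        })
    return results
```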

Applied Solutions:

Say we have 100 frames (20 frames with a partial face, 5 frames with no face):

A: For heads that are turned away with no face detected, no LivePortrait is applied and those frames are skipped entirely. The same frames are skipped from the driving frames too, so everything stays properly synced.
B: For a partially visible face (only lips or only eyes), a "partial LivePortrait" is applied accordingly, like the MediaPipe options to isolate those elements.
C: For fully visible faces, LivePortrait is applied as usual.

So after the render, the final video will look like (C + B + A).mp4 [100 frames] = 75 frames (LivePortrait applied) + 20 frames (partial LivePortrait applied) + 5 frames (no LivePortrait applied).

That way it can run smoothly without detection errors.
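The A/B/C split above could be expressed as a per-frame classification; the boolean flags are assumed inputs from whatever detectors are in use:

```python
# Classify each frame per the A/B/C scheme above. "A" frames are
# skipped (in both source and driving streams), "B" frames get a
# partial retarget, "C" frames get the full LivePortrait pass.
def classify_frame(has_face, has_eyes, has_lips):
    if not has_face:
        return "A"  # no face detected: skip frame
    if has_eyes and has_lips:
        return "C"  # fully visible face: process as usual
    return "B"      # partial: retarget only the detected elements
```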

I hope it sparks some ideas


kijai commented 1 month ago

The problem for any partial solution is that the model probably expects a certain number of keypoints; I don't know if it can work with varying amounts. I have some ideas for that, and for trying other detection models besides insightface.

bigandtallrecords commented 1 month ago

> The problem for any partial solution is that the model probably expects a certain number of keypoints; I don't know if it can work with varying amounts. I have some ideas for that, and for trying other detection models besides insightface.

Hey @kijai, yes, that makes sense. If there were a way to skip the frame when the full set of keypoints isn't found, that could be useful. One idea I had is to have it work like the ReActor node pictured here, with the index. In this case, rather than the index being a separate node, combine the index functionality into a single node (always reverting to 0 if only a single face is detected).

Screenshot 2024-07-14 at 6 36 52 PM
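That index behavior might look something like this (hypothetical logic, not an existing node):

```python
# Hypothetical index resolution: with a single detected face, every
# requested index maps to 0; otherwise clamp to the valid range
# instead of raising IndexError.
def resolve_index(requested, num_faces):
    if num_faces <= 0:
        return None                        # nothing detected: skip frame
    if num_faces == 1:
        return 0                           # single face: always index 0
    return min(requested, num_faces - 1)   # clamp out-of-range indices
```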

A feature that would be cool: adding more face selections (like an index) would create more node inputs for separate driver inputs.

But I wanted to point out that the "index out of range" issue seems to be centered on the opt_driving_images input. Once it becomes active, the indexing of faces breaks entirely. This seems unrelated to face detection missing a frame; they are two different classes of crash.

I have yet to get two independent faces in the same frame working. It finds and detects all the faces fine AS LONG AS lip retargeting and eye retargeting remain false. As soon as an index is above 0 and eye retargeting and lip retargeting are true, that error is thrown. Just to clarify that the problem may be twofold.

Thanks for the workaround suggestions, I'll give these ideas a shot @jerrydavos

bigandtallrecords commented 1 month ago

@kijai one last follow-up to clarify. In this particular screenshot, the reason the detection is actually working is that eye and lip retargeting are set to false on all nodes. As soon as retargeting is enabled on any one of them, the error is thrown at that node. Each of these per-face node groups consists of a LivePortraitCropper and a LivePortraitProcess.

(screenshot attached)
kijai commented 1 month ago

Yeah, the eye/lip retargeting mode changes things up a bit, as it needs to process the driving video as well. It's one of the reasons I don't want to merge this to main yet; I think that mode needs to be in a separate node, at least partly.

bigandtallrecords commented 1 month ago

@kijai yeah, it seems like a lot is happening under the hood there. It's not all that hard to set up separately anyway. Thanks for the information, and good luck with development.