lazniak / Head-Orientation-Node-for-ComfyUI---by-PabloGFX

Head Orientation Node for ComfyUI: Analyze and sort images based on facial orientation using MediaPipe. This custom node detects facial landmarks, calculates head pose, and intelligently sorts images for enhanced AI image processing workflows.
Apache License 2.0
3 stars, 1 fork

Request for an output of the rotation angle #3

Open vmetternich opened 1 week ago

vmetternich commented 1 week ago

Hey there, thanks for sharing your node. I hope this reaches you well and that it's OK to post this here in the issues section. Would it be possible to retrieve the rotation angle and expose it as an output?

Background: Most trained character (face) LoRAs (and models) have been trained on relatively straight faces, as in portraits.

So my thought was to rotate the image according to the face rotation so that it's easy to apply the LoRA in an inpainting step or in a face detailer, and then rotate the image back. The node would take the image (batch) as input and return the angle (float) as output. If nothing can be detected, a default of "0.0" would be good enough, I guess.

As I don't have much Python experience and no clue about the libraries used, and since you have already dived fairly deep into the values, could you make a node that measures such a rotation angle and outputs it as a float that could be used to transform the image? Maybe I'm asking for too much, but I think that would be a really cool addition.
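For readers following the request, here is a rough sketch of one way an in-plane rotation angle could be estimated. It is not the method this node uses. It assumes the two eye centers are already available as pixel coordinates (for instance from a landmark detector) and measures the tilt of the line between them. The function name `roll_angle_degrees` and its inputs are hypothetical; the `0.0` fallback mirrors the default suggested above.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels, y grows downward


def roll_angle_degrees(left_eye: Optional[Point],
                       right_eye: Optional[Point]) -> float:
    """Tilt of the eye line in degrees; 0.0 when no face was detected.

    `left_eye` is the eye that appears on the left side of the image.
    A positive result means the right-hand eye sits lower in the image.
    """
    if left_eye is None or right_eye is None:
        return 0.0  # nothing detected: fall back to "no rotation"
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    if dx == 0 and dy == 0:
        return 0.0  # degenerate input: both points coincide
    return math.degrees(math.atan2(dy, dx))


if __name__ == "__main__":
    print(roll_angle_degrees((100.0, 100.0), (200.0, 100.0)))  # level eyes
    print(roll_angle_degrees(None, None))                      # no detection
```

Whether the image then has to be rotated by this angle or by its negative depends on the sign convention of the rotation node, so that is worth checking on a test image first.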

lazniak commented 3 days ago

Sure. Could you give me a slightly more precise specification of the output you would expect? Could it be a string, for example in this form:

[X,Y,Z]

For example, for each image in the batch you would get something like this in the output:

[0.0,0.0,0.0] when the face is turned directly toward the camera.

I don't remember exactly which axis is which, so I'll have to verify that, but a deviation to the right or left will give a positive or negative float. Because of MediaPipe's detection limits on very strongly tilted faces, the measurement may be inaccurate at the extremes of the tilt range. I assume that within about 45 degrees of left-right tilt, the measurement should be relatively correct and precise.

If you put more than one face photo into the input, you would get a multiline output string:

[X,Y,Z]
[X,Y,Z]
[X,Y,Z]
[X,Y,Z]
....

Would this be ok for you?
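As an illustration of how the proposed string could be consumed downstream, here is a minimal sketch that turns `[X,Y,Z]` entries into floats and picks one axis per image. It assumes the format exactly as described above; the helper names `parse_orientations` and `angles_for_axis` are hypothetical and not part of the node. Since it has not been confirmed which index corresponds to which axis, the axis index is left as a parameter.

```python
import re
from typing import List, Tuple

Triple = Tuple[float, float, float]

# Matches the text between one pair of square brackets.
_BRACKETED = re.compile(r"\[([^\[\]]*)\]")


def parse_orientations(text: str) -> List[Triple]:
    """Parse every "[X,Y,Z]" entry in `text`, one triple per image.

    Entries that are malformed become (0.0, 0.0, 0.0), following the
    "0.0 when nothing is detected" default discussed in this thread.
    """
    triples: List[Triple] = []
    for match in _BRACKETED.finditer(text):
        parts = match.group(1).split(",")
        try:
            values = [float(p) for p in parts]
        except ValueError:
            values = []
        if len(values) == 3:
            triples.append((values[0], values[1], values[2]))
        else:
            triples.append((0.0, 0.0, 0.0))
    return triples


def angles_for_axis(text: str, axis: int) -> List[float]:
    """Return one angle per image for the chosen axis index (0, 1 or 2)."""
    return [triple[axis] for triple in parse_orientations(text)]


if __name__ == "__main__":
    sample = "[0.0,0.0,0.0]\n[12.5,-3.0,7.25]"
    print(parse_orientations(sample))
    print(angles_for_axis(sample, 2))
```

Using a regular expression instead of splitting on newlines means the same helper works whether the entries are separated by line breaks or by spaces.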

lazniak commented 3 days ago

OK, I've implemented the output with the data in string format:

[image]

vmetternich commented 1 day ago

Wow, that is great :) Thanks for adding that. I'll try it out; I think I can work with that. Since I want to rotate the whole image with a transform node, I need to figure out and filter out the right value. Can't wait to try it out! Thank god it's Friday :) Thanks again! Great work!