wongfei / ue4-mediapipe-plugin

UE4 MediaPipe plugin

Some problems in practical use #4

Closed: salier closed this issue 3 years ago

salier commented 3 years ago

Visibility and presence seem to mean the same thing; I don't understand the difference between the two. (image)

Pose landmarks only have a position vector and no rotation (Euler angles), which makes driving a model inefficient. (image) I tried to compute the rotations mathematically, but the result is not ideal. (image)

Since last year I have wanted to port MediaPipe to Unreal, but my code is poor and I spend most of my time in Blueprints. Thank you very much for making this plugin; I want to help make it better.

salier commented 3 years ago

Another problem concerns the face. I can get the coordinates of more than 400 landmark points. According to the documentation there should be a way to map the mesh directly onto a model, but I did not see it exposed in Blueprints. It is also very difficult to use raw coordinates without rotation data directly (even with facial bones). I hope the plugin can drive the model directly.

wongfei commented 3 years ago

Visibility and presence seem to mean the same thing; I don't understand the difference between the two.

See https://github.com/google/mediapipe/blob/master/mediapipe/framework/formats/landmark.proto#L25

Pose landmarks only have a position vector and no rotation (Euler angles), which makes driving a model inefficient.

MediaPipe doesn't provide pose angles (https://google.github.io/mediapipe/solutions/pose.html#output). You have to compute them yourself or transform the bones using only positions in world/root space.

Also check Poseable Mesh Component / Set Bone Transform by Name (https://docs.unrealengine.com/4.26/en-US/BlueprintAPI/Components/PoseableMesh/SetBoneTransformbyName/); that way you can drive avatar bones directly from a Blueprint.
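As a rough illustration of that idea (this is a minimal sketch, not the plugin's API), you could derive a bone rotation from two landmark positions and apply it with UPoseableMeshComponent::SetBoneTransformByName. The bone name "upperarm_l" and the assumption that the landmark positions are already converted to UE world space are placeholders.

// Minimal sketch, not plugin code: aim a bone from one landmark toward another.
// Assumes ShoulderWorld/ElbowWorld were already converted to UE world space.
#include "Components/PoseableMeshComponent.h"

void ApplyBoneFromLandmarks(UPoseableMeshComponent* Mesh,
                            const FVector& ShoulderWorld,
                            const FVector& ElbowWorld)
{
    if (!Mesh) return;

    // Rotation whose X axis points from the shoulder landmark to the elbow landmark.
    const FVector Dir = (ElbowWorld - ShoulderWorld).GetSafeNormal();
    if (Dir.IsNearlyZero()) return;
    const FQuat BoneRot = FRotationMatrix::MakeFromX(Dir).ToQuat();

    // Keep the current translation/scale, replace only the rotation.
    // "upperarm_l" is a hypothetical bone name; use your skeleton's naming.
    FTransform Xform = Mesh->GetBoneTransformByName(TEXT("upperarm_l"), EBoneSpaces::WorldSpace);
    Xform.SetRotation(BoneRot);
    Mesh->SetBoneTransformByName(TEXT("upperarm_l"), Xform, EBoneSpaces::WorldSpace);
}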

I hope the plugin can drive the model directly

Driving a model directly from motion capture is a complex problem (including calibration, skeleton and face retargeting). Even pro-grade software struggles with it. Unless somebody releases an open-source retargeting library, it's not coming here.

salier commented 3 years ago

Thank you very much for your reply. (image) I rechecked this part of the documentation. Indeed, it only has visibility; there is no presence. I don't know what presence is for.

Maybe you misunderstood me. Face Mesh contains the coordinates of 468 points. I can retarget it in a pose-like manner (but I would need to correctly apply 468 facial bones to each model, which is a big project and would make the workflow more troublesome). I saw a face mesh rendering mode where a similar effect is achieved by stretching the face skin directly; I mean, could the plugin expose that function? (image)

As you can see, I'm doing data processing to drive the model. This is what you call the retargeting workflow. I didn't mean to complain (my English is not good; maybe I misunderstood or wrote something wrong). I have done retargeting and data processing for Kinect and Azure Kinect before; if you are interested, we can talk. I'll update my progress here.

wongfei commented 3 years ago

https://github.com/google/mediapipe/blob/master/mediapipe/framework/formats/landmark.proto#L25

message Landmark {
  optional float x = 1;
  optional float y = 2;
  optional float z = 3;

  // Landmark visibility. Should stay unset if not supported.
  // Float score of whether landmark is visible or occluded by other objects.
  // Landmark considered as invisible also if it is not present on the screen
  // (out of scene bounds). Depending on the model, visibility value is either a
  // sigmoid or an argument of sigmoid.
  optional float visibility = 4;

  // Landmark presence. Should stay unset if not supported.
  // Float score of whether landmark is present on the scene (located within
  // scene bounds). Depending on the model, presence value is either a result of
  // sigmoid or an argument of sigmoid function to get landmark presence
  // probability.
  optional float presence = 5;
}

Most MediaPipe models in the current implementation provide only the visibility property.

Face Mesh contains the coordinates of 468 points. I can retarget it in a pose-like manner (but I would need to correctly apply 468 facial bones to each model, which is a big project and would make the workflow more troublesome)

FaceMesh is just an alias for the FaceLandmarkFrontCpu + FaceGeometryFromLandmarks calculators. FaceLandmarkFrontCpu provides landmarks in camera (2D) space. FaceGeometryFromLandmarks converts the landmarks to a canonical face geometry/mesh in 3D metric space.

It can be accessed with MediaPipeFaceMeshObserverComponent/GetMesh: https://github.com/wongfei/ue4-mediapipe-plugin/blob/master/Plugins/MediaPipe/Source/MediaPipe/Public/MediaPipeFaceMeshObserverComponent.h#L53

That method is good for augmented reality (like overlaying goggles, makeup or face masks on the video stream) but hardly suitable for driving a 3D avatar (because of the proportion and geometry differences between the captured face and the avatar's face). To my knowledge, face retargeting is mostly done using blend shapes.
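To make the blend-shape idea concrete, here is a minimal sketch (not something the plugin provides): push an expression weight you estimate yourself from the landmarks onto a UE4 morph target. The morph target name "JawOpen" and the weight source are hypothetical.

// Minimal sketch: drive a morph target (blend shape) on a UE4 skeletal mesh.
// "JawOpen" is a hypothetical morph target name; MouthOpen01 would come from
// your own landmark-based estimate (e.g. normalized lip distance).
#include "Components/SkeletalMeshComponent.h"

void ApplyJawOpen(USkeletalMeshComponent* Mesh, float MouthOpen01)
{
    if (!Mesh) return;

    // Morph target weights are usually driven in the 0..1 range.
    const float Weight = FMath::Clamp(MouthOpen01, 0.f, 1.f);
    Mesh->SetMorphTarget(TEXT("JawOpen"), Weight);
}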

See paper: "Attention Mesh: High-fidelity Face Mesh Prediction in Real-time" https://arxiv.org/pdf/2006.10962.pdf

salier commented 3 years ago

Thank you for your reply. (image) I'm sorry, I can't understand the meaning of that code. Indeed, most face-capture-driven models use blend shapes, for example ARKit and ARCore. MediaPipe and ARCore are both Google projects; maybe ARCore could also be deployed in Unreal on Windows (or maybe MediaPipe also exposes blend shape weights). I used Unreal to build face capture with ARCore, but I can't get the blend shape weights, which also annoys me. It obviously has this function, but it can't be used.

salier commented 3 years ago

https://user-images.githubusercontent.com/28691601/127771405-75b50074-0a7b-4820-aaf8-da0b6e848ea7.mp4

Progress update

salier commented 3 years ago

I found that two graphs cannot be enabled in one scene; even if they are started in sequence, only the first one works. I tried putting multiple functions in one actor, but that failed. Is there a way to enable multiple modules at the same time? I want to run face tracking, eye tracking and finger tracking together with the 3D pose, because the pose from HolisticLandmarks only gives 2D coordinates.

salier commented 3 years ago

I also have a strange situation: only one project works on the whole computer. After copying, renaming or migrating the project, the selected camera is no longer available (and sometimes even if it shows as available, it will not start normally).

wongfei commented 3 years ago

Is there a way to enable multiple modules at the same time?

Holistic provides all possible landmarks:

node {
  calculator: "HolisticLandmarkCpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "POSE_LANDMARKS:pose_landmarks"
  output_stream: "WORLD_LANDMARKS:pose_world_landmarks"
  output_stream: "POSE_ROI:pose_roi"
  output_stream: "POSE_DETECTION:pose_detection"
  output_stream: "FACE_LANDMARKS:face_landmarks"
  output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
}

But if you need a custom pipeline you have to create a graph file with multiple outputs, see: MP_FaceLandmarksWithIris and ue4-mediapipe-plugin\Plugins\MediaPipe\ThirdParty\mediapipe\Data\mediapipe\unreal\face_landmarks_with_iris.pbtxt

salier commented 3 years ago

Thank you very much for your guidance. I tried to copy the inputs, outputs and nodes related to the 3D pose, but it does not work properly. I tried to add the 3D pose to the face-and-iris graph, but I failed. This is my attempt: pose.txt. This is my setup. (image) (image)

wongfei commented 3 years ago

This one combines all possible landmarks (body 2D + body 3D + hands + face + iris):

holistic_with_iris.txt

salier commented 3 years ago

Thank you very much for your method; it is working normally now. I have another question about the world coordinates of the pose. Right now all coordinates are relative to the pelvis. I see two distinct world points mentioned in the documentation that are not among the 33 pose landmarks; they could be used to judge the direction and distance between the person and the camera. Is this available in the plugin? I could not find the relevant data. I thought CameraOrigin was this data, but it turns out it is set manually.

wongfei commented 3 years ago

There is a big difference between POSE_LANDMARKS and POSE_WORLD_LANDMARKS:

POSE_LANDMARKS provides normalized [0..1] x, y in camera space and a fake z/depth (from the Holistic documentation: it should be discarded, as the model is currently not fully trained to predict depth).

POSE_WORLD_LANDMARKS provides metric x, y, z with the origin at the center between the hips (midpoint of landmarks 23/24).

The plugin just transfers that data from MediaPipe to UE4 with an optional axis swap and scale.
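For reference, such a conversion could look like the sketch below. The axis convention assumed for the MediaPipe world landmarks (meters, x right, y down, z toward the camera) and the specific swap shown are illustrative assumptions; in the plugin the axis mapping and scale are configurable, and this is not its actual code.

// Minimal sketch, not the plugin's implementation: map a MediaPipe pose world
// landmark (assumed meters, x right / y down / z toward camera, origin at the
// hip midpoint) into UE4 space (centimeters, X forward / Y right / Z up).
#include "CoreMinimal.h"

FVector MediaPipeWorldToUnreal(float X, float Y, float Z, float UnitScale = 100.f)
{
    // One possible axis swap: MediaPipe z -> UE X (forward), x -> UE Y (right),
    // -y -> UE Z (up). UnitScale converts meters to centimeters.
    return FVector(Z, X, -Y) * UnitScale;
}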

salier commented 3 years ago

I see. Thank you for your answer. It seems that an external module is needed to detect the relative position.

salier commented 3 years ago

I'm sorry to ask you more questions. I'm using your holistic-with-iris graph for retargeting. However, landmarks 23 and 24 of the 3D pose come out as 0, 0, 0. I can see the movement on the debug screen, but the data is zero. After comparing with the standalone module, I found it is not a problem with the module; I don't know where the problem is. (image) This is my setup. (image) This is the output data: 0, 0, 0, although it otherwise works normally. (image) This is a single 3D pose from the demo, with the same actions and the same settings. (I modified the axes for retargeting, but the problem was there after I modified them too.)

wongfei commented 3 years ago

Set MinVisibility and MinPresence to 0. Use them with care, because the plugin will drop any landmark whose visibility/presence is lower than these properties.
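Conceptually the thresholding behaves like the sketch below; this is only an illustration of the rule described above, not the plugin's code, and FDemoLandmark is a hypothetical stand-in for the plugin's landmark type.

// Illustrative only: a landmark is kept when both scores reach the configured minimums.
struct FDemoLandmark { float X, Y, Z, Visibility, Presence; };

bool PassesThresholds(const FDemoLandmark& L, float MinVisibility, float MinPresence)
{
    // With MinVisibility = MinPresence = 0 nothing is ever dropped.
    return L.Visibility >= MinVisibility && L.Presence >= MinPresence;
}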

salier commented 3 years ago

(image) (image) That shouldn't be the problem. I looked at the debug point view and found that those two points do not tilt. (image)

wongfei commented 3 years ago

I don't know; maybe it's a MediaPipe bug/feature with a specific camera/image. It works as intended on https://github.com/digital-standard/ThreeDPoseTracker/blob/master/SampleVideo/onegai_darling.mp4

(image: x1)

(image: x2)

salier commented 3 years ago

(image) I copied the pbtxt again and called it again, and found I could get the data again, but there was an error; maybe there was invalid data in the array. It seems to be a strange bug. I have swapped the camera, so it is not a camera error; the problem should be in the pbtxt. (image) The data is sometimes valid and sometimes invalid. I'll keep trying.

salier commented 3 years ago

Meanwhile, the Blueprint using the original pbtxt still gets no data.

salier commented 3 years ago

It may also be a problem with the actor. I copied the actor and found that there are differences.

salier commented 3 years ago

(image) I only succeeded once (and there was an error, and the data was invalid). Then I copied it several times without changing anything and there was still no data, along with an error and an invalid-data warning. (image)

salier commented 3 years ago

I found the bug, sorry. I had created an array to import the data for conversion, and copying it multiple times caused problems when passing the data along. I'm sorry to have disturbed you; it was my mistake.

aquibrash87 commented 2 years ago

@salier did you make progress toward your goal? It is somewhat close to my own goals and I would appreciate your input. Thanks