endink opened 1 year ago
I also had a similar problem when I used this model: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/testdata/vision/face_landmarker_with_blendshapes.task
Using "https://storage.googleapis.com/mediapipe-assets/face_landmarker_with_blendshapes.task" as the task, results in the following error when I use detect. Uncaught (in promise) Error: WaitUntilIdle failed: $CalculatorGraph::Run() failed in Run: Calculator::Process() for node "mediapipe_tasks_vision_face_landmarker_facelandmarkergraphmediapipe_tasks_vision_face_landmarker_multifacelandmarksdetectorgraph__mediapipe_tasks_vision_face_landmarker_singlefacelandmarksdetectorgraphmediapipe_tasks_vision_face_landmarker_faceblendshapesgraph__SplitNormalizedLandmarkListCalculator" failed: RET_CHECK failure (third_party/mediapipe/calculators/core/split_proto_list_calculator.cc:126) ListSize(input) >= max_rangeend (468 vs. 478) Max range end 478 exceeds list size 468; WaitUntilIdle failed at O.handleErrors (tasks-vision.js:7781:15) at O.finishProcessing (tasks-vision.js:7769:47) at O.process (tasks-vision.js:7883:228) at O.processVideoData (tasks-vision.js:7860:10) at O.detectForVideo (tasks-vision.js:7948:40)
Hello @endink, @baronha, thank you both for the detailed observations. The white-line overlay looks like an incorrect landmark connections list and incorrect landmark IDs. We are reassigning the issue to the right owner to understand more about it. Thank you
Many thanks for your assistance @kuaashish
Hello @endink, @baronha, thanks for looking into the newly provided face blendshapes solution! The face blendshape graph is not ready for production use yet, and we will publish the documentation once it is ready. Could you provide the whole graph you use, i.e. the legacy graph with the face blendshape graph, so that we can take a look? Thanks!
@yichunk Hi, thanks for your reply, and sorry for the late response.
Here is my graph:
node {
calculator: "ImageTransformationCalculator"
input_stream: "IMAGE:input_video"
output_stream: "IMAGE:transformed_image"
input_side_packet: "FLIP_HORIZONTALLY:input_horizontally_flipped"
input_side_packet: "FLIP_VERTICALLY:input_vertically_flipped"
input_side_packet: "ROTATION_DEGREES:input_rotation"
}
node {
calculator: "ImagePropertiesCalculator"
input_stream: "IMAGE:transformed_image"
output_stream: "SIZE:__stream_0"
}
node {
calculator: "HolisticLandmarkCpu"
input_stream: "IMAGE:transformed_image"
output_stream: "FACE_LANDMARKS:face_landmarks"
output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
output_stream: "POSE_DETECTION:pose_detection"
output_stream: "POSE_LANDMARKS:pose_landmarks"
output_stream: "POSE_ROI:pose_roi"
output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
output_stream: "SEGMENTATION_MASK:segmentation_mask_rotated"
output_stream: "WORLD_LANDMARKS:pose_world_landmarks"
input_side_packet: "ENABLE_SEGMENTATION:enable_segmentation"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
input_side_packet: "REFINE_FACE_LANDMARKS:refine_face_landmarks"
input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
input_side_packet: "SMOOTH_SEGMENTATION:smooth_segmentation"
input_side_packet: "USE_PREV_LANDMARKS:use_prev_landmarks"
}
node {
calculator: "SplitNormalizedLandmarkListCalculator"
input_stream: "face_landmarks"
output_stream: "__stream_1"
options {
[mediapipe.SplitVectorCalculatorOptions.ext] {
ranges {
begin: 33
end: 34
}
ranges {
begin: 133
end: 134
}
combine_outputs: true
}
}
}
node {
calculator: "SplitNormalizedLandmarkListCalculator"
input_stream: "face_landmarks"
output_stream: "__stream_2"
options {
[mediapipe.SplitVectorCalculatorOptions.ext] {
ranges {
begin: 362
end: 363
}
ranges {
begin: 263
end: 264
}
combine_outputs: true
}
}
}
node {
calculator: "IrisLandmarkLeftAndRightCpu"
input_stream: "IMAGE:transformed_image"
input_stream: "LEFT_EYE_BOUNDARY_LANDMARKS:__stream_1"
input_stream: "RIGHT_EYE_BOUNDARY_LANDMARKS:__stream_2"
output_stream: "LEFT_EYE_CONTOUR_LANDMARKS:left_eye_contour_landmarks"
output_stream: "LEFT_EYE_IRIS_LANDMARKS:left_iris_landmarks"
output_stream: "LEFT_EYE_ROI:left_eye_rect"
output_stream: "RIGHT_EYE_CONTOUR_LANDMARKS:right_eye_contour_landmarks"
output_stream: "RIGHT_EYE_IRIS_LANDMARKS:right_iris_landmarks"
output_stream: "RIGHT_EYE_ROI:right_eye_rect"
}
node {
calculator: "ConcatenateNormalizedLandmarkListCalculator"
input_stream: "left_eye_contour_landmarks"
input_stream: "right_eye_contour_landmarks"
output_stream: "eye_landmarks"
}
node {
calculator: "UpdateFaceLandmarksCalculator"
input_stream: "FACE_LANDMARKS:face_landmarks"
input_stream: "NEW_EYE_LANDMARKS:eye_landmarks"
output_stream: "UPDATED_FACE_LANDMARKS:updated_face_landmarks"
}
node {
calculator: "ConcatenateNormalizedLandmarkListCalculator"
input_stream: "updated_face_landmarks"
input_stream: "left_iris_landmarks"
input_stream: "right_iris_landmarks"
output_stream: "__stream_3"
}
node {
calculator: "SplitNormalizedLandmarkListCalculator"
input_stream: "__stream_3"
output_stream: "__stream_4"
options {
[mediapipe.SplitVectorCalculatorOptions.ext] {
ranges {
begin: 0
end: 1
}
ranges {
begin: 1
end: 2
}
ranges {
begin: 4
end: 5
}
ranges {
begin: 5
end: 6
}
ranges {
begin: 6
end: 7
}
ranges {
begin: 7
end: 8
}
ranges {
begin: 8
end: 9
}
ranges {
begin: 10
end: 11
}
ranges {
begin: 13
end: 14
}
ranges {
begin: 14
end: 15
}
ranges {
begin: 17
end: 18
}
ranges {
begin: 21
end: 22
}
ranges {
begin: 33
end: 34
}
ranges {
begin: 37
end: 38
}
ranges {
begin: 39
end: 40
}
ranges {
begin: 40
end: 41
}
ranges {
begin: 46
end: 47
}
ranges {
begin: 52
end: 53
}
ranges {
begin: 53
end: 54
}
ranges {
begin: 54
end: 55
}
ranges {
begin: 55
end: 56
}
ranges {
begin: 58
end: 59
}
ranges {
begin: 61
end: 62
}
ranges {
begin: 63
end: 64
}
ranges {
begin: 65
end: 66
}
ranges {
begin: 66
end: 67
}
ranges {
begin: 67
end: 68
}
ranges {
begin: 70
end: 71
}
ranges {
begin: 78
end: 79
}
ranges {
begin: 80
end: 81
}
ranges {
begin: 81
end: 82
}
ranges {
begin: 82
end: 83
}
ranges {
begin: 84
end: 85
}
ranges {
begin: 87
end: 88
}
ranges {
begin: 88
end: 89
}
ranges {
begin: 91
end: 92
}
ranges {
begin: 93
end: 94
}
ranges {
begin: 95
end: 96
}
ranges {
begin: 103
end: 104
}
ranges {
begin: 105
end: 106
}
ranges {
begin: 107
end: 108
}
ranges {
begin: 109
end: 110
}
ranges {
begin: 127
end: 128
}
ranges {
begin: 132
end: 133
}
ranges {
begin: 133
end: 134
}
ranges {
begin: 136
end: 137
}
ranges {
begin: 144
end: 145
}
ranges {
begin: 145
end: 146
}
ranges {
begin: 146
end: 147
}
ranges {
begin: 148
end: 149
}
ranges {
begin: 149
end: 150
}
ranges {
begin: 150
end: 151
}
ranges {
begin: 152
end: 153
}
ranges {
begin: 153
end: 154
}
ranges {
begin: 154
end: 155
}
ranges {
begin: 155
end: 156
}
ranges {
begin: 157
end: 158
}
ranges {
begin: 158
end: 159
}
ranges {
begin: 159
end: 160
}
ranges {
begin: 160
end: 161
}
ranges {
begin: 161
end: 162
}
ranges {
begin: 162
end: 163
}
ranges {
begin: 163
end: 164
}
ranges {
begin: 168
end: 169
}
ranges {
begin: 172
end: 173
}
ranges {
begin: 173
end: 174
}
ranges {
begin: 176
end: 177
}
ranges {
begin: 178
end: 179
}
ranges {
begin: 181
end: 182
}
ranges {
begin: 185
end: 186
}
ranges {
begin: 191
end: 192
}
ranges {
begin: 195
end: 196
}
ranges {
begin: 197
end: 198
}
ranges {
begin: 234
end: 235
}
ranges {
begin: 246
end: 247
}
ranges {
begin: 249
end: 250
}
ranges {
begin: 251
end: 252
}
ranges {
begin: 263
end: 264
}
ranges {
begin: 267
end: 268
}
ranges {
begin: 269
end: 270
}
ranges {
begin: 270
end: 271
}
ranges {
begin: 276
end: 277
}
ranges {
begin: 282
end: 283
}
ranges {
begin: 283
end: 284
}
ranges {
begin: 284
end: 285
}
ranges {
begin: 285
end: 286
}
ranges {
begin: 288
end: 289
}
ranges {
begin: 291
end: 292
}
ranges {
begin: 293
end: 294
}
ranges {
begin: 295
end: 296
}
ranges {
begin: 296
end: 297
}
ranges {
begin: 297
end: 298
}
ranges {
begin: 300
end: 301
}
ranges {
begin: 308
end: 309
}
ranges {
begin: 310
end: 311
}
ranges {
begin: 311
end: 312
}
ranges {
begin: 312
end: 313
}
ranges {
begin: 314
end: 315
}
ranges {
begin: 317
end: 318
}
ranges {
begin: 318
end: 319
}
ranges {
begin: 321
end: 322
}
ranges {
begin: 323
end: 324
}
ranges {
begin: 324
end: 325
}
ranges {
begin: 332
end: 333
}
ranges {
begin: 334
end: 335
}
ranges {
begin: 336
end: 337
}
ranges {
begin: 338
end: 339
}
ranges {
begin: 356
end: 357
}
ranges {
begin: 361
end: 362
}
ranges {
begin: 362
end: 363
}
ranges {
begin: 365
end: 366
}
ranges {
begin: 373
end: 374
}
ranges {
begin: 374
end: 375
}
ranges {
begin: 375
end: 376
}
ranges {
begin: 377
end: 378
}
ranges {
begin: 378
end: 379
}
ranges {
begin: 379
end: 380
}
ranges {
begin: 380
end: 381
}
ranges {
begin: 381
end: 382
}
ranges {
begin: 382
end: 383
}
ranges {
begin: 384
end: 385
}
ranges {
begin: 385
end: 386
}
ranges {
begin: 386
end: 387
}
ranges {
begin: 387
end: 388
}
ranges {
begin: 388
end: 389
}
ranges {
begin: 389
end: 390
}
ranges {
begin: 390
end: 391
}
ranges {
begin: 397
end: 398
}
ranges {
begin: 398
end: 399
}
ranges {
begin: 400
end: 401
}
ranges {
begin: 402
end: 403
}
ranges {
begin: 405
end: 406
}
ranges {
begin: 409
end: 410
}
ranges {
begin: 415
end: 416
}
ranges {
begin: 454
end: 455
}
ranges {
begin: 466
end: 467
}
ranges {
begin: 468
end: 469
}
ranges {
begin: 469
end: 470
}
ranges {
begin: 470
end: 471
}
ranges {
begin: 471
end: 472
}
ranges {
begin: 472
end: 473
}
ranges {
begin: 473
end: 474
}
ranges {
begin: 474
end: 475
}
ranges {
begin: 475
end: 476
}
ranges {
begin: 476
end: 477
}
ranges {
begin: 477
end: 478
}
combine_outputs: true
}
}
}
node {
calculator: "LandmarksToTensorCalculator"
input_stream: "IMAGE_SIZE:__stream_0"
input_stream: "NORM_LANDMARKS:__stream_4"
output_stream: "TENSORS:face_tensor_in"
options {
[mediapipe.LandmarksToTensorCalculatorOptions.ext] {
attributes: X
attributes: Y
flatten: false
}
}
}
node {
calculator: "InferenceCalculatorCpu"
input_stream: "TENSORS:face_tensor_in"
output_stream: "TENSORS:face_tensor_out"
options {
[mediapipe.InferenceCalculatorOptions.ext] {
model_path: "./mediapipe/modules/face_landmark/face_blendshapes.tflite"
}
}
}
node {
calculator: "SplitTensorVectorCalculator"
input_stream: "face_tensor_out"
output_stream: "__stream_5"
options {
[mediapipe.SplitVectorCalculatorOptions.ext] {
ranges {
begin: 0
end: 1
}
combine_outputs: true
}
}
}
node {
calculator: "TensorsToClassificationCalculator"
input_stream: "TENSORS:__stream_5"
output_stream: "CLASSIFICATIONS:face_blendshapes"
options {
[mediapipe.TensorsToClassificationCalculatorOptions.ext] {
min_score_threshold: -1
top_k: 0
label_map {
entries {
id: 0
label: "_neutral"
}
entries {
id: 1
label: "browDownLeft"
}
entries {
id: 2
label: "browDownRight"
}
entries {
id: 3
label: "browInnerUp"
}
entries {
id: 4
label: "browOuterUpLeft"
}
entries {
id: 5
label: "browOuterUpRight"
}
entries {
id: 6
label: "cheekPuff"
}
entries {
id: 7
label: "cheekSquintLeft"
}
entries {
id: 8
label: "cheekSquintRight"
}
entries {
id: 9
label: "eyeBlinkLeft"
}
entries {
id: 10
label: "eyeBlinkRight"
}
entries {
id: 11
label: "eyeLookDownLeft"
}
entries {
id: 12
label: "eyeLookDownRight"
}
entries {
id: 13
label: "eyeLookInLeft"
}
entries {
id: 14
label: "eyeLookInRight"
}
entries {
id: 15
label: "eyeLookOutLeft"
}
entries {
id: 16
label: "eyeLookOutRight"
}
entries {
id: 17
label: "eyeLookUpLeft"
}
entries {
id: 18
label: "eyeLookUpRight"
}
entries {
id: 19
label: "eyeSquintLeft"
}
entries {
id: 20
label: "eyeSquintRight"
}
entries {
id: 21
label: "eyeWideLeft"
}
entries {
id: 22
label: "eyeWideRight"
}
entries {
id: 23
label: "jawForward"
}
entries {
id: 24
label: "jawLeft"
}
entries {
id: 25
label: "jawOpen"
}
entries {
id: 26
label: "jawRight"
}
entries {
id: 27
label: "mouthClose"
}
entries {
id: 28
label: "mouthDimpleLeft"
}
entries {
id: 29
label: "mouthDimpleRight"
}
entries {
id: 30
label: "mouthFrownLeft"
}
entries {
id: 31
label: "mouthFrownRight"
}
entries {
id: 32
label: "mouthFunnel"
}
entries {
id: 33
label: "mouthLeft"
}
entries {
id: 34
label: "mouthLowerDownLeft"
}
entries {
id: 35
label: "mouthLowerDownRight"
}
entries {
id: 36
label: "mouthPressLeft"
}
entries {
id: 37
label: "mouthPressRight"
}
entries {
id: 38
label: "mouthPucker"
}
entries {
id: 39
label: "mouthRight"
}
entries {
id: 40
label: "mouthRollLower"
}
entries {
id: 41
label: "mouthRollUpper"
}
entries {
id: 42
label: "mouthShrugLower"
}
entries {
id: 43
label: "mouthShrugUpper"
}
entries {
id: 44
label: "mouthSmileLeft"
}
entries {
id: 45
label: "mouthSmileRight"
}
entries {
id: 46
label: "mouthStretchLeft"
}
entries {
id: 47
label: "mouthStretchRight"
}
entries {
id: 48
label: "mouthUpperUpLeft"
}
entries {
id: 49
label: "mouthUpperUpRight"
}
entries {
id: 50
label: "noseSneerLeft"
}
entries {
id: 51
label: "noseSneerRight"
}
}
}
}
}
node {
calculator: "HolisticTrackingToRenderData"
input_stream: "FACE_LANDMARKS:face_landmarks"
input_stream: "IMAGE_SIZE:__stream_0"
input_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
input_stream: "POSE_LANDMARKS:pose_landmarks"
input_stream: "POSE_ROI:pose_roi"
input_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
output_stream: "RENDER_DATA_VECTOR:__stream_6"
}
node {
calculator: "AnnotationOverlayCalculator"
input_stream: "IMAGE:transformed_image"
input_stream: "VECTOR:__stream_6"
output_stream: "IMAGE:output_video"
}
It has to be said that most of the blendshapes work well; the problems mainly appear around the eyes. My program calibrates the blendshapes before running MediaPipe.
I don't want to compare MediaPipe to other solutions, but the same code works fine with NVIDIA AR.
@endink hi mate, I am writing an updated plugin for feeding iOS FaceMesh landmarks into Unity (an update of one I did a while ago), the update being the new data that can now come through, i.e. the transformation matrix and blendshapes. Is it worth taking the same approach as you have, manually pushing it into a legacy graph, or do you think it's better to wait on these new solutions?
@creativewax Hi, I think you'd better wait for Google. I have done a legacy solution (in Unreal Engine), but it's not good enough. I also did the new solution on Android and pushed it to an Unreal Engine app; it's a little bit of an improvement over the legacy solution, but it's still not good enough to drive a 3D avatar.
@endink thank you, I have explained this to my client, but they want to push ahead with the update anyway. I started putting together a new legacy graph last night, but I have not looked at the supporting bits around it. That is the main thing I was unsure about when looking at the graph you posted above: what is the corresponding code around it? Does the classification BS just get passed through?
@creativewax You can use my graph above, but you need to download the face_blendshapes.tflite file. The BS values are in a ClassificationList: the label is the blendshape name and the value is the blendshape weight (0-1). I saw Google released a new version of MediaPipe (0.9.3); I have not tried it yet, but a Google developer suggested I try it on another issue. I think you can try it too; using the C++ Task API is a better choice (the Task API did not work in 0.9.2).
BTW, I still think it is a non-production-ready solution, and its output is almost unusable because the eye region is not handled well. For a production application, you can consider the NVIDIA AR SDK, which gives results similar to Apple's ARKit.
@endink cool, I'll take a look. Basically all I want to do is use the same Face Mesh solution and pull the BS data into it, so at the moment I am unsure where to reference the face_blendshapes.tflite, but I saw that in the graph, so I'll dig a bit deeper. I am guessing the ClassificationList label is exposed to the legacy FaceMesh solution?
@endink yup, I told my client about it not being production ready and they were still really keen on getting it in, no idea why
Just use my graph, and observe/poll "face_blendshapes" stream. 😄
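For anyone wiring this up from C++, here is a minimal sketch of observing that stream with the legacy CalculatorGraph API. This is a generic example rather than endink's actual code; the stream name "face_blendshapes" and the side-packet names come from the graph above, while the function name and side-packet values are illustrative.

#include <iostream>
#include <string>

#include "mediapipe/framework/calculator_graph.h"
#include "mediapipe/framework/formats/classification.pb.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status_macros.h"

// Runs the graph posted above and prints every blendshape packet.
absl::Status RunBlendshapeGraph(const std::string& graph_pbtxt) {
  mediapipe::CalculatorGraphConfig config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(graph_pbtxt);

  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));

  // Each packet on "face_blendshapes" is a ClassificationList:
  // label = blendshape name, score = weight in [0, 1].
  MP_RETURN_IF_ERROR(graph.ObserveOutputStream(
      "face_blendshapes", [](const mediapipe::Packet& packet) -> absl::Status {
        const auto& list = packet.Get<mediapipe::ClassificationList>();
        for (const auto& c : list.classification()) {
          std::cout << c.label() << " = " << c.score() << "\n";
        }
        return absl::OkStatus();
      }));

  // The graph declares side packets for flips/rotation etc.; supply whatever
  // your app needs (the values here are placeholders).
  MP_RETURN_IF_ERROR(graph.StartRun({
      {"input_horizontally_flipped", mediapipe::MakePacket<bool>(false)},
      {"input_vertically_flipped", mediapipe::MakePacket<bool>(false)},
      {"input_rotation", mediapipe::MakePacket<int>(0)},
      {"refine_face_landmarks", mediapipe::MakePacket<bool>(true)},
  }));

  // ... feed ImageFrame packets into "input_video" for each camera frame ...

  MP_RETURN_IF_ERROR(graph.CloseInputStream("input_video"));
  return graph.WaitUntilDone();
}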
If you want to know how mediapipe bs actually works, I recorded a video on my youtube channel: https://youtube.com/playlist?list=PL1Bnbwb6xrM-WhM0XgK6mSMpkTL73kmln
At 33:20
@endink nice, I have had a look, but that's solely on the UE side of things; I need to understand how they stream into the iOS side of things. I'll take a proper look later, thanks dude
Hi, the updated face landmarker and face blendshapes are now ready and publicly released.
You can try the demo here: https://mediapipe-studio.webapps.google.com/demo/face_landmarker. You can set the demo's num faces to 1 to see a smoother result.
The guide is here: https://developers.google.com/mediapipe/solutions/vision/face_landmarker
Hope this helps!
@yichunk Why was C++ abandoned?
It is not abandoned. There is a C++ API for MediaPipe Tasks: https://github.com/google/mediapipe/blob/master/mediapipe/tasks/cc/vision/face_landmarker/face_landmarker.h You can still use a MediaPipe graph to build your own pipeline.
@yichunk
Thanks for the reply!
Won't there be C++ documentation and code examples in the future? Or can C++ users only learn the usage by reading the source code?
I need C++ documentation and code examples, too.
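In the meantime, here is a rough sketch of what using the C++ Task API from face_landmarker.h can look like. This is not official documentation; treat the exact option and result field names (base_options.model_asset_path, output_face_blendshapes, face_blendshapes, categories, category_name, score) as assumptions to verify against face_landmarker.h and face_landmarker_result.h.

#include <iostream>
#include <memory>
#include <utility>

#include "mediapipe/framework/formats/image.h"
#include "mediapipe/tasks/cc/vision/face_landmarker/face_landmarker.h"

namespace fl = ::mediapipe::tasks::vision::face_landmarker;

// Runs a single synchronous detection and prints the blendshape weights.
absl::Status DetectBlendshapes(const mediapipe::Image& image) {
  auto options = std::make_unique<fl::FaceLandmarkerOptions>();
  options->base_options.model_asset_path = "face_landmarker.task";
  options->num_faces = 1;
  options->output_face_blendshapes = true;  // see also output_facial_transformation_matrixes

  auto landmarker_or = fl::FaceLandmarker::Create(std::move(options));
  if (!landmarker_or.ok()) return landmarker_or.status();
  auto landmarker = std::move(*landmarker_or);

  auto result_or = landmarker->Detect(image);
  if (!result_or.ok()) return result_or.status();

  // face_blendshapes is optional and holds one Classifications entry per face;
  // each Category carries the blendshape name and a 0-1 score.
  if (result_or->face_blendshapes.has_value()) {
    for (const auto& category : result_or->face_blendshapes->front().categories) {
      std::cout << category.category_name.value_or("?") << " = "
                << category.score << "\n";
    }
  }
  return landmarker->Close();
}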
@yichunk I am facing the same performance issue even on high-end Android devices! Is there any workaround, or is it just a dead end?
Is anyone able to crack this one?
@emphaticaditya I have it working on iOS and Android now. I can post my graph and solution later when I am at my computer.
It seems that Google will no longer respond to this issue. The only thing we can do is mix the two: the old solution still does Holistic, and the new Task API specifically handles the face.
One image is sent to two pipes, and there is not much difference in performance. That's how I use it at the moment.
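Roughly, per frame, that split can look like the sketch below. This is illustrative code only (not endink's implementation), assuming a legacy holistic CalculatorGraph that reads an ImageFrame from "input_video" and a FaceLandmarker task created in VIDEO mode as in the earlier sketch.

#include "mediapipe/framework/calculator_graph.h"
#include "mediapipe/framework/formats/image.h"
#include "mediapipe/framework/packet.h"
#include "mediapipe/framework/port/status_macros.h"
#include "mediapipe/framework/timestamp.h"
#include "mediapipe/tasks/cc/vision/face_landmarker/face_landmarker.h"

namespace fl = ::mediapipe::tasks::vision::face_landmarker;

// One frame, two pipes: the legacy graph handles pose/hands (Holistic), the
// Task API handles the face landmarks and blendshapes.
absl::Status OnFrame(mediapipe::CalculatorGraph& holistic_graph,
                     fl::FaceLandmarker& face_task,
                     const mediapipe::Image& frame, int64_t timestamp_ms) {
  // Legacy pipe: reference the frame's ImageFrame without copying. The frame
  // must stay alive until the graph has processed this timestamp.
  MP_RETURN_IF_ERROR(holistic_graph.AddPacketToInputStream(
      "input_video",
      mediapipe::PointToForeign(frame.GetImageFrameSharedPtr().get())
          .At(mediapipe::Timestamp(timestamp_ms * 1000))));

  // Task pipe: face landmarks and blendshapes for the same frame.
  auto face_result = face_task.DetectForVideo(frame, timestamp_ms);
  if (!face_result.ok()) return face_result.status();

  // ... drive the avatar from face_result->face_blendshapes plus the
  // holistic graph's observed output streams ...
  return absl::OkStatus();
}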
@endink @creativewax thanks for responding!! I tried using blendshapes on Android and it is extremely jittery and laggy; it even performed worse than the web version running on an M1. I ran the same code on an iPhone and got a much better response. I suspect the only bottleneck is hardware?
I use Android too, and I get 30 FPS (Java SDK).
@endink is it possible for you to share a screen recording by any chance?
@emphaticaditya
It can be downloaded here:
https://github.com/endink/Mediapipe4u-plugin/releases/tag/M4U_Remoting_App
@creativewax Can you share your performance benchmarks? I am getting a fixed 30 fps with 33.8 ms latency on an Android S10.
Can anyone else comment on the performance they are seeing?
It would really be helpful! Thanks
@emphaticaditya sorry for the delay; those benchmarks seem in line with what I got, and mine was piping into Unity as well. It ran pretty well in the end after code optimisations etc.
There seems to be a new Mediapipe blendshape prediction model on the horizon that looks very promising. A Google Research Paper describing the new model was released in September 2023 and I have written an issue about it, linking the research paper and asking for a timeline of the new model's implementation.
If this is still relevant to you, please upvote the issue so that the blendshape topic gains more exposure: https://github.com/google/mediapipe/issues/5329
Thanks and greetings, Ferdinand
I also get an error when trying to get the blendshape calculator working in Unreal.
Currently, the following error occurs in the Unreal Output Log for the LandmarksToTensorCalculator part of the pbtxt (endink's holistic pbtxt): [LogMediaPipe:Error:; mediapipe::ParseTextProto
I referenced the source code of FaceBlendShapesGraph and put it into a "legacy" solution in the same way (because TaskRunner didn't work): I used the TensorsToClassificationCalculator and the face_blendshapes.tflite file to calculate the ClassificationList, and it succeeded. After that, I brought it into Unreal Engine.
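As an aside on the ParseTextProto error above: parsing the pbtxt non-fatally makes the parser's complaint visible, which helps distinguish a real syntax problem from, for example, a calculator options extension that is unknown because the calculator was not linked into the binary. A minimal sketch (generic protobuf code, not the Unreal plugin's actual implementation):

#include <iostream>
#include <string>

#include "google/protobuf/text_format.h"
#include "mediapipe/framework/calculator.pb.h"

// Parses the graph pbtxt into a CalculatorGraphConfig without aborting, so a
// failure can be logged (and the offending field or extension identified)
// before handing the config to CalculatorGraph::Initialize().
bool LoadGraphConfig(const std::string& pbtxt,
                     mediapipe::CalculatorGraphConfig* config) {
  if (!google::protobuf::TextFormat::ParseFromString(pbtxt, config)) {
    std::cerr << "ParseTextProto failed for the graph config; check that every "
                 "calculator named in the pbtxt (and its options proto, e.g. "
                 "LandmarksToTensorCalculatorOptions) is linked into this "
                 "binary.\n";
    return false;
  }
  return true;
}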
There are some things that I don't quite understand, and I hope someone can give guidance:
In fact, the left and right eyes of the characters in these pictures are almost the same size.
Here is a part of the graph code I used to render annotation overlay: