Hi, (i) Has FLEX been tested on real multi-person 3D human pose datasets? The paper only discusses testing on synthetic data for the multi-person setting.
(ii) Also, can the authors comment on the inference speed of the model when the 2D keypoints for each view are provided by a pretrained backbone such as YOLOv7? Which part of the model makes inference non-real-time?