tucan9389 opened this issue 4 years ago
Hi @tucan9389, how about using ARKit's 3D motion tracking (https://developer.apple.com/documentation/arkit/capturing_body_motion_in_3d)? Or do you think your proposed 3D pose estimation is more accurate?
@sebo361 Thanks for your suggestion. I agree that ARKit can be the better fit in some cases, but I think there are other benefits to using Core ML rather than ARKit.
@tucan9389 True, these are interesting benefits to find out! However, I am more interested in comparing both approaches regarding inference speed and the accuracy of the 3D human pose keypoints. Now that the latest iPad Pro (and iPhone 12) have a LiDAR sensor, I expect more precision in 3D human motion tracking, but I am not sure whether ARKit's 3D motion tracking uses the LiDAR Depth API by default. Do you have any insights on that?
Maybe I should set up a project to compare ARKit's 3D motion tracking with your proposed Core ML 3D pose estimation.
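For what it's worth, scene depth is an opt-in frame semantic in ARKit rather than something enabled by default, so you can check at runtime whether the LiDAR Depth API can be combined with body tracking on a given device. A minimal sketch (the `session` is assumed to exist elsewhere):

```swift
import ARKit

// Scene depth is opt-in; ask whether it can be combined with body
// tracking on this device instead of guessing.
if ARBodyTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    let config = ARBodyTrackingConfiguration()
    config.frameSemantics.insert(.sceneDepth)  // request LiDAR depth explicitly
    // session.run(config)  // depth then arrives per frame in ARFrame.sceneDepth
} else {
    print("sceneDepth is not available with body tracking on this device")
}
```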
@sebo361 Also, Apple's 3D body tracking (`ARBodyTrackingConfiguration`) can't run simultaneously with world tracking (`ARWorldTrackingConfiguration`). So, unfortunately, you can't use Apple's 3D body tracking if you also need world tracking.
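To make the limitation concrete: an `ARSession` runs exactly one configuration at a time, so starting body tracking replaces any world-tracking configuration that was active on the same session. A minimal sketch with illustrative names (not from the repo):

```swift
import ARKit

final class BodyTrackingController: NSObject, ARSessionDelegate {
    let session = ARSession()

    func startBodyTracking() {
        // Body tracking needs an A12 (or newer) device; check before running.
        guard ARBodyTrackingConfiguration.isSupported else { return }
        session.delegate = self
        // Running a new configuration replaces the previous one, so any
        // ARWorldTrackingConfiguration on this session stops here.
        session.run(ARBodyTrackingConfiguration())
    }

    // ARKit delivers the tracked 3D skeleton through ARBodyAnchor updates.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let bodyAnchor as ARBodyAnchor in anchors {
            // Model-space transform of one joint, e.g. the head.
            if let head = bodyAnchor.skeleton.modelTransform(for: .head) {
                print("head position:", head.columns.3)
            }
        }
    }
}
```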
https://github.com/tucan9389/PoseEstimation-TFLiteSwift
Here is a 3D pose estimation demo using TFLiteSwift. I implemented soft-argmax in pure Swift with the Accelerate framework. NumPy and PyTorch (or TF) support summation over a chosen dimension out of the box, but Swift has no built-in soft-argmax for a multi-dimensional matrix (tensor), so the implementation was a bit tricky.
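As a rough illustration of the idea (a hypothetical sketch, not the repo's actual code), soft-argmax along one axis is the softmax-weighted expected index, and the reductions map onto vDSP/vForce calls:

```swift
import Accelerate

// Soft-argmax over a 1D score vector: softmax(scores), then Σ_i i * p_i.
func softArgmax1D(_ scores: [Float]) -> Float {
    guard !scores.isEmpty else { return 0 }
    let n = vDSP_Length(scores.count)
    // Subtract the max before exponentiating, for numerical stability.
    var negMax = -(scores.max() ?? 0)
    var shifted = [Float](repeating: 0, count: scores.count)
    vDSP_vsadd(scores, 1, &negMax, &shifted, 1, n)
    // Elementwise exp via vForce.
    var exps = [Float](repeating: 0, count: scores.count)
    var count = Int32(scores.count)
    vvexpf(&exps, shifted, &count)
    // Normalizer: sum of exponentials.
    var sum: Float = 0
    vDSP_sve(exps, 1, &sum, n)
    // Expected index: dot product of indices with unnormalized weights.
    let indices = (0..<scores.count).map(Float.init)
    var dot: Float = 0
    vDSP_dotpr(exps, 1, indices, 1, &dot, n)
    return dot / sum
}

// e.g. a sharp peak near index 2:
// softArgmax1D([0.1, 1.0, 8.0, 1.0, 0.1]) ≈ 2.0
```

For a 3D heatmap, the same softmax is taken over the whole volume and the expected coordinate is computed along each of the three axes, which is exactly where NumPy/PyTorch-style per-dimension summation would normally make this a one-liner.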
lightweight-human-pose-estimation-3d-demo.pytorch repo: https://github.com/Daniil-Osokin/lightweight-human-pose-estimation-3d-demo.pytorch