perfanalytics / pose2sim

Markerless kinematics with any cameras — From 2D Pose estimation to 3D OpenSim motion
https://perfanalytics.github.io/pose2sim/
BSD 3-Clause "New" or "Revised" License

Synchronized comparison with Vicon, and a Vicon full-body marker model for the comparison #37

Closed lishuaifu closed 11 months ago

lishuaifu commented 11 months ago

Hello, thank you very much for developing this project! I have a problem. My cameras record at 30 Hz, but the Vicon system I am comparing against records at 100 Hz. I want to compare the two on the same time base, but I have not figured out how to synchronize the cameras with Vicon, or how to handle the frame rate gap. Also, when I use the Vicon full-body marker set, is there a marker model that corresponds to a particular .osim model? Looking forward to your reply!

davidpagnon commented 11 months ago

Hello, This is not a question exactly related to Pose2Sim but I'll try to answer.

lishuaifu commented 11 months ago

Thanks for your answer. I may not have expressed it clearly. What I want to ask is how to align the first frame of the markerless motion capture recording with the first frame of the Vicon recording, so that I can compare the results of the two systems.

davidpagnon commented 11 months ago

Okay! Did I answer your question then? You need one of the systems to have a framerate that is a multiple of the other, and the easiest way to synchronize the start of the capture is to use a flashlight.
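One way to apply the flashlight idea is to look for the first sudden jump in mean image brightness in a recording and treat that frame as the reference instant. The sketch below assumes a mean brightness value per frame has already been computed (for example with OpenCV). The function names, the `jump` threshold, and all numbers are hypothetical and are not part of Pose2Sim.

```python
def find_flash_frame(brightness, jump=50.0):
    """Return the index of the first frame whose mean brightness rises
    by more than `jump` relative to the previous frame, or None."""
    for i in range(1, len(brightness)):
        if brightness[i] - brightness[i - 1] > jump:
            return i
    return None


def flash_time(brightness, fps, jump=50.0):
    """Time in seconds of the flash within one recording, or None."""
    frame = find_flash_frame(brightness, jump)
    return None if frame is None else frame / fps


# Hypothetical per-frame mean brightness values (0-255), for illustration only.
video_brightness = [40, 41, 40, 42, 180, 175, 60, 41]   # 30 Hz camera
t_flash_video = flash_time(video_brightness, fps=30)     # flash at frame 4

# If the same event is located at a known time in the other system,
# the difference between the two times is the offset between the clocks.
t_flash_other = 0.50   # hypothetical value, in seconds
offset = t_flash_other - t_flash_video
```

The result is only accurate to one frame period of the slower system, which is about 33 ms at 30 Hz.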

davidpagnon commented 11 months ago

Hi, sorry, I realize my answer was not exhaustive: if you want to resample your data, the OpenSim API provides a way to do it with resampleWithFrequency. If you have OpenSim results you can use it as is, and if not there are still 2 ways:

Doc here Code here
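Independently of OpenSim, the frame rate gap can also be handled with plain interpolation. Samples at 30 Hz and 100 Hz only coincide every 0.1 s, so one series has to be interpolated onto the other's time base. The following is a minimal sketch using linear interpolation in pure Python. It is not the OpenSim resampleWithFrequency implementation, and `resample_linear` is a hypothetical helper name.

```python
def resample_linear(times, values, target_hz):
    """Linearly interpolate (times, values) onto a uniform grid at target_hz.
    `times` must be strictly increasing and contain at least two samples."""
    t0, t1 = times[0], times[-1]
    # Small epsilon guards against floating-point round-off in the count.
    n = int((t1 - t0) * target_hz + 1e-9) + 1
    new_times = [t0 + k / target_hz for k in range(n)]
    new_values = []
    j = 0
    for t in new_times:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        ta, tb = times[j], times[j + 1]
        w = (t - ta) / (tb - ta)
        new_values.append(values[j] + w * (values[j + 1] - values[j]))
    return new_times, new_values


# Illustrative data: four 30 Hz samples of a coordinate that grows linearly.
times = [k / 30 for k in range(4)]
values = [0.0, 1.0, 2.0, 3.0]
new_times, new_values = resample_linear(times, values, 100)
```

Linear interpolation is the simplest choice. For fast movements a spline may follow the signal better, at the cost of possible overshoot around gaps.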

lishuaifu commented 11 months ago

Thanks for your answer, but I'm still not familiar with Moco. Recently, I downloaded the OpenCap videos and used Pose2Sim to convert their camera parameters, but triangulation could not be performed. (image)

davidpagnon commented 11 months ago

I haven't tried the OpenCap videos yet. So, just to narrow it down:

lishuaifu commented 11 months ago
  1. This is the pickle file for one of the camera views I downloaded from OpenCap: (image)
  2. I converted it to .toml format with the conversion you provided: (image)
  3. This is the result I got for Cam1 after running MediaPipe: (image)
  4. This is from the Cam1 perspective{"version": 1.3, "people": [{"person_id": [-1], "pose_keypoints_2d": [86.36988759040833, 276.8182945251465, 0.9999110698699951, 86.83027803897858, 274.3397903442383, 0.9997742772102356, 87.39575743675232, 274.13822174072266, 0.9997487664222717, 87.96467542648315, 273.98975372314453, 0.9997712969779968, 84.92967009544373, 274.74292755126953, 0.9998959302902222, 84.07066583633423, 274.79089736938477, 0.999934196472168, 83.22095274925232, 274.8215103149414, 0.9999582767486572, 88.42557549476624, 274.553165435791, 0.9997041821479797, 82.07850873470306, 275.4034423828125, 0.9998224377632141, 87.53754436969757, 278.82022857666016, 0.999832272529602, 85.50123810768127, 278.9975357055664, 0.9999018907546997, 96.10410690307617, 285.5092239379883, 0.9999629259109497, 74.38771963119507, 287.6607322692871, 0.9999887943267822, 96.98086738586426, 297.832088470459, 0.4311998784542084, 67.27884113788605, 301.8223571777344, 0.8758309483528137, 94.69274997711182, 307.32234954833984, 0.33454710245132446, 71.04289770126343, 305.68288803100586, 0.9474323987960815, 94.18608069419861, 310.08153915405273, 0.32757994532585144, 72.90933430194855, 307.5128173828125, 0.9270080327987671, 94.02599573135376, 309.33008193969727, 0.352166086435318, 73.5474157333374, 305.6673812866211, 0.932270348072052, 93.96407961845398, 308.64917755126953, 0.30547624826431274, 73.10228168964386, 304.80945587158203, 0.8792705535888672, 89.08805966377258, 321.56543731689453, 0.9999736547470093, 76.86519026756287, 322.0858383178711, 0.9999903440475464, 90.17389297485352, 343.1319046020508, 0.9188488125801086, 78.89803111553192, 343.9849090576172, 0.9591719508171082, 83.98832738399506, 345.7379913330078, 0.3281237483024597, 76.9914311170578, 362.576904296875, 0.8790631890296936, 82.42270588874817, 345.6193161010742, 0.21257536113262177, 76.10240757465363, 364.8035430908203, 0.4365009069442749, 82.10011124610901, 355.79593658447266, 0.3827136158943176, 78.55961680412292, 
370.17311096191406, 0.8092045783996582], "face_keypoints_2d": [], "hand_left_keypoints_2d": [], "hand_right_keypoints_2d": [], "pose_keypoints_3d": [], "face_keypoints_3d": [], "hand_left_keypoints_3d": [], "hand_right_keypoints_3d": []}]}
  5. This is from the Cam2 perspective {"version": 1.3, "people": [{"person_id": [-1], "pose_keypoints_2d": [334.2913269996643, 292.2208023071289, 0.9896133542060852, 335.00728368759155, 289.6761131286621, 0.9827073216438293, 336.01611614227295, 289.27318572998047, 0.9834198951721191, 336.82562828063965, 289.0037536621094, 0.9829366207122803, 333.3254098892212, 290.14575958251953, 0.987060546875, 332.93503046035767, 290.1839065551758, 0.9909356236457825, 332.4513530731201, 290.2560043334961, 0.9915648698806763, 339.1159129142761, 288.7827491760254, 0.990228533744812, 333.6403441429138, 290.3847122192383, 0.9805418252944946, 336.8967604637146, 293.61583709716797, 0.9945313930511475, 334.52703952789307, 294.11632537841797, 0.9932716488838196, 352.66445875167847, 298.98956298828125, 0.9997182488441467, 329.7544455528259, 301.1568832397461, 0.9993909597396851, 359.918053150177, 309.32424545288086, 0.8497646450996399, 326.2007546424866, 312.7231216430664, 0.4911540150642395, 356.31078243255615, 315.00293731689453, 0.7551651000976562, 325.0827884674072, 318.2554244995117, 0.9167132377624512, 355.55105209350586, 317.78234481811523, 0.7821435928344727, 324.60434675216675, 320.2236557006836, 0.9328354001045227, 354.43212032318115, 316.84642791748047, 0.8032744526863098, 324.7783899307251, 318.48947525024414, 0.9406012892723083, 354.005606174469, 315.9907531738281, 0.6747942566871643, 325.6250023841858, 317.775821685791, 0.8890450596809387, 348.1209468841553, 334.1305923461914, 0.9986364245414734, 335.98388671875, 335.79620361328125, 0.9987885355949402, 346.6110134124756, 357.8704071044922, 0.7729591727256775, 343.7102150917053, 355.9614562988281, 0.4121052026748657, 352.4228239059448, 379.4289779663086, 0.829698383808136, 350.63040018081665, 374.89830017089844, 0.4074961543083191, 353.99446964263916, 382.34039306640625, 0.5238143801689148, 352.4765968322754, 376.91001892089844, 0.30202439427375793, 351.7030692100525, 390.12805938720703, 0.8223171830177307, 
351.3906455039978, 385.3087615966797, 0.4258943796157837], "face_keypoints_2d": [], "hand_left_keypoints_2d": [], "hand_right_keypoints_2d": [], "pose_keypoints_3d": [], "face_keypoints_3d": [], "hand_left_keypoints_3d": [], "hand_right_keypoints_3d": []}]}
lishuaifu commented 11 months ago
  1. This is the output image after using BODY_25: (image)
  2. This is from the Cam1 perspective: {"version":1.3,"people":[{"person_id":[-1],"pose_keypoints_2d":[86.5385,276.059,0.899345,85.6947,287.349,0.898471,74.4123,288.179,0.869376,66.5801,300.396,0.887186,72.678,305.619,0.862153,94.363,285.624,0.889447,94.395,296.09,0.756386,94.3679,305.614,0.693331,83.9581,316.913,0.841355,78.7279,316.913,0.861515,79.5661,342.103,0.852413,76.1274,362.961,0.791949,89.1568,316.912,0.851444,89.1609,339.523,0.825081,83.0748,351.7,0.821959,83.9589,275.149,0.888931,87.4087,274.343,0.940536,80.4426,275.194,0.924241,90.0286,274.35,0.654025,86.531,358.661,0.714324,87.3877,357.756,0.780514,80.4963,352.551,0.61264,80.4322,367.315,0.823812,75.2782,366.478,0.753778,76.1093,363.841,0.746649],"face_keypoints_2d":[],"hand_left_keypoints_2d":[],"hand_right_keypoints_2d":[],"pose_keypoints_3d":[],"face_keypoints_3d":[],"hand_left_keypoints_3d":[],"hand_right_keypoints_3d":[]}]}
  3. This is from the Cam2 perspective: {"version":1.3,"people":[{"person_id":[-1],"pose_keypoints_2d":[333.435,289.136,0.867444,338.631,300.406,0.860339,329.091,302.12,0.925741,323.895,316.028,0.823386,324.73,316.915,0.676359,350.772,296.951,0.879012,359.522,305.607,0.725393,359.512,314.305,0.79303,343.851,331.684,0.879776,336.93,332.565,0.838547,346.457,356.865,0.717124,354.305,374.276,0.446781,350.758,330.835,0.873439,346.469,356.017,0.774897,351.64,378.614,0.799196,331.667,288.26,0.91,336.027,288.229,0.927251,330.822,289.959,0.0895358,339.525,288.23,0.862552,349.93,386.453,0.761905,351.717,386.428,0.74174,351.722,379.518,0.631915,352.564,384.675,0.150608,0,0,0,359.514,375.161,0.308],"face_keypoints_2d":[],"hand_left_keypoints_2d":[],"hand_right_keypoints_2d":[],"pose_keypoints_3d":[],"face_keypoints_3d":[],"hand_left_keypoints_3d":[],"hand_right_keypoints_3d":[]}]}
davidpagnon commented 11 months ago

Just to make sure this is not where the issue comes from, are you quite far from the camera in the first frame? You're about 10 px wide and 80 px tall.
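A quick way to check how large the subject appears in a frame is to compute the bounding box of the detected keypoints from the OpenPose-style JSON. This is a minimal sketch: `keypoint_extent` and the `min_conf` cutoff are hypothetical, and the sample string holds only the first four BODY_25 keypoints of the Cam1 frame posted above, so its extent is smaller than the full body's.

```python
import json


def keypoint_extent(openpose_json, min_conf=0.3):
    """Return (width, height) in pixels of the first person's bounding box,
    using only keypoints with confidence >= min_conf, or None if none pass."""
    data = json.loads(openpose_json)
    flat = data["people"][0]["pose_keypoints_2d"]
    # Keypoints are stored as a flat list: x0, y0, c0, x1, y1, c1, ...
    triplets = [flat[i:i + 3] for i in range(0, len(flat), 3)]
    kept = [(x, y) for x, y, c in triplets if c >= min_conf]
    if not kept:
        return None
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    return max(xs) - min(xs), max(ys) - min(ys)


# Truncated sample: first four keypoints of the Cam1 BODY_25 frame above.
sample = (
    '{"people": [{"pose_keypoints_2d": ['
    '86.5385,276.059,0.899345, 85.6947,287.349,0.898471, '
    '74.4123,288.179,0.869376, 66.5801,300.396,0.887186]}]}'
)
width, height = keypoint_extent(sample)
```

Filtering on confidence matters here because undetected keypoints are written as 0,0,0 and would otherwise stretch the box to the image origin.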

Try to set interpolation to 'none', and to start your frame_range a little later. If it still does not work, increase your reproj_error_threshold as much as you need (like 100 px). However, with only 2 cameras and a subject that far away, you may not get very nice results.
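For context, a reprojection error is the pixel distance between a detected 2D keypoint and the triangulated 3D point projected back into the same camera. The sketch below illustrates that general concept with a hypothetical pinhole camera and hypothetical points. It does not show how Pose2Sim applies reproj_error_threshold internally.

```python
import math


def project(P, X):
    """Project the 3D point X = (x, y, z) with a 3x4 projection matrix P
    (a list of three rows) and return pixel coordinates (u, v)."""
    Xh = (X[0], X[1], X[2], 1.0)
    u, v, w = (sum(p * x for p, x in zip(row, Xh)) for row in P)
    return u / w, v / w


def reprojection_error(P, X, detected):
    """Euclidean distance in pixels between the reprojected 3D point
    and the detected 2D keypoint."""
    u, v = project(P, X)
    return math.hypot(u - detected[0], v - detected[1])


# Hypothetical camera at the origin looking down +Z:
# focal length 1000 px, principal point (320, 240).
P = [[1000.0, 0.0, 320.0, 0.0],
     [0.0, 1000.0, 240.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
X = (0.1, -0.05, 5.0)        # hypothetical triangulated point, in metres
detected = (343.0, 226.0)    # hypothetical 2D detection, in pixels
err = reprojection_error(P, X, detected)
```

A small, distant subject makes this error harder to interpret, since a few pixels of detection noise correspond to a large displacement in 3D.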

Pose2Sim is a flexible and powerful tool for obtaining 3D kinematics, but it relies on mathematics rather than statistics. To be clearer: it optimizes triangulation using a small set of carefully crafted rules, whereas OpenCap uses a simple calibration procedure but infers missing and additional markers with machine learning. This makes OpenCap pretty impressive in a lot of cases, although it may not work as well on movements it has not been trained on, and it is not easily adaptable to other models (such as hand or animal models).

davidpagnon commented 11 months ago

So did you manage to solve your issue?