MikeOfZen / Yet-Another-Openpose-Implementation

This project reimplements the OpenPose paper (Cao et al., 2018) from scratch, using TensorFlow 2.1 with optional TPU-powered training.
Mozilla Public License 2.0

Evaluation on COCO Dataset #5

Closed. carlosh93 closed this issue 3 years ago

carlosh93 commented 4 years ago

Hello! Thanks again for your work on this project. I have used your code in my project and am getting good visual results, but now I need to run a quantitative evaluation of your implementation on the COCO dataset. I noticed that you did not provide an evaluation script, so I tried to use the evaluation script from the tf-pose-estimation project. However, I ran into problems because the output of your OpenPose implementation differs from the output of the network provided in tf-pose-estimation. Specifically, the keypoints and PAFs of your network have shapes (46, 46, 18) and (46, 46, 34), respectively, while those of the other project have shapes (46, 46, 19) and (46, 46, 38). Why did you configure your network to output these sizes? Most OpenPose implementations output the same sizes as tf-pose-estimation.

In summary, I would like to know whether it is possible to feed the kpts and pafs produced by your implementation into the tf-pose-estimation evaluation script in order to evaluate performance on the COCO dataset. I have also tried using the scripts in the post_processing module to obtain the skeletons and run the evaluation, but I noticed that they sometimes fail to produce a single skeleton per person: when there is only one person in an image, the scripts sometimes return two skeletons, one with the body parts and the other with the face parts. Thank you in advance!

MikeOfZen commented 4 years ago

Hi Carlos, glad to hear you found the project useful. I indeed didn't make a separate eval script and instead relied on evaluation during the training stage. The kpts and pafs do differ from tf-openpose: they used the kpts directly from the COCO dataset without alteration, so the resulting skeleton connects the ears directly to the shoulders without using the neck/torso. My implementation creates a new neck point (half-way between the shoulders), to which the nose and shoulders connect. Besides being more representative anatomically, it seems to work better for recognition. The number of pafs stems from the number of connections between the kpts (two channels per connection, one for the x direction and one for the y direction). Note that the second (later) original OpenPose implementation uses the same skeleton structure.
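To make the two points above concrete, here is a minimal sketch of deriving a neck point as the midpoint of the two shoulders, and of how the PAF channel count follows from the number of skeleton connections. The function names and the handling of missing shoulders are assumptions for illustration, not the project's actual code; the real skeleton is defined in 'configs/keypoints_config.py'.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def neck_from_shoulders(left: Optional[Point], right: Optional[Point]) -> Optional[Point]:
    """Return the point half-way between the shoulders.

    Assumption: if either shoulder is not annotated (None), no neck
    point is produced. The project may handle this case differently.
    """
    if left is None or right is None:
        return None
    return ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)


def paf_channels(num_connections: int) -> int:
    """Each connection contributes one x channel and one y channel."""
    return 2 * num_connections


neck = neck_from_shoulders((100.0, 200.0), (140.0, 210.0))
print(neck)              # (120.0, 205.0)
print(paf_channels(17))  # 34, matching the (46, 46, 34) PAF output
print(paf_channels(19))  # 38, matching the (46, 46, 38) PAF output
```

Under this reading, the 34 PAF channels reported in the question correspond to 17 connections, and the 38 channels of the other project to 19.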

You could probably do a direct comparison with tf-pose-estimation if you trim the 19 dimension to 18, and likewise for the pafs, and just ignore the dropped points, since the main difference is how the neck connects to the shoulders. This would take some fiddling with the tensors to make sure you cut out the right part, though. You can check 'configs/keypoints_config.py' for the skeleton definition.
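A rough sketch of that trimming with NumPy follows. The index lists are placeholders only: which channels to keep, and in what order, has to be worked out by comparing the two skeleton definitions, and `select_channels` is a hypothetical helper rather than part of either project.

```python
import numpy as np


def select_channels(maps: np.ndarray, keep: list) -> np.ndarray:
    """Keep (and optionally reorder) channels along the last axis."""
    return np.take(maps, keep, axis=-1)


# Dummy arrays with the channel counts mentioned in this thread.
kpts_19 = np.arange(46 * 46 * 19, dtype=np.float32).reshape(46, 46, 19)
pafs_38 = np.arange(46 * 46 * 38, dtype=np.float32).reshape(46, 46, 38)

# PLACEHOLDER index lists: here we simply drop the trailing channels.
# The correct indices must come from matching the two skeleton definitions.
keep_kpts = list(range(18))
keep_pafs = list(range(34))

trimmed_kpts = select_channels(kpts_19, keep_kpts)
trimmed_pafs = select_channels(pafs_38, keep_pafs)

print(trimmed_kpts.shape)  # (46, 46, 18)
print(trimmed_pafs.shape)  # (46, 46, 34)
```

Because `np.take` accepts an arbitrary index list, the same helper can also reorder channels if the two projects list their keypoints in a different order.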

Regarding post-processing: yeah, stitching the skeletons back together correctly is one of the main challenges of this domain. In your example, for instance, you can't be sure whether the skeleton parts belong to one person who is partly obscured or to two people in the same photo. The post-processing can definitely be improved further; my code is just an initial viable implementation that works under ideal conditions. For more accurate results, there are a few thresholds you can play around with (in post_config.py), and if you have the time you can update the post-processor algorithm.

One major change I would make if I had the time is to incorporate redundant pafs, which should make the post-processor much more stable. However, I'm not actively working on this project.