Daniil-Osokin / lightweight-human-pose-estimation-3d-demo.pytorch

Real-time 3D multi-person pose estimation demo in PyTorch. OpenVINO backend can be used for fast inference on CPU.
Apache License 2.0
667 stars 139 forks

Issue that does not smoothly show keypoints #72

Closed connernam closed 3 years ago

connernam commented 3 years ago

Hi, @Daniil-Osokin

I have a few questions.

In the other repos there is an argument called smooth, and I've seen that it reduces keypoint jitter a lot. Why is there no smooth option in the 3D repo? Did you remove the existing smoothing because it caused problems in 3D?

I ported the smooth method into the 3D repo, but it did not reduce jitter the way it does in the 2D repo. Comparing the values before and after the one-euro filter, they do differ, so the filter itself seems to be working, but I don't know why the result is not smoother. It's hard to read, but I've left part of the 'smooth' code I wrote at the bottom.

Also, I found cases where objects such as chairs are recognized as people and keypoints are extracted for them. In this case, do I have to retrain the model, or should I use a more powerful backbone than MobileNet?

Thanks.


    class Pose:
        def __init__(self, keypoints, confidence):
            self.filters = [[OneEuroFilter(), OneEuroFilter(), OneEuroFilter()] for _ in range(Pose.num_kpts)]

    if smooth:
        for kpt_id in range(Pose.num_kpts):
            if current_poses[current_pose_id].keypoints[kpt_id, 0] != -1:
                print('before ', kpt_id, ': ', current_poses[current_pose_id].keypoints[kpt_id, 0], ', ',
                      current_poses[current_pose_id].keypoints[kpt_id, 1])
            if current_poses[current_pose_id].keypoints[kpt_id, 0] == -1:
                continue
            # reuse filter if previous pose has valid filter
            if (best_matched_pose_id is not None
                    and previous_poses[best_matched_id].keypoints[kpt_id, 0] != -1):
                current_poses[current_pose_id].filters[kpt_id] = previous_poses[best_matched_id].filters[kpt_id]
            current_poses[current_pose_id].keypoints[kpt_id, 0] = current_poses[current_pose_id].filters[kpt_id][0]\
                (current_poses[current_pose_id].keypoints[kpt_id, 0])
            current_poses[current_pose_id].keypoints[kpt_id, 1] = current_poses[current_pose_id].filters[kpt_id][1]\
                (current_poses[current_pose_id].keypoints[kpt_id, 1])
            if current_poses[current_pose_id].keypoints[kpt_id, 0] != -1:
                print('after ', kpt_id, ': ', current_poses[current_pose_id].keypoints[kpt_id, 0], ', ',
                      current_poses[current_pose_id].keypoints[kpt_id, 1])

Daniil-Osokin commented 3 years ago

Hi, the purpose of this repository is to show how monocular 3D pose estimation works. Extra smoothing was left out so as not to clutter the code. The snippet looks OK; try passing the same filter parameters to the constructor that are used to filter the translation vector. To detect keypoints of other object classes (not person), you need to train a model for that particular class.
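As a rough illustration of "passing filter parameters to the constructor", here is a minimal, self-contained one-euro filter sketch. The class name OneEuroSketch, its argument names, and the parameter values are assumptions made for this example; they are not taken from the repository, whose own OneEuroFilter and translation-filter settings should be checked directly.

```python
import math


class LowPass:
    """Exponential smoothing that remembers the previous filtered value."""

    def __init__(self):
        self.prev = None

    def __call__(self, x, alpha):
        # The first sample has nothing to blend with, so it passes through unchanged.
        self.prev = x if self.prev is None else alpha * x + (1.0 - alpha) * self.prev
        return self.prev


class OneEuroSketch:
    """Hypothetical stand-in for a one-euro filter (names and defaults are assumptions)."""

    def __init__(self, freq=15.0, mincutoff=1.0, beta=0.05, dcutoff=1.0):
        self.freq, self.mincutoff, self.beta, self.dcutoff = freq, mincutoff, beta, dcutoff
        self.x_filter, self.dx_filter = LowPass(), LowPass()
        self.x_prev = None

    def _alpha(self, cutoff):
        # Smoothing factor for a given cutoff frequency at the assumed sampling rate.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau * self.freq)

    def __call__(self, x):
        # Estimate speed, smooth it, then raise the cutoff when the signal moves fast.
        dx = 0.0 if self.x_prev is None else (x - self.x_prev) * self.freq
        self.x_prev = x
        edx = self.dx_filter(dx, self._alpha(self.dcutoff))
        cutoff = self.mincutoff + self.beta * abs(edx)
        return self.x_filter(x, self._alpha(cutoff))


# Illustrative values only; reuse whatever the translation filter is created with.
PARAMS = dict(freq=30.0, mincutoff=0.5, beta=0.01)
NUM_KPTS = 3  # small number for the demo, not the model's keypoint count
filters = [[OneEuroSketch(**PARAMS) for _ in range(2)] for _ in range(NUM_KPTS)]
```

Each filter keeps state between calls, so the same filter object has to be reused for the same keypoint from frame to frame. A filter that is recreated every frame returns its input unchanged.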

connernam commented 3 years ago

Thank you for leaving an answer.

Do you mean that the OneEuroFilter parameters should be added or changed for self.filters in the Pose class? In the neighboring repo, self.filters holds only two OneEuroFilters per keypoint, created without parameters. I'd appreciate it if you could explain in more detail.

And I want to detect only human keypoints. However, as shown in the figure below, the existing pre-trained model seems to misidentify objects such as chairs as people.

[Image: screen capture 2021-05-26 142720]

Thanks.

Daniil-Osokin commented 3 years ago

I mean, try passing these parameters when creating the filters for keypoints. I can't see much in the image provided, but false positives are possible. You can run an object detection network that detects chairs and filter out predictions that overlap with the detected chairs. You can also keep only skeletons with a confidence value above some threshold, to suppress low-confidence detections on chairs.
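The two suppression ideas above could be sketched roughly as follows. The pose representation (a dict with "bbox" and "confidence"), the function names, and both thresholds are hypothetical and chosen only for illustration. The chair boxes are assumed to come from some separate object detector that is not shown here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def keep_likely_people(poses, chair_boxes, min_confidence=0.5, max_chair_iou=0.5):
    """Drop low-confidence skeletons and skeletons that mostly coincide with a chair box."""
    kept = []
    for pose in poses:
        if pose["confidence"] < min_confidence:
            continue  # weak detection, likely a false positive
        if any(iou(pose["bbox"], box) > max_chair_iou for box in chair_boxes):
            continue  # skeleton sits almost exactly on a detected chair
        kept.append(pose)
    return kept


poses = [
    {"bbox": (0, 0, 10, 10), "confidence": 0.9},    # kept
    {"bbox": (20, 20, 30, 30), "confidence": 0.2},  # dropped: low confidence
    {"bbox": (50, 50, 60, 60), "confidence": 0.9},  # dropped: overlaps a chair
]
chairs = [(50, 50, 60, 61)]
kept = keep_likely_people(poses, chairs)
```

The overlap rule can also remove a real person sitting on a chair, so the IoU threshold would need tuning on actual footage.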

connernam commented 3 years ago

Thank you. I'll try as you tell me.

connernam commented 3 years ago

Hi, @Daniil-Osokin

For 2D poses, I reduced jitter by using 'current_poses_2d' instead of 'poses_2d_scaled'. Now I want to reduce jitter in the 3D pose, but it doesn't work as well as I expected. I applied filters to the translation (adjusting beta and freq) in parse_poses.py, but could not get a noticeable improvement.

These are the translation values before and after the filter is applied:

    before: [ 2.46822999 16.20406334 71.93047418]
    after: [2.4682299941778183, 16.204063341021538, 71.93047417534723]

Is there any other way to reduce jitter in 3D pose?

Attached is a canvas showing the results of 3D poses.

[Image: ezgif.com-gif-maker]

Daniil-Osokin commented 3 years ago

Hi! If the one-euro filter does not give the expected result, you can try a simple weighted average of the new and previous coordinates, such as (1-alpha)*x_previous + alpha*x_current, where alpha is the weight of the new coordinate, e.g. 0.2.
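A minimal sketch of this weighted averaging for a single 3D point might look like the following. The class name ExponentialSmoother and the sample coordinates are made up for the example. In practice one smoother per tracked keypoint (or per translation vector) would be kept alive across frames.

```python
class ExponentialSmoother:
    """Blends each new 3D point with the previously smoothed one."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # weight of the new coordinate
        self.previous = None    # last smoothed point, None before the first frame

    def __call__(self, current):
        if self.previous is None:
            # First frame: nothing to average with, so the point passes through.
            self.previous = list(current)
        else:
            self.previous = [(1.0 - self.alpha) * p + self.alpha * c
                             for p, c in zip(self.previous, current)]
        return list(self.previous)


smoother = ExponentialSmoother(alpha=0.2)
first = smoother([2.0, 16.0, 72.0])   # returned unchanged
second = smoother([3.0, 16.0, 70.0])  # pulled only 20% of the way toward the new point
```

Note that x_previous here is the previous smoothed value, not the previous raw measurement. That is what makes this a low-pass filter and not a two-frame average.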

connernam commented 3 years ago

Thanks.

Is it similar to the existing method used for 2D smoothing (in terms of comparing the previous and current states), like the propagate_ids method in pose.py?

If the one-euro filter's LowPassFilter class deals only with the previous value, does adding the current value and combining the two solve the 3D jitter?

Daniil-Osokin commented 3 years ago

It is a low-pass filter, a simpler alternative to the one-euro filter. It has just one parameter, which makes it easy to tune to the smoothness you want.

connernam commented 3 years ago

In summary, a low-pass filter has only one parameter that represents the previous coordinates.

To reduce jitter in 3D poses, the current coordinates are fed into the low-pass filter, which computes: (1-alpha)*x_previous + alpha*x_current.

As in the code from my first question, the previous and current values of the 3D pose (poses_3d in parse_poses.py) are passed to the one-euro filter. These values are x_previous and x_current, respectively.

Am I understanding this correctly?

Thank you for always responding kindly.

Daniil-Osokin commented 3 years ago

Looks so. A low-pass filter just weights the current coordinate against the previous coordinate. If you need more smoothing, use smaller alpha values.
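To make the effect of alpha concrete, the sketch below runs the same weighted average over a synthetic jittery coordinate (a made-up signal that alternates by ±0.5 around 10) and measures how much jitter remains. The helper names are hypothetical.

```python
def smooth(samples, alpha):
    """Apply (1-alpha)*x_previous + alpha*x_current over a sequence of scalars."""
    out, prev = [], None
    for x in samples:
        prev = x if prev is None else (1.0 - alpha) * prev + alpha * x
        out.append(prev)
    return out


def peak_to_peak(values):
    return max(values) - min(values)


# Synthetic jitter: the coordinate flips between 9.5 and 10.5 on every frame.
jittery = [10.0 + (0.5 if i % 2 else -0.5) for i in range(40)]

for alpha in (0.8, 0.2):
    tail = smooth(jittery, alpha)[-10:]  # look at the settled part of the output
    print(f"alpha={alpha}: remaining jitter {peak_to_peak(tail):.3f}")
```

The smaller alpha leaves much less residual jitter. The cost is that the smoothed value also reacts more slowly to real motion, so alpha is a trade-off between stability and lag.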

connernam commented 3 years ago

Thank you. I'll try as you tell me!