MiracleDance / PoseRAC

PoseRAC: Pose Saliency Transformer for Repetitive Action Counting
MIT License

Order of action recognition and salient pose recognition: how are the salient poses of different actions recognized? #10

Closed DanielC-MST closed 1 year ago

DanielC-MST commented 1 year ago

During recognition, did the authors run action recognition first and then use an action-specific model to recognize the salient poses? Or is it a single action-agnostic model trained on a mixture of all 8 actions? If so, how does it separate the salient poses of different actions, given that the poses differ greatly from one action to another?

MiracleDance commented 1 year ago

We directly classify salient poses.

For the video-level action category, there are two options: (1) manually specify the action category to be counted, or (2) count the action category that occurs most frequently over the entire video.

DanielC-MST commented 1 year ago

Thanks. Do you mean that, for a new input video, the model extracts the skeleton frame by frame and compares each frame's skeleton with the salient poses to determine the action type? What if a long video contains two different action types (both among the 8 actions)? Does the model need to detect at least one salient pose of a specific action to know that a new action is happening?

MiracleDance commented 1 year ago

At present, in the datasets for repetitive action counting, only one action category appears in each video. We have not yet run experiments on videos that contain multiple action categories.

But our method may be able to handle this case: just classify each frame. Each frame gets a category that says which action it belongs to and which salient pose it shows. Whenever the two salient poses of one action appear in sequence, the count for that action increases by one.
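
A minimal sketch of that per-frame counting idea, for illustration only. It is not taken from the PoseRAC code. The function name `count_repetitions`, the `(action, pose)` label format, and the convention that pose `0` followed by pose `1` completes one repetition are all assumptions made for this example.

```python
from collections import defaultdict


def count_repetitions(frame_labels):
    """Count repetitions per action from per-frame labels.

    frame_labels: iterable of (action, pose) tuples, where pose is 0 or 1
    (the two salient poses of that action), or None for frames that show
    no salient pose. This label format is hypothetical.
    """
    counts = defaultdict(int)
    last_pose = {}  # action -> most recent salient pose seen for it

    for label in frame_labels:
        if label is None:
            continue  # frame without a salient pose: nothing to update
        action, pose = label
        # Assumption: one repetition completes when pose 1 follows pose 0
        # of the same action.
        if pose == 1 and last_pose.get(action) == 0:
            counts[action] += 1
        last_pose[action] = pose

    return dict(counts)


labels = [
    ("jumping_jack", 0), ("jumping_jack", 0), ("jumping_jack", 1),
    None,
    ("jumping_jack", 0), ("jumping_jack", 1),
    ("squat", 0), ("squat", 1),
]
print(count_repetitions(labels))  # {'jumping_jack': 2, 'squat': 1}
```

Because the state is tracked per action, a video that contains two action types yields a separate count for each one. Real per-frame predictions are noisy, so in practice the labels would probably need some smoothing or thresholding before counting.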

DanielC-MST commented 1 year ago

Thanks, I understand. Based on that, is it possible that, in a jumping jack video, the action is recognized as a different action at the beginning because of wrongly recognized salient poses or skeleton detection issues? Suppose the camera view then changes and the salient poses are detected correctly. In that case, will the action recognition be corrected, or will it stay as the initially detected action type once the first pair of salient poses is detected?

MiracleDance commented 1 year ago

Even if the wrong category is detected at the beginning, once the following frames are predicted as salient poses of the jumping jack, the most frequent action over the entire video will still be jumping jack, so the category that was wrongly detected at the beginning will be ignored.
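
A small sketch of that majority vote, under stated assumptions. The function name `majority_action` and the list-of-strings input are hypothetical and only illustrate the idea, not the repository's implementation.

```python
from collections import Counter


def majority_action(frame_actions):
    """Return the action label predicted for the most frames.

    frame_actions: iterable of per-frame action labels (strings), with
    None for frames that have no prediction. Returns None if no frame
    has a prediction.
    """
    votes = Counter(a for a in frame_actions if a is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# A few wrong predictions at the start are outvoted by the later frames.
predictions = ["squat"] * 3 + ["jumping_jack"] * 20
print(majority_action(predictions))  # jumping_jack
```

The vote is taken over the whole video, so the result does not depend on which action was detected first. It does assume the video contains a single action category, as in the current datasets.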