AlessioSam / CHICO-PoseForecasting

Repository for "Pose Forecasting in Industrial Human-Robot Collaboration" (ECCV 2022)

dataset use for action recognition #4

Closed: yuchen-ji closed this issue 1 year ago

yuchen-ji commented 2 years ago

Hello, I have just started working in this field, and I would like to ask you some questions.

  1. Can your dataset support action recognition in human-robot collaboration scenarios?
  2. I know some skeleton-based methods for action recognition. Is your dataset compatible with the SMPL-X representation? Thank you so much!
federicocunico commented 2 years ago

Hello,

  1. Yes, the dataset provides skeleton and RGB information divided by "actions". Each action therefore serves as a label that you may use for action recognition, with methods based on skeleton data, RGB frames, or a combination of both, as you prefer (see the loading sketch after this list).
  2. As far as I know, SMPL is based on meshes, which in this case were not acquired. You may try to reconstruct the meshes using Blender or other 3D software.
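To make the first point concrete, here is a minimal sketch of how one might assemble (skeleton sequence, action label) pairs for an action-recognition classifier. The directory layout, file format, and `DATA_ROOT` path are assumptions for illustration only; the dataset's actual loading logic is defined in `datasets/chico_dataset.py` in the repository.

```python
from pathlib import Path
import pickle

import numpy as np

# Hypothetical layout: one subdirectory per action, each holding pickled
# skeleton sequences of shape (num_frames, num_joints, 3). Check the real
# loader in datasets/chico_dataset.py for the actual format.
DATA_ROOT = Path("chico_data")


def load_action_samples(root: Path):
    """Yield (skeleton_sequence, action_label) pairs for action recognition."""
    for action_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        label = action_dir.name  # the action name doubles as the class label
        for seq_file in sorted(action_dir.glob("*.pkl")):
            with open(seq_file, "rb") as f:
                skeleton = np.asarray(pickle.load(f))  # (frames, joints, 3)
            yield skeleton, label


samples = list(load_action_samples(DATA_ROOT))
print(f"{len(samples)} labelled sequences")
```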
yuchen-ji commented 2 years ago

Thank you so much!

yuchen-ji commented 2 years ago

> Hello,
>
> 1. Yes, the dataset provides skeleton and RGB information divided by "actions". Each action therefore serves as a label that you may use for action recognition, with methods based on skeleton data, RGB frames, or a combination of both, as you prefer.
> 2. As far as I know, SMPL is based on meshes, which in this case were not acquired. You may try to reconstruct the meshes using Blender or other 3D software.

Hello, I have a few more questions:

  1. Do the human body joints defined in your dataset correspond to the joints defined in other datasets, such as 3DPW or Human3.6M?
  2. Could I use a human keypoint detection algorithm to predict the joints, and then use these predicted joints (network outputs, not ground truth) to train a model to recognize actions? Would a model trained this way be reliable?
federicocunico commented 1 year ago

Hi,

  1. The keypoints are a modified version of the COCO 2D Keypoints format. See the precise definition here: https://github.com/federicocunico/human-robot-collaboration/blob/master/datasets/chico_dataset.py#L66

    In particular, w.r.t. the COCO definition, the ears and eyes are not present, and a "hip" joint (the center of the hip) has been added as the interpolation of the left and right hips. Note that the keypoint order differs from COCO, as you can see in the link above (a small sketch follows this list).

  2. You may definitely try; I see no reason to discourage you from trying it. Keep in mind that, in the CHICO dataset, each acquisition can be seen as a sort of action performed in a loop; the sequences are not trimmed, though.
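As an illustration of the keypoint remark in point 1, the sketch below drops the eyes and ears from the standard 17-joint COCO layout and appends a "hip" joint interpolated as the midpoint of the left and right hips. The index constants and the resulting order are illustrative assumptions; the authoritative CHICO joint order is the one defined in `chico_dataset.py` linked above.

```python
import numpy as np

# COCO 2D keypoint indices (standard 17-joint layout).
COCO_L_EYE, COCO_R_EYE = 1, 2
COCO_L_EAR, COCO_R_EAR = 3, 4
COCO_L_HIP, COCO_R_HIP = 11, 12

# Joints removed from COCO when building the CHICO-style layout.
DROPPED = {COCO_L_EYE, COCO_R_EYE, COCO_L_EAR, COCO_R_EAR}


def coco_to_chico_joints(coco_kpts: np.ndarray) -> np.ndarray:
    """Drop eyes/ears and append a 'hip' joint interpolated from both hips.

    NOTE: this reproduces the *content* of the CHICO joint set only; the
    actual ordering differs and is defined in datasets/chico_dataset.py.
    """
    kept = [kp for i, kp in enumerate(coco_kpts) if i not in DROPPED]
    hip = 0.5 * (coco_kpts[COCO_L_HIP] + coco_kpts[COCO_R_HIP])
    return np.vstack(kept + [hip])


coco = np.random.rand(17, 2)       # e.g. 17 (x, y) keypoints
chico_like = coco_to_chico_joints(coco)
print(chico_like.shape)            # (14, 2): 17 - 4 dropped + 1 hip
```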