zju3dv / SMAP

[ECCV 2020] SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
Apache License 2.0
241 stars 37 forks

Test & Train set of CMU panoptic #24

Closed by hongsukchoi 3 years ago

hongsukchoi commented 3 years ago

I finally found the detailed configuration for the CMU panoptic. Thanks! https://github.com/zju3dv/SMAP/issues/1#issuecomment-a

However, it is still unclear which images you used for training and testing. The paper says:

Following [41], we choose two cameras (16 and 30), 9600 images from four activities (Haggling, Mafia, Ultimatum, Pizza) as our test set, and 160k images from different sequences as our training set.

As far as I know, there are many more than 9600 image frames in the test sequences you listed here https://github.com/zju3dv/SMAP/issues/1#issuecomment-a.

I hope you can share the specific frame sampling configuration!

raypine commented 3 years ago

For each activity, 2400 frames are uniformly sampled. If there are two sequences from the same activity, 1200 frames are sampled from each sequence. For example, [160226_haggling1] and [160422_haggling2] each provide 1200 frames.
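A minimal sketch of what "uniformly sampled" could look like, for readers trying to reproduce the split. This is not the authors' sampling code: the function name `uniform_sample_indices` and the floor-based spacing rule are assumptions, and the exact frame indices used in the paper are not given in this thread.

```python
from typing import List


def uniform_sample_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick num_samples frame indices spread evenly over range(num_frames).

    Hypothetical helper: the spacing rule (integer floor of evenly spaced
    positions) is an assumption, not the rule used by the SMAP authors.
    """
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        # Not enough frames to subsample: return every frame.
        return list(range(num_frames))
    # Floor division keeps every index inside the sequence even when
    # num_frames is not a multiple of num_samples.
    return [(i * num_frames) // num_samples for i in range(num_samples)]


# Example: a sequence with 5000 frames, 1200 of which are kept.
indices = uniform_sample_indices(5000, 1200)
print(len(indices), indices[:5])
```

With 2400 frames per activity and four activities, this gives the 4 × 2400 = 9600 test images quoted from the paper.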

hongsukchoi commented 3 years ago

Could you also tell me which joints you used for evaluation? Some sequences have 19 coco joint annotations, but some have 15 mpii joint annotations. Did you use different joints for different sequences?

raypine commented 3 years ago

15 mpii joint annotations for all sequences.

hongsukchoi commented 3 years ago

wow I really appreciate your prompt answer!!!

nicolasugrinovic commented 2 years ago

Hi, kudos for the great work!

To be clear, did you train your model with the 15 mpii joints, and therefore only with VGA images rather than HD images? On the dataset web page, the 15 mpii joints are associated with VGA images.

Also, in case you used HD images: there seems to be a slight difference between the 15 mpii and coco19 annotations in terms of translation and body rotation. Have you done anything to handle this?

Thanks in advance!

raypine commented 2 years ago

@nicolasugrinovic We use HD images with mpi15 annotations. The mpi15 annotations were provided by CMU for these videos (downloaded via scripts), at least at that time. We have provided our conversion code from coco to mpi15 (lib/preprocess/create_annot).

nicolasugrinovic commented 2 years ago

Great, thank you very much for the prompt reply.

nicolasugrinovic commented 2 years ago

Hi, one question regarding the image processing here. When you rescale the images to size 832×512, do you rescale them directly, or do you keep the aspect ratio and pad the image?

Would really appreciate your response. Thanks!
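To make the two options concrete, here is a rough sketch of the arithmetic for each. It assumes the HD frames are 1920×1080, which is not stated in this thread, and the helper names `direct_resize_scales` and `letterbox_params` are made up for illustration. It does not say which option SMAP actually uses.

```python
def direct_resize_scales(src_w, src_h, dst_w, dst_h):
    """Direct resize: x and y are scaled independently (aspect ratio changes)."""
    return dst_w / src_w, dst_h / src_h


def letterbox_params(src_w, src_h, dst_w, dst_h):
    """Aspect-preserving resize: one shared scale, remainder filled by padding."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Total padding needed along each axis to reach the target size.
    pad_x = dst_w - new_w
    pad_y = dst_h - new_h
    return scale, new_w, new_h, pad_x, pad_y


# Assumed source resolution: 1920x1080. Target size from the question: 832x512.
sx, sy = direct_resize_scales(1920, 1080, 832, 512)
print(f"direct resize: sx={sx:.4f}, sy={sy:.4f}")

scale, new_w, new_h, pad_x, pad_y = letterbox_params(1920, 1080, 832, 512)
print(f"keep aspect: scale={scale:.4f}, resized={new_w}x{new_h}, pad=({pad_x}, {pad_y})")
```

Under that assumption, a direct resize stretches the image vertically (the two scale factors differ), while the aspect-preserving variant yields an 832×468 image that needs 44 rows of padding. The choice matters because 2D keypoints and camera intrinsics have to be transformed in the same way as the image.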

Juzezhang commented 2 years ago

Hi, it was mentioned that "2400 frames are uniformly sampled". However, not every activity has a frame count that is a multiple of 2400. For a fair comparison, could you provide the frame indices, or give more detail about the sampling rule?