microsoft / MeshTransformer

Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"
https://arxiv.org/abs/2012.09760
MIT License

sequence information in TSV data files #13

Closed jkamalu closed 2 years ago

jkamalu commented 3 years ago

Hello,

Thank you for publishing such a complete project for your work. It is very much appreciated!

Is there a way to figure out the correspondence between the person id and the timestep for video datasets? For example, I want to edit the dataloader to randomly sample image crops of the same person in chronological order e.g. [person i @ frame 0 in video k, person i @ frame 1 in video k, person i @ frame 2 in video k, etc. ] where i and k are randomly sampled.

Is this possible given the meta data for 3DPW, Human 3.6 Million, or any of the other video based datasets? Or would I have to generate my own .tsv files? I would appreciate any and all help you can give!

Thanks

kevinlin311tw commented 3 years ago

Thank you for your interest in our work!

Unfortunately, we don't have person id information in our current TSV files. So, there is probably no way to find the correspondence between the person id and the timestep, especially for 3DPW :(

For H3.6M, I think it may be possible to do it.

H3.6M has only a single person in each video, so we just need to find the timestamp. The timestamp can be estimated from the image key:
For example, given the image key images/S1_Directions_1.54138969_000001.jpg, the format is images/[subjectID]_[Action]_[CameraID]_[FrameID].jpg, so the FrameID field gives the frame's position within its video.
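To make the key format above concrete, here is a minimal sketch of how H3.6M rows could be grouped into per-video sequences and sampled in chronological order. It is not code from this repository: `parse_image_key`, `group_by_sequence`, and `sample_clip` are hypothetical helper names, and the input is assumed to be a plain list of image keys in TSV row order. Everything between the subject ID and the frame ID is kept as one opaque "clip" string, so the sketch does not depend on exactly how the action and camera fields are delimited.

```python
import os
import random
from collections import defaultdict


def parse_image_key(key):
    """Split a key like images/S1_Directions_1.54138969_000001.jpg
    into (subject, clip, frame). 'clip' is the opaque action/camera part."""
    stem = os.path.splitext(os.path.basename(key))[0]
    head, frame = stem.rsplit("_", 1)    # last field is the frame ID
    subject, clip = head.split("_", 1)   # first field is the subject ID
    return subject, clip, int(frame)


def group_by_sequence(keys):
    """Map (subject, clip) -> list of row indices sorted by frame ID."""
    sequences = defaultdict(list)
    for row, key in enumerate(keys):
        subject, clip, frame = parse_image_key(key)
        sequences[(subject, clip)].append((frame, row))
    return {seq: [row for _, row in sorted(items)]
            for seq, items in sequences.items()}


def sample_clip(sequences, length, rng=random):
    """Pick a random sequence and return `length` chronologically
    ordered row indices from it."""
    candidates = [rows for rows in sequences.values() if len(rows) >= length]
    rows = rng.choice(candidates)
    start = rng.randrange(len(rows) - length + 1)
    return rows[start:start + length]
```

The returned row indices could then be used to fetch the matching image crops and annotations in a custom dataloader. Note that neighbouring rows are only neighbouring frames if the TSV was built without frame subsampling, which this sketch does not check.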

leafxx commented 2 years ago

Hi Kevin, why is the CameraID "1.54138969", a float value? In H36M, cam_idx is an int value.