kennymckormick / pyskl

A toolbox for skeleton-based action recognition.
Apache License 2.0
898 stars 174 forks

How to create 3D custom dataset? #239

Open seekFire opened 3 months ago

seekFire commented 3 months ago

I noticed the Python script (tools/data/custom_2d_skeleton.py) for creating custom 2D data. How do I create a 3D dataset, and which script or tools should we use?

HoBeom commented 3 months ago

The experiments were conducted with 2D coordinates, and the code isn't optimized for 3D. However, I have experience adapting it for 3D. By projecting 25 3D keypoints onto the xy, xz, and yz planes, which yields 75 channels, I was able to achieve high performance. To create a 3D dataset, follow the pyskl approach: collect 3D keypoints for your data (video), then convert them into the 75-keypoint format by projecting them onto the xy, xz, and yz planes.

seekFire commented 3 months ago

@HoBeom Thank you for your reply! So the 75-keypoint format looks like [x0, y0, 0, x0, 0, z0, 0, y0, z0, x1, y1, 0, x1, 0, z1, 0, y1, z1, ...]. Is that right? BTW, tools/data/custom_2d_skeleton.py defines some fields that custom 2D data should contain, such as img_shape, total_frames, and num_person_raw. Does a custom 3D dataset need to follow these settings?

HoBeom commented 3 months ago

@seekFire 3D coordinates [[x0, y0, z0], [x1, y1, z1], ..., [xn, yn, zn]] are transformed into 2D as [[x0, y0], ..., [xn, yn], [x0, z0], ..., [xn, zn], [y0, z0], ..., [yn, zn]]. Add the confidence score to get the (x, y, c) form, and reshape to (75, 3). img_shape: you should adjust this if you use RGB together with the skeleton (PoseCompact? transform). total_frames: the frame length of the video must be correct for sampling. num_person_raw: it defines how many people are in the video; set it to the maximum number when building the dataset.
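
The transformation described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions, not code from pyskl: the function name `project_to_planes` is hypothetical, the input is assumed to be one person's keypoints with shape (T, 25, 3), and confidence scores default to 1.0 when the 3D source provides none.

```python
import numpy as np


def project_to_planes(kpts_3d, scores=None):
    """Hypothetical helper: (T, V, 3) 3D keypoints -> (T, 3V, 3) rows of (a, b, c).

    The V keypoints are laid out as all xy pairs, then all xz pairs,
    then all yz pairs, with the confidence score appended as the last column.
    """
    kpts_3d = np.asarray(kpts_3d, dtype=np.float32)
    num_frames, num_kpts, _ = kpts_3d.shape
    if scores is None:
        # Assumption: no per-keypoint confidence is available, so use 1.0.
        scores = np.ones((num_frames, num_kpts), dtype=np.float32)

    xy = kpts_3d[..., [0, 1]]
    xz = kpts_3d[..., [0, 2]]
    yz = kpts_3d[..., [1, 2]]
    planes = np.concatenate([xy, xz, yz], axis=1)      # (T, 3V, 2)

    # Repeat each keypoint's score once per plane, in the same order.
    conf = np.tile(scores, (1, 3))[..., None]          # (T, 3V, 1)
    return np.concatenate([planes, conf], axis=-1)     # (T, 3V, 3)


if __name__ == "__main__":
    demo = np.random.rand(16, 25, 3)
    print(project_to_planes(demo).shape)
```

With 25 input keypoints each frame ends up with shape (75, 3), matching the reshape mentioned above. Note that this grouped ordering differs from the interleaved layout asked about earlier in the thread.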

seekFire commented 3 months ago

@HoBeom Got it! Thank you for your detailed response! Thank you very much!

dvskabangira commented 2 months ago

Hello, have you managed to convert the custom video dataset into 3D? I have tried, but it doesn't seem to work.

HoBeom commented 2 months ago

@dvskabangira The methods mentioned above are literally “tricks”. No paper has yet reported experiments that project 3D coordinates into a 2D heatmap representation. However, I achieved high performance in my experiments by projecting SMPL's 25 3D coordinates into 2D.

seekFire commented 2 months ago

@HoBeom I think something may need to be improved. There are two fields defined in tools/data/custom_2d_skeleton.py: keypoint and keypoint_score. So I think the keypoint field should have shape (1, T, 75, 2) and the keypoint_score field should have shape (1, T, 75). The two parts should not be concatenated.
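
A minimal sketch of the layout suggested here, keeping coordinates and scores in separate arrays. The helper name `build_annotation` is hypothetical, and only the fields named in this thread are filled in; any other fields that tools/data/custom_2d_skeleton.py writes are left out. Whether the projected coordinates need rescaling to match img_shape is not stated in the thread, so the sketch leaves them untouched.

```python
import numpy as np


def build_annotation(kpts_3d, scores, img_shape):
    """Hypothetical helper.

    kpts_3d: (M, T, V, 3) 3D keypoints for M persons over T frames.
    scores:  (M, T, V) confidence per keypoint.
    Returns a dict with keypoint (M, T, 3V, 2) and keypoint_score (M, T, 3V).
    """
    kpts_3d = np.asarray(kpts_3d, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    num_person, num_frames = kpts_3d.shape[:2]

    # Project onto the xy, xz, and yz planes and stack along the keypoint axis.
    keypoint = np.concatenate(
        [kpts_3d[..., [0, 1]], kpts_3d[..., [0, 2]], kpts_3d[..., [1, 2]]],
        axis=2,
    )
    # One copy of the scores per plane, in the same order as the keypoints.
    keypoint_score = np.tile(scores, (1, 1, 3))

    return dict(
        keypoint=keypoint,
        keypoint_score=keypoint_score,
        img_shape=img_shape,
        total_frames=num_frames,
        num_person_raw=num_person,
    )


if __name__ == "__main__":
    ann = build_annotation(np.zeros((1, 8, 25, 3)), np.ones((1, 8, 25)), (1080, 1920))
    print(ann["keypoint"].shape, ann["keypoint_score"].shape)
```

For a single person with 25 keypoints this gives keypoint of shape (1, T, 75, 2) and keypoint_score of shape (1, T, 75), as proposed above.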

dvskabangira commented 2 months ago

I was reading through his paper on working with 3D skeletons in PoseConv3D. They also used a pre-defined 3D skeleton dataset, which was also projected to 2D. I have already pre-processed my custom RGB videos into the split values format for PoseC3D, but the problem I am having is converting them into 3D instead of 2D.

HoBeom commented 2 months ago

@dvskabangira The paper mentions that the projection takes place in the 2D image space and that the number of 3D keypoints matches the number of channels, thereby resulting in information loss. The method I described involves tripling the number of channels while projecting onto a 2D heatmap. This is also a different approach compared to using a 3D heatmap (voxel).

dvskabangira commented 2 months ago

@HoBeom Thanks for the detailed explanation. Can I email you, if you don't mind?

HoBeom commented 2 months ago

@dvskabangira Yes, you can email me.