zhuhao-nju / hmd

Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation (CVPR2019 Oral)

Kinect dataset #11

Closed mks0601 closed 4 years ago

mks0601 commented 4 years ago

Hi, thanks for sharing the code of your paper. Could you share the Kinect depth map dataset used to train Shading-Net?

wangsen1312 commented 4 years ago

We have uploaded it to Baidu Yun. Link: https://pan.baidu.com/s/1SVSNsAJ_S0u-YS6ata_IJw (extraction code: gu2r)

Data structure:
- gtD_XXXX.bin: captured depth data
- inputC_XXXX.jpg: reference color image
- smoothD_XXXX.bin: smoothed depth data

Simple MATLAB read demo:

```matlab
fileID = fopen('gtD_XXXX.bin');
img = fread(fileID, [448,448], 'float32');
imagesc(img)
```
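For anyone working in Python, a minimal equivalent read is sketched below, assuming the .bin files are headerless raw 448x448 float32 buffers, as the fread call above implies:

```python
import numpy as np
import matplotlib.pyplot as plt

# Read the raw float32 buffer (assumption: no header, 448*448 values).
depth = np.fromfile("gtD_XXXX.bin", dtype=np.float32)

# MATLAB's fread fills the [448,448] matrix column-major, so use Fortran
# order here to get the same orientation as imagesc.
depth = depth.reshape((448, 448), order="F")

plt.imshow(depth)
plt.colorbar()
plt.show()
```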

If you have any questions about the dataset, please contact us.

mks0601 commented 4 years ago

Great. Thanks! Unfortunately, I cannot download the data from Baidu because of an unknown error :( Could you upload the data to another cloud service, such as Google Drive?

wangsen1312 commented 4 years ago

@mks0601 We have uploaded it to Google Drive: https://drive.google.com/file/d/1c2oBvBNzbK0E-UeiYvvhHEmDE3w4L0Me/view?usp=sharing

mks0601 commented 4 years ago

Perfect. Thanks! Could you also share the depth maps estimated on the MSCOCO dataset using ShadingNet?

wangsen1312 commented 4 years ago

@mks0601 We didn't do any depth estimation on those datasets; we use HMR as the initialisation.

mks0601 commented 4 years ago

Also, could you explain the filename rules of the dataset? There seem to be gtD_xxxx.bin, inputC_xxxx.jpg, inputC_xxxx-r.png, inputC_xxxx-s.png, and smoothD_xxxx.bin files.

I think gtD_xxxx.bin, inputC_xxxx.jpg, and smoothD_xxxx.bin are the ground-truth sharp depth map, the input image, and the smooth depth map, respectively. What do inputC_xxxx-r.png and inputC_xxxx-s.png stand for?

wangsen1312 commented 4 years ago

inputC_xxxx-r.png and inputC_xxxx-s.png represent the albedo and the shading of the color image, respectively.
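If the files follow the standard intrinsic-image convention (an assumption; the thread does not state it explicitly), the albedo and shading should multiply back to roughly the input color image. A quick sanity check in Python:

```python
import numpy as np
from PIL import Image

# Load the color image, albedo (-r) and shading (-s); filenames are examples.
color   = np.asarray(Image.open("inputC_xxxx.jpg").convert("RGB"),   np.float32) / 255.0
albedo  = np.asarray(Image.open("inputC_xxxx-r.png").convert("RGB"), np.float32) / 255.0
shading = np.asarray(Image.open("inputC_xxxx-s.png").convert("RGB"), np.float32) / 255.0

# Standard intrinsic-image model: color ~= albedo * shading.
recon = np.clip(albedo * shading, 0.0, 1.0)
print("mean abs difference vs. color image:", np.abs(recon - color).mean())
```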

mks0601 commented 4 years ago

> @mks0601 We didn't do any depth estimation on those datasets; we use HMR as the initialisation.

Hmm... maybe I misunderstand your paper, but as I understand it, you use ShadingNet to get the refined depth map in Figure 2. You first train ShadingNet on your Kinect depth map dataset, and the refined depth map from ShadingNet is then used to deform the mesh without detail so that it gains detailed geometry. What I asked was whether you could share the refined depth maps produced by ShadingNet.

In addition, I want to ask several questions about how you 'deform' the meshes. The dimensions of the deformation vectors from the 'joint deform' and the 'anchor deform' stages are not described in the paper, which confuses me.

  1. If the dimension of the 'joint deform' vector is J x 3 (J = the number of joints), only the vertices tied to body joints would be affected by the joint deformation vector. That could produce a very strange mesh, because most vertices stay fixed while only the few vertices tied to body joints move. What do you think about this? The same question holds for the 'anchor deform' (a toy sketch of this concern is given after these questions).

  2. How do you deform the mesh using the refined depth map from ShadingNet? The refined depth map only provides depth for the visible area, while the mesh is a full 3D model that also covers areas invisible in the image.
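For concreteness, here is a toy sketch of the concern in question 1, with hypothetical vertex/joint counts and random weights. It is not the paper's actual deformation scheme, only an illustration of why a J x 3 offset needs to be propagated smoothly over the whole mesh:

```python
import numpy as np

V, J = 1000, 14                        # hypothetical vertex / joint counts
vertices = np.random.rand(V, 3)        # placeholder mesh vertices
joint_offsets = np.random.rand(J, 3)   # the J x 3 "joint deform" vector

# Naive interpretation: each joint moves only its single tied vertex, so the
# rest of the mesh stays put and the surface tears around those vertices.
tied_vertex = np.random.choice(V, size=J, replace=False)
naive = vertices.copy()
naive[tied_vertex] += joint_offsets

# Smoother interpretation: per-vertex weights (skinning-style) spread every
# joint offset over the whole mesh, so neighbouring vertices move together.
weights = np.random.rand(V, J)
weights /= weights.sum(axis=1, keepdims=True)    # each row sums to 1
smooth = vertices + weights @ joint_offsets      # (V, J) @ (J, 3) -> (V, 3)
```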