shurans / sscnet

Semantic Scene Completion from a Single Depth Image
http://sscnet.cs.princeton.edu/

Problem testing with my own dataset #11

Closed yxliwhu closed 7 years ago

yxliwhu commented 7 years ago

@shurans Hello, I have got the test results on the NYU dataset, and now I want to test on my own dataset, but I do not know how to generate the .bin file from the depth data. Could you help me? Thanks and regards.

Fromandto commented 7 years ago

'Example code to convert NYU ground truth data: matlab_code/perpareNYUCADdata.m This function provides an example of how to convert the NYU ground truth from 3D CAD model annotations provided by: Guo, Ruiqi, Chuhang Zou, and Derek Hoiem. "Predicting complete 3d models of indoor scenes." You need to download the original annotations by running download_UIUCCAD.sh.'

yxliwhu commented 7 years ago

@Fromandto Thanks for the reply, but I think you misunderstood my question. I want to test on data I captured myself, which means I only have the depth information (.png). How can I get the .bin file used for testing?

yxliwhu commented 7 years ago

@shurans could you provide some advice about this problem?

Fromandto commented 7 years ago

https://github.com/arron2003/rgbd2full3d

yxliwhu commented 7 years ago

@Fromandto In my opinion, the link describes how to obtain the ground truth dataset, not how to get the '.bin' file from a '.png' file. Maybe there is some other solution. From the paper, the '.bin' stores the information of the 3D volume: "First three float stores the origin of the 3D volume in world coordinate. Then 16 float of camera pose in world coordinate. Followed by the 3D volume encoded by run-length encoding. Please refer to ./matlab_code/utils/readRLEfile.m for more details." That means what is stored is the result of the TSDF, so maybe the solution lies there. I am not sure. @andyzeng Please give some advice.
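For reference, a minimal sketch of parsing that layout (3-float origin, 16-float camera pose, then run-length pairs), assuming everything is stored as single-precision floats and the volume is 240x144x240. The filename is just an example; the authoritative reference is matlab_code/utils/readRLEfile.m:

```matlab
% Hedged sketch mirroring the .bin description above, not the official reader.
fid = fopen('scene0001.bin', 'r');               % example filename
voxOriginWorld = fread(fid, 3, 'single');        % volume origin in world coords
camPoseArr     = fread(fid, 16, 'single');       % 4x4 camera pose, flattened (assumed row-major)
rle            = fread(fid, inf, 'single');      % (value, count) pairs (assumed)
fclose(fid);
vals   = rle(1:2:end);
counts = rle(2:2:end);
sceneVox = reshape(repelem(vals, counts), 240, 144, 240);   % assumed volume size
```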

Fromandto commented 7 years ago

The .bin stores the ground truth; suncg_data_layer.cu converts the .png into a TSDF.

shurans commented 7 years ago

Yes, the .bin stores the rotation matrix and the ground truth volume, and suncg_data_layer.cu converts the .png into a TSDF. You can reference genSUNCGdataScript and perpareNYUCADdata to see how the volume is generated for the SUNCG or NYU dataset.
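A rough sketch of writing such a .bin for your own data, assuming the layout quoted earlier (3-float origin, 16-float pose, then run-length (value, count) pairs, all as singles). The function name is made up; check perpareNYUCADdata.m / readRLEfile.m before trusting the exact byte layout:

```matlab
% Hypothetical writer for an SSCNet-style .bin; layout assumed, not verified.
function writeSceneBin(filename, voxOriginWorld, camPoseArr, sceneVox)
    fid = fopen(filename, 'w');
    fwrite(fid, single(voxOriginWorld(:)), 'single');   % volume origin (3 floats)
    fwrite(fid, single(camPoseArr(:)), 'single');       % camera pose (16 floats)
    v = single(sceneVox(:));
    d = find(diff(v) ~= 0);                             % boundaries between runs
    runStarts = [1; d + 1];
    runEnds   = [d; numel(v)];
    rle = [v(runStarts), runEnds - runStarts + 1]';     % interleave value, count
    fwrite(fid, single(rle(:)), 'single');
    fclose(fid);
end
```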

yxliwhu commented 7 years ago

@shurans @Fromandto That means, if I want to test on my own data, I have to obtain its ground truth first, because the rotation matrix is stored in the .bin file? This point confuses me: if I already have the ground truth result, why would I use the application for prediction? Should I write some other code for prediction only?

Fromandto commented 7 years ago

For a large-scale evaluation on your dataset, a workaround is to add fake ground truth to your .bin file (like an all-zero tensor).

yxliwhu commented 7 years ago

@Fromandto I got it. What you mean is something like this: if I want to test my own large-scale dataset, first I should use rgbd2full3d and fake ground truth to generate a full 3D dataset; second, use that 3D dataset with perpareNYUCADdata to generate the .bin files; finally, run the test program to get the result. Is that right?

Fromandto commented 7 years ago

I think you can skip the first step. Just use all-zero tensors in the corresponding lines of prepareNYUCADdata.
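Concretely, the stand-in could be nothing more than an all-zero label volume at the full 240x144x240 resolution (a sketch, assuming that is the grid size the .bin expects):

```matlab
% Fake "ground truth": an all-zero label grid, just so a .bin can be written
% and the network run on depth-only data; the labels themselves are meaningless.
sceneVox = zeros(240, 144, 240, 'single');
```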

yxliwhu commented 7 years ago

@Fromandto I checked the perpareNYUCADdata code and found that the values of 'camPoseArr' and 'voxOriginWorld' are calculated from 'extCam2World'. At the same time, the value of 'extCam2World' depends on the value of 'floorHeight'. Unfortunately, 'floorHeight' is calculated as "floorHeight = model.objects{floorId}.model.surfaces{1}.polygon.pts{1}.y;", which comes from rgbd2full3d. I don't know whether I am missing the point.

Fromandto commented 7 years ago

Maybe some (reasonable) fake values for these extrinsic camera parameters will work: parameters that can fit your .png into a 240x144x240 volume.
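A purely illustrative guess at such fake extrinsics (identity rotation, a nominal sensor height, floor at zero). None of these names or offsets come from perpareNYUCADdata.m; compare against that script for the real computation:

```matlab
% All values below are made up for illustration only.
voxUnit = 0.02;                                        % metres per voxel (assumed)
voxSize = [240 144 240];                               % -> 4.8 x 2.88 x 4.8 m volume
floorHeight = 0;                                       % fake: pretend the floor is at 0
camHeight   = 1.2;                                     % fake: typical sensor height
extCam2World = [eye(3), [0; camHeight; 0]; 0 0 0 1];   % identity rotation
camPoseArr = reshape(extCam2World', 1, 16);            % row-major flattening (assumed)
% place the volume in front of the camera, resting on the fake floor
voxOriginWorld = [-voxSize(1)/2 * voxUnit, floorHeight, 0.5];
```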

yxliwhu commented 7 years ago

@Fromandto Sorry, I don't understand what you mean. Referring to suncg_data_layer.cu, computing the TSDF uses the camera parameters, and when I tested different camera parameters on the same .png file I got different .ply results. What should I do?

Fromandto commented 7 years ago

Yes, you may get different .ply results, but I think the important point is that by doing so you stay in control of the difference (up to some translation, rotation, etc.). I think you can just crop out what you need from the output volume (according to the fake parameters).

shurans commented 7 years ago

The extrinsics for the SUNCG and NYU datasets come with the data. To obtain extrinsics for a new dataset you can try this function, rectifyScene.m.zip, which estimates the rotation matrix and floor height based on the Manhattan assumption.
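If you only need a crude floor height to get started (and the camera is already roughly gravity-aligned), a back-projection plus a high percentile of the camera-frame heights can serve as a rough stand-in. This is not what rectifyScene.m does; depth and K are assumed inputs:

```matlab
% Crude floor-height guess, not the Manhattan-based estimate in rectifyScene.m.
% depth: HxW depth map in metres, K: 3x3 intrinsics (both assumed given).
[u, v] = meshgrid(1:size(depth, 2), 1:size(depth, 1));
z = depth(:);
y = (v(:) - K(2,3)) .* z / K(2,2);       % camera-frame height per pixel
ys = sort(y(z > 0));                     % keep only valid depth
floorHeight = ys(round(0.95 * numel(ys)));   % image y grows downward, so the
                                             % floor sits near the largest y
```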

yxliwhu commented 7 years ago

@shurans How can I generate the "*_vol_d4.mat" file for my own dataset?

shurans commented 7 years ago

It depends on your dataset and how the ground truth is represented; you will need to write your own code to generate this file. This file is used for evaluation only, and you need the ground truth room definition to generate it (for the outside-room and outside-ceiling cases). Each voxel takes one of the following values:

- on the surface: 0
- free space: 1
- occluded: -1
- missing depth: -2
- outside FOV: -3
- outside room: -4
- outside ceiling: -5
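To make the encoding concrete, here is a sketch of how those codes might be written into a *_vol_d4.mat-style volume. The generation code itself is not shown in this thread, and every mask name below is hypothetical:

```matlab
% Illustration of the label codes only; the masks are placeholders you would
% have to compute from your own ground truth and room definition.
vol = zeros(60, 36, 60, 'single');   % 240x144x240 downsampled by 4 (assumed)
vol(surfaceMask)      = 0;    % on the surface
vol(freeSpaceMask)    = 1;    % observed free space
vol(occludedMask)     = -1;   % occluded
vol(missingDepthMask) = -2;   % missing depth
vol(outsideFovMask)   = -3;   % outside the camera FOV
vol(outsideRoomMask)  = -4;   % outside the room (needs the room definition)
vol(outsideCeilMask)  = -5;   % above the ceiling (needs the room definition)
```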

yxliwhu commented 7 years ago

@shurans Could you share the code used to generate "*_vol_d4.mat" for the NYU dataset? Thanks.

KnightOfTheMoonlight commented 6 years ago

@yxliwhu I think the function below should be useful for label downsampling. https://github.com/shurans/sscnet/blob/61b4060b435fa93798ea1ced2b87c55bee2433bc/caffe_code/caffe3d_suncg/src/caffe/layers/suncg_util.hpp#L409
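For readers who prefer MATLAB, a plain majority-vote over 4x4x4 blocks conveys the idea behind that DownsampleLabel function; the real C++ has additional rules (empty/occluded handling), and, as the next comment shows, even its output does not reproduce *_vol_d4.mat. The function name here is made up:

```matlab
% Naive mode-based label downsampling; only an approximation of DownsampleLabel.
function labelsDs = downsampleLabelsMode(labels, rate)   % e.g. rate = 4 for "d4"
    sz = size(labels) / rate;                            % assumes divisibility
    labelsDs = zeros(sz, 'like', labels);
    for x = 1:sz(1)
        for y = 1:sz(2)
            for z = 1:sz(3)
                block = labels((x-1)*rate+1 : x*rate, ...
                               (y-1)*rate+1 : y*rate, ...
                               (z-1)*rate+1 : z*rate);
                labelsDs(x, y, z) = mode(block(:));      % most frequent label wins
            end
        end
    end
end
```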

mgarbade commented 6 years ago

@KnightOfTheMoonlight Unfortunately the masks defined in *_vol_d4.mat are not equivalent to the labels output by the DownsampleLabel function, as can be seen from the images below. The left image shows labels (blue) plus voxels that are only part of the semantic evaluation (yellow). The right image shows the mask for the evaluation of scene completion; again, yellow voxels show the additional voxels used during evaluation. The middle image shows the output of the DownsampleLabel function.

screenshot from 2018-03-08 11 39 49

KnightOfTheMoonlight commented 6 years ago

@mgarbade I find the same label issue in evaluation_script, in this line: https://github.com/shurans/sscnet/blob/61b4060b435fa93798ea1ced2b87c55bee2433bc/matlab_code/evaluation_script.m#L67 nonfree_voxels_to_evaluate = abs(vol)<1|vol==-1; — this means sscnet only evaluates the voxels on the surface and in the occluded space.

However, I think the model should evaluate all voxels that are neither out of view nor of invalid depth, so I suppose the voxels to evaluate should be: nonfree_voxels_to_evaluate = ~(sceneVox==255|isnan(sceneVox));
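To compare the two masks side by side (a sketch: vol comes from *_vol_d4.mat, sceneVox holds the ground-truth labels; variable names follow evaluation_script.m where possible, the rest is assumed):

```matlab
% How many voxels each evaluation mask would include, for one scene.
mask_sscnet   = abs(vol) < 1 | vol == -1;               % surface + occluded only
mask_proposed = ~(sceneVox == 255 | isnan(sceneVox));   % everything observed and valid
fprintf('sscnet mask: %d voxels, proposed mask: %d voxels\n', ...
        nnz(mask_sscnet), nnz(mask_proposed));
```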

And as far as I understand, if we want to follow sscnet's evaluation strategy, test data is only usable if we know the ground truth first, since it defines the surface and occluded-space voxels.