Open Roios opened 7 years ago
Hey Roios,
Here are the answers to your questions:
I hope this helps!
Kind regards, Leon
Thank you for your answer Leon.
I'm using a Kinect version 2 and I'm trying to make your code compatible with it.
So, from what I understood, and please correct me if I'm wrong, you receive from the Kinect an array with dimensions 921600 x 1 as the color image. Then you reshape it to get an RGB image with dimensions (640,480). After that, with the depth image and color image both at size (640,480), you register the color image to the depth image in order to create a point cloud with dimensions (640*480, 3). It is this point cloud that you work with afterwards. Am I understanding your idea correctly?
Thank you, Roios
From the color camera a vector of dimension 921600 x 1 is obtained, which is reshaped to create an RGB image with dimensions 640x480x3 (640 x 480 x 3 = 921600).
From the depth camera a 2D matrix of dimension 640x480 is obtained. Because the two cameras are not in the same position, registration is performed, which aligns the depth data with the corresponding color data.
The pointcloud after step 2 is used to find the dominant plane. Once the dominant plane is found it is easy to assign color values with the data from step 3.
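As a small illustration of the reshape step, the sketch below applies the line rgb = permute(reshape(rgb,[3 res]),[3 2 1]) from the code to a synthetic buffer. The tiny resolution res = [4 3] and the interleaved layout (R,G,B of one pixel, then the next pixel, stored row by row) are assumptions made for the example, not something confirmed in this thread. Under the same assumptions, res = [640 480] would give a 480x640x3 array.

```matlab
% Assumed layout: R,G,B of pixel 1, then R,G,B of pixel 2, and so on,
% with pixels stored row by row. res = [width height] is hypothetical.
res = [4 3];                         % stand-in for [640 480]
rgb = (1:3*prod(res))';              % synthetic 36x1 buffer, value = position

% reshape splits the buffer into channel x width x height (column-major fill);
% permute reorders that to height x width x channel, the usual image layout.
img = permute(reshape(rgb,[3 res]),[3 2 1]);

disp(size(img))                      % 3 4 3, i.e. height x width x channels
disp(squeeze(img(1,1,:))')           % 1 2 3, the R,G,B of the first pixel
```

With the synthetic values, img(1,2,1) is 4 (red of the second pixel in the first row) and img(2,1,1) is 13 (red of the first pixel in the second row), which shows how the flat buffer maps onto rows and columns.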
Good luck with your code! Leon
Once again you were perfectly clear... thank you :) Now, if you allow me to continue with my questions, we move on to the detect_plane function. I read the function and it makes sense with the help of your article. However, I have some doubts about the voxels. I'll ask my questions alongside the code to make it easier. At this point we have the estimated normals for each point of the point cloud, and we have also smoothed the normals.
your code:
%define edges of surface normals -1 to 1
edges = -1.01:2.02/grid:1.01;
% place surface normals in histogram
[~,vox_x] = histc(nx,edges);
[~,vox_y] = histc(ny,edges);
[~,vox_z] = histc(nz,edges);
my question: So here you check for each point if the normal component tends to -1 or 1. Am I correct?
your code:
% create 3D voxel grid output
voxels = (vox_x + (vox_y-1).*grid + (vox_z-1).*grid^2);
my question: Here you say you create a 3D voxel grid. But when we analyse the variable "voxels", it is a 2D matrix. Also, I do not understand what you are trying to compute with (vox_x + (vox_y-1).*grid + (vox_z-1).*grid^2).
your code:
% obtain direction dominant plane normal
direction = mode(voxels(mask));
my question: I didn't understand the values of the voxels, but if I'm not wrong, here you check which value is the most common on the grid where the depth values are valid.
your code:
% create 3D histogram of all surface normals
edges = (0:1:grid^3)+0.5;
h = histc(voxels(mask),edges);
my question: What are you doing here?
Once again, thank you very much Leon!
OK, this is a bit harder to explain. To be clear, the article wasn't written by me, but I used it for my thesis.
The basic idea is that each surface normal [x y z] is placed in a 3D voxel grid (histogram), from which the dominant direction is determined. For example, my code uses a grid of 3, which means there are 3x3x3 = 27 voxels/bins in the surface normal histogram.
%define edges of surface normals -1 to 1
edges = -1.01:2.02/grid:1.01;
% place surface normals in histogram
[~,vox_x] = histc(nx,edges);
[~,vox_y] = histc(ny,edges);
[~,vox_z] = histc(nz,edges);
Each surface normal channel x, y and z ranges from -1 to 1. A histogram is created for each channel with a number of bins equal to grid. The bin index is used to create a 3D voxel grid (next step). In my example code each bin index can be 1, 2 or 3. An example surface normal [x y z] = [0.1 -0.8 0.95] would become [2 1 3] after binning.
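The binning can be checked on the example normal with a few lines. This is only a sketch: it reuses edges and histc exactly as in the snippet above, and the variable names n, bin and lim are made up for the illustration. histc is kept to match the original code, although newer MATLAB releases mark it as not recommended.

```matlab
grid  = 3;                           % bins per channel, as in the thread
edges = -1.01:2.02/grid:1.01;        % 4 edges -> 3 usable bins

n = [0.1 -0.8 0.95];                 % example normal from the text
[~,bin] = histc(n,edges);            % bin index of each component
disp(bin)                            % 2 1 3

% The small margin beyond +/-1 keeps exact values of -1 and 1 in bins 1..grid.
[~,lim] = histc([-1 1],edges);
disp(lim)                            % 1 3
```

So each component is not tested for being close to -1 or 1. It is assigned to one of grid equally wide intervals that together cover the range from -1.01 to 1.01.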
% create 3D voxel grid output
voxels = (vox_x + (vox_y-1).*grid + (vox_z-1).*grid^2);
You are correct: vox_x, vox_y and vox_z are 2D, and so is voxels. The formula above combines the three channel bin indices (x,y,z) into a single index into the 3D voxel grid. Example vectors with a grid size of 3: [1 1 1] = 1, [3 3 3] = 27 and [2 1 3] = 20. In this way each surface normal is represented by a single value/voxel in a 3D grid.
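The three example values can be reproduced directly. The sketch below also compares the formula with MATLAB's built-in sub2ind, which uses the same numbering for a grid x grid x grid array. The names vox, idx and ref are only for illustration, and the comparison with sub2ind is an observation added here, not part of the original code.

```matlab
grid = 3;
vox  = [1 1 1; 3 3 3; 2 1 3];        % rows are [vox_x vox_y vox_z]

% Same formula as in the code, applied to each row.
idx = vox(:,1) + (vox(:,2)-1).*grid + (vox(:,3)-1).*grid^2;
disp(idx')                           % 1 27 20

% sub2ind gives the same linear index for a grid x grid x grid array.
ref = sub2ind([grid grid grid],vox(:,1),vox(:,2),vox(:,3));
disp(isequal(idx,ref))               % 1

% ind2sub goes the other way: voxel 20 corresponds to bins [2 1 3].
[bx,by,bz] = ind2sub([grid grid grid],20);
```

This is why voxels stays a 2D matrix: every pixel simply stores the number (1 to 27) of the 3D cell its normal falls into.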
% obtain direction dominant plane normal
direction = mode(voxels(mask));
Yes, correct, this finds the most common value; in my example, the most frequently occurring value between 1 and 27. The mask makes sure that each processed voxel has a depth measurement.
% create 3D histogram of all surface normals
edges = (0:1:grid^3)+0.5;
h = histc(voxels(mask),edges);
This piece of code is used to merge neighboring voxels with similar orientations. It creates a histogram of all voxels based on the grid size (in my example ranging from 1 to 27).
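A toy example may help to see what mode and the histogram return. The 3x3 voxels matrix and the mask below are invented for the illustration. The two calls themselves are the ones from the code above.

```matlab
grid   = 3;
voxels = [20 20  5;                  % synthetic voxel number per pixel
          20  5 14;
           1 20 14];
mask   = logical([1 1 1; 1 1 1; 0 1 1]);   % 0 marks a pixel without depth

direction = mode(voxels(mask));      % most frequent voxel number
edges = (0:1:grid^3)+0.5;            % 0.5, 1.5, ..., 27.5
h = histc(voxels(mask),edges);       % h(k) = number of pixels in voxel k

disp(direction)                      % 20
disp(h([5 14 20])')                  % 2 2 4
[~,peak] = max(h);                   % position of the histogram peak
disp(peak == direction)              % 1
```

The edges are shifted by 0.5 so that each integer voxel number sits in the middle of its own bin. The pixel with voxel number 1 is masked out, so h(1) is 0.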
... From this histogram clusters/voxels are selected containing more than a predefined number of surface normals. The mean surface normal is calculated for each selected voxel and compared to the mean dominant plane orientation. When the distance is small enough the cluster/voxel means are merged to give a more accurate result.
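One possible reading of this merging step is sketched below. The thread does not say which distance measure is used or how the means are combined, so the Euclidean distance, the count-weighted average, the thresholds min_count and max_dist, and the synthetic normals in N are all assumptions made for this example and are not taken from the original code.

```matlab
grid      = 3;
min_count = 2;      % hypothetical: minimum normals for a voxel to be considered
max_dist  = 0.6;    % hypothetical: maximum distance between mean normals

% Synthetic normals, one per row, scaled to unit length.
N = [0 0 1; 0.05 0 1; 0 0.05 1; 0.5 0 0.9; 0.55 0 0.85; 1 0 0];
N = bsxfun(@rdivide,N,sqrt(sum(N.^2,2)));

% Bin each component and combine into one voxel number, as in the code above.
edges = -1.01:2.02/grid:1.01;
[~,b] = histc(N,edges);
voxels = b(:,1) + (b(:,2)-1).*grid + (b(:,3)-1).*grid^2;

direction = mode(voxels);                    % dominant voxel
h = histc(voxels,(0:1:grid^3)+0.5);          % normals per voxel

dom_mean = mean(N(voxels==direction,:),1);   % mean normal of dominant voxel
acc = dom_mean * h(direction);               % count-weighted sum of means
for k = find(h(1:grid^3) >= min_count)'      % well-populated voxels
    if k == direction, continue; end
    m = mean(N(voxels==k,:),1);              % mean normal of this voxel
    if norm(m - dom_mean) < max_dist         % close enough to the dominant one
        acc = acc + m * h(k);
    end
end
plane_normal = acc / norm(acc);              % merged estimate, unit length
```

With this data the dominant voxel is 23, voxel 24 is merged into it, and voxel 15 is ignored because it holds only one normal.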
Finally, a refinement in distance space is done (line 106).
Hello, I would like to ask you a few questions: 1- Kinect version 1 or 2? 2- How did you connect to the Kinect? You have the kinect_mex(). Could you tell me what that is, or where you found that mex? 3- Is the resolution the same for both RGB and depth when you capture the pictures? 4- Could you explain to me what this means? rgb = permute(reshape(rgb,[3 res]),[3 2 1]);
Thank you very much