PRBonn / semantic-kitti-api

SemanticKITTI API for visualizing dataset, processing data, and evaluating results.
http://semantic-kitti.org
MIT License

Scene completion voxel centroids #46

Closed by yh675 4 years ago

yh675 commented 4 years ago

Hi,

I would like to get xyz centroids of the voxels (for the completed scene) from the binary files for the semantic scene completion, but I am a little lost looking at visualize_voxels.py.

I'm assuming I should use unpack(), but I'm not sure how exactly to use this to loop over the binary file.

Could you point me towards how I might accomplish this?

Thanks

jbehley commented 4 years ago

To get the center coordinates of the voxels, one has to perform some calculations (the visualizer does not use the "real" parameters since it's only meant to visualize).

The voxel grid is initialized in the voxelizer (https://github.com/jbehley/voxelizer), and the parameters used to generate the voxel grids are in https://github.com/jbehley/voxelizer/blob/master/assets/semantic_kitti.cfg. The ones that matter for the centers are:

min extent: [0, -25.6, -2]
max extent: [51.2, 25.6,  4.4]
voxel size: 0.2

These parameters determine the offsets and the size of the voxel grid.

The voxel grid size and offset are initialized as follows (Link):

  resolution_ = resolution;
  sizex_ = std::ceil((max.x() - min.x()) / resolution_);
  sizey_ = std::ceil((max.y() - min.y()) / resolution_);
  sizez_ = std::ceil((max.z() - min.z()) / resolution_);

  // ensure that min, max are always inside the voxel grid.
  float ox = min.x() - 0.5 * (sizex_ * resolution - (max.x() - min.x()));
  float oy = min.y() - 0.5 * (sizey_ * resolution - (max.y() - min.y()));
  float oz = min.z() - 0.5 * (sizez_ * resolution - (max.z() - min.z()));

min and max are the min extent and max extent from the configuration. resolution is the voxel size.
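As a rough Python sketch of the initialization above (Python because that is what the scripts in semantic-kitti-api use), the grid size and offset can be reproduced from the three configuration values. The function name `grid_size_and_offset` is hypothetical and is not part of the voxelizer or the API; the arithmetic simply mirrors the C++ lines quoted above.

```python
import math

def grid_size_and_offset(min_extent, max_extent, resolution):
    """Return ((sizex, sizey, sizez), (ox, oy, oz)) for an axis-aligned voxel grid.

    Hypothetical helper that mirrors the C++ initialization quoted in the thread.
    """
    sizes = []
    offsets = []
    for lo, hi in zip(min_extent, max_extent):
        # Number of voxels needed to cover [lo, hi] along this axis.
        n = math.ceil((hi - lo) / resolution)
        # Split any padding evenly so that lo and hi both stay inside the grid.
        offset = lo - 0.5 * (n * resolution - (hi - lo))
        sizes.append(n)
        offsets.append(offset)
    return tuple(sizes), tuple(offsets)

# Values from semantic_kitti.cfg as listed in the thread.
sizes, offsets = grid_size_and_offset((0.0, -25.6, -2.0), (51.2, 25.6, 4.4), 0.2)
```

With the extents listed above, the grid comes out as 256 x 256 x 32 voxels and no padding is needed, so the offset coincides with the min extent.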

With the offset, the center (cx, cy, cz) of the voxel with coordinates (i, j, k) can be computed as follows:

// (ox, oy, oz) as computed above is the lower corner of the grid, so it is added.
float cx = ox + i * resolution_ + 0.5 * resolution_;
float cy = oy + j * resolution_ + 0.5 * resolution_;
float cz = oz + k * resolution_ + 0.5 * resolution_;
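A minimal Python sketch of the center computation follows. It assumes that (ox, oy, oz), as defined in the initialization snippet, is the lower corner of the grid, so the offset is added to the voxel position. The function name `voxel_center` and its default arguments are hypothetical; the defaults are the offset and voxel size that result from the configuration values given in this thread.

```python
def voxel_center(i, j, k, offset=(0.0, -25.6, -2.0), resolution=0.2):
    """Return the (cx, cy, cz) center of voxel (i, j, k).

    Assumption: `offset` is the lower corner of the grid, i.e. the
    (ox, oy, oz) obtained from the min/max extents and the voxel size.
    """
    ox, oy, oz = offset
    # Move to the lower corner of voxel (i, j, k), then half a voxel further.
    cx = ox + (i + 0.5) * resolution
    cy = oy + (j + 0.5) * resolution
    cz = oz + (k + 0.5) * resolution
    return cx, cy, cz
```

As a sanity check under this assumption, the first voxel (0, 0, 0) is centered half a voxel inside the min extent, at roughly (0.1, -25.5, -1.9), and the last voxel (255, 255, 31) half a voxel inside the max extent, at roughly (51.1, 25.5, 4.3).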

For the completed voxel grid (training set), one has to consider all voxels with a non-zero label. From the linearized 1D index of a label, the (i,j,k) coordinates can be computed as follows (Link):

 // Integer division truncates, which is what the index arithmetic needs.
 int i = idx / (sizey_ * sizez_);
 int j = (idx - i * sizey_ * sizez_) / sizez_;
 int k = idx - i * sizey_ * sizez_ - j * sizez_;
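Putting the pieces together, the following Python sketch converts a flat label sequence into a list of centers for the non-zero voxels. It assumes the labels have already been loaded into a flat sequence in the linearized order described above; reading the binary files is not covered here. The names `index_to_ijk` and `occupied_centers` are hypothetical, the default grid parameters are the ones that follow from the configuration values in this thread, and the offset is treated as the lower corner of the grid.

```python
def index_to_ijk(idx, sizey, sizez):
    """Convert a linearized index into (i, j, k), with k varying fastest."""
    i, rem = divmod(idx, sizey * sizez)
    j, k = divmod(rem, sizez)
    return i, j, k

def occupied_centers(labels, sizes=(256, 256, 32),
                     offset=(0.0, -25.6, -2.0), resolution=0.2):
    """Return a list of (cx, cy, cz, label) for every voxel with a non-zero label.

    Assumptions: `labels` is a flat sequence in linearized order, and
    `offset` is the lower corner of the grid.
    """
    _, sizey, sizez = sizes
    ox, oy, oz = offset
    centers = []
    for idx, label in enumerate(labels):
        if label == 0:
            continue  # zero labels are skipped, as described above
        i, j, k = index_to_ijk(idx, sizey, sizez)
        centers.append((ox + (i + 0.5) * resolution,
                        oy + (j + 0.5) * resolution,
                        oz + (k + 0.5) * resolution,
                        label))
    return centers
```

For full-size grids a vectorized version would be considerably faster than this loop; the plain-Python form is used here only to keep the index arithmetic explicit.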

Hope this helps to get the information you want.

yh675 commented 4 years ago

Thanks for the response!

I got it to work.