kujason / avod

Code for 3D object detection for autonomous driving
MIT License
933 stars 349 forks

What is the meaning of each value in the output file? #188

Closed Haawron closed 3 years ago

Haawron commented 3 years ago

Firstly, thanks for the beautiful paper and repo!

I followed your README instructions and successfully got an output file such as 000001.txt in ./avod/data/outputs/[configname]_val/predictions/final_boxes_box_4ca_and_scores/val/120000/. The file looks like this:

-15.24178 -14.89511 -16.67474 -17.01962 57.73212 54.32695 54.19614 57.59904 -0.34679 1.15983 0.05014 0.00000
-15.93977 -16.03211 -17.78551 -17.69343 61.18032 57.73060 57.73194 61.18008 -0.31289 1.17731 0.01493 0.00000
4.54089 4.17583 2.42572 2.79408 43.76081 39.97421 40.00064 43.79296 -0.28312 1.19874 0.00714 0.00000
-1.20005 -0.82658 -2.48288 -2.85900 67.93431 65.22529 65.07829 67.79079 0.73463 2.23201 0.00605 0.00000
4.82551 4.61045 2.91491 3.12845 63.55791 59.75579 59.82858 63.63762 0.04117 1.54963 0.00646 0.00000
4.89932 4.58680 2.84608 3.16217 46.90687 43.79197 43.88605 47.00644 -0.33633 1.14318 0.00413 0.00000
5.32585 5.31228 3.69754 3.71186 8.41151 4.49657 4.48897 8.42451 -1.04176 0.44200 0.00365 0.00000
...

It is the output for KITTI-stereo-test-000001.png.

I read your paper (briefly), so I know what each column represents: the four corners of the bottom plane and the heights (Δx1, ..., Δx4, Δz1, ..., Δz4, h1, h2, score, type). However, I can't find any information about what the values mean or what their units are. My questions are:

  1. Are the values in pixels or meters?
  2. What is the origin? Is it the center of the image, the upper-left corner of the image, or the camera that took the image?
  3. I just want to draw the boxes on the image. How can I do that? I found some functions in your /demos folder, but they require many options to be set, and it would take me many hours to get them working. Is there a simpler way?

Thank you in advance.

kujason commented 3 years ago
  1. Values are in metres. The outputs are the final box_4c values, not the deltas. See https://github.com/kujason/avod/wiki/Data-Formats for more details on the data formats.
  2. They are expressed in the camera frame.
  3. demos/show_predictions_2d.py will draw the boxes projected into the image. This script uses the boxes_3d format, which should also be available in the predictions output folder.
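
For readers who only want a quick look at the numbers without configuring the demo scripts, the following is a minimal sketch, not the repository's own code. It assumes the 12-column layout described in the question (x1..x4, z1..z4, h1, h2, score, type), values in metres in the camera frame, and, as a simplifying assumption that is not confirmed in this thread, that h1 and h2 are heights above a flat ground plane located at `ground_y` metres below the camera (y pointing down, as in KITTI). The function names and the `ground_y` default are hypothetical, and the 3x4 projection matrix `P` has to come from the matching calibration file.

```python
import numpy as np

# Column layout as described in the question (assumed, see lead-in).
COLUMNS = ["x1", "x2", "x3", "x4", "z1", "z2", "z3", "z4",
           "h1", "h2", "score", "type"]


def parse_predictions(text):
    """Parse whitespace-separated prediction rows into an (N, 12) array."""
    rows = [[float(v) for v in line.split()]
            for line in text.splitlines() if line.strip()]
    return np.array(rows, dtype=np.float64).reshape(-1, len(COLUMNS))


def box_4c_to_corners(row, ground_y=1.65):
    """Return the 8 box corners as an (8, 3) array of (x, y, z) in metres.

    ASSUMPTION: h1/h2 are heights above a flat ground plane that lies
    ground_y metres below the camera, with the camera y-axis pointing down.
    The real pipeline may use a per-frame ground plane instead.
    """
    xs, zs = row[0:4], row[4:8]
    h1, h2 = row[8], row[9]
    bottom = np.stack([xs, np.full(4, ground_y - h1), zs], axis=1)
    top = np.stack([xs, np.full(4, ground_y - h2), zs], axis=1)
    return np.vstack([bottom, top])


def project_to_image(points, P):
    """Project (N, 3) camera-frame points to pixels with a 3x4 matrix P."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    uvw = homogeneous @ P.T
    # Divide by depth; points with z <= 0 are behind the camera and
    # should be filtered out by the caller before drawing.
    return uvw[:, :2] / uvw[:, 2:3]


if __name__ == "__main__":
    sample = (
        "4.54089 4.17583 2.42572 2.79408 43.76081 39.97421 40.00064 "
        "43.79296 -0.28312 1.19874 0.00714 0.00000\n"
    )
    # Hypothetical projection matrix for illustration only; use the P2
    # matrix from the frame's calibration file in practice.
    P = np.array([[700.0, 0.0, 600.0, 0.0],
                  [0.0, 700.0, 180.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    boxes = parse_predictions(sample)
    pixels = project_to_image(box_4c_to_corners(boxes[0]), P)
    print(pixels)
```

The resulting pixel coordinates can be connected into box edges with any drawing library. For results that match the training pipeline exactly, demos/show_predictions_2d.py remains the route the maintainer recommends.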