ApolloAuto / apollo

An open autonomous driving platform
Apache License 2.0

YOLO obstacle detector #8056

Closed alexiskovi closed 5 years ago

alexiskovi commented 5 years ago

Here are the output layers of the yolo_obstacle_detector neural network:

As we understand it, every unit carries multidimensional object information. But how do we extract the object position, class, and bounding box size data? If yolo_obstacle_detector can give this information itself, then:

  1. Which layer has the probability distribution information?
  2. Which layer holds the class information?
  3. ... the bounding box characteristics?
  4. Can this network recognize objects in standalone mode, without the Apollo code/docker? What does it need to work outside Apollo perception?

If not, can you say which neural networks take data from the yolo_obstacle_detector layers and produce the final result?

Thanks in advance

techoe commented 5 years ago

@alexiskovi, Thank you for using Apollo! We added the 3D bbox and classification information at the deconvolution layer. We recommend using the model in Docker, since there are multiple dependencies. There is unit test code (apollo/modules/perception/camera/test/camera_lib_obstacle_detector_yolo_region_output_test.cc) that you can use for a standalone test in Docker.
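
For reference, a single Bazel test target can usually be run from inside the Apollo dev container as shown below; the target name is inferred from the test file's path (dropping the `.cc` suffix) and is an assumption, not a quote from the Apollo docs:

```bash
# Inside the Apollo docker container, from the /apollo workspace root.
# Target name assumed from the test file's path.
bazel test //modules/perception/camera/test:camera_lib_obstacle_detector_yolo_region_output_test
```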

KaWaiTsoiBaidu commented 5 years ago

@alexiskovi, or you can use this offline standalone test (modules/perception/camera/tools/offline/offline_obstacle_detector.cc).

alexiskovi commented 5 years ago

@techoe, @KaWaiTsoiBaidu, thanks a lot! I hope it will help. Could you explain what these layers mean? Are there any docs about the structure of Apollo's networks?

techoe commented 5 years ago

Please take a look at the prototxt; there are comments for each layer. The 3D bounding box properties, left/right turn signals, and brake signals are the outputs of the network.

KaWaiTsoiBaidu commented 5 years ago

Prototxt here

natashadsouza commented 5 years ago

Closing this issue as it appears to be fixed. Feel free to reopen if you have additional questions. Thanks!

alexiskovi commented 5 years ago

So, I've found the comment about the bbox. But, unfortunately, I didn't get how to interpret the deconvolution layer's output. Maybe you meant another comment or another layer? The output vector doesn't look like an ordinary YOLO output vector. So, the general question here is: is the bbox data stored in one of the mentioned layers, or do we need to do some extra processing?

KaWaiTsoiBaidu commented 5 years ago

@alexiskovi the outputs for the 3D bbox are dim_pred and ori_pred: one for the 3D bbox dimensions (H, W, L) and one for the 3D bbox orientation. It does not output the location (X, Y, Z) of the 3D bbox, though. loc_pred stores the 2D bbox information.
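
To keep the heads straight, here is a minimal sketch of how a consumer might label them; the head names come from this thread, while the struct itself and its comments are illustrative assumptions rather than Apollo's actual API:

```cpp
// Prediction heads of the YOLO obstacle detector as described above.
// This struct is an illustrative grouping, not a type from the Apollo code.
struct YoloHeads {
  const float* loc_pred;  // 2D bbox regression values per anchor per cell
  const float* dim_pred;  // 3D bbox dimensions (H, W, L) per anchor
  const float* ori_pred;  // 3D bbox orientation per anchor
  // The network does not predict the 3D location (X, Y, Z); that has to be
  // recovered downstream, outside of these heads.
};
```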

alexiskovi commented 5 years ago

Thanks for clarifying the layer meanings. However, I still don't get how to extract the info from the layers. Here is the output shape of loc_pred: (1, 50, 90, 64). That means this tensor has 64 values at each vertex of the (50, 90) image grid, but is it realistic to get the bbox information in its final form, like coordinates, class, and probability?

KaWaiTsoiBaidu commented 5 years ago

@alexiskovi The layer has 4*16 = 64 channels: 4 --> (x, y, h, w) of the bbox, and 16 --> the 16 anchor boxes at each vertex of the (50, 90) feature map. For converting the raw output to (x, y, h, w), please see this kernel code.
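
For intuition, a generic YOLO-style decode for one feature-map vertex might look like the sketch below. The sigmoid/exp transform, the anchor-major (16 x 4) memory layout, and the anchor priors are common-YOLO assumptions for illustration only; the authoritative transform is the kernel code linked above.

```cpp
#include <cmath>
#include <vector>

struct BBox2D { float x, y, w, h; };  // center (x, y) and size, normalized

inline float Sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// raw: the 64 loc_pred values for one vertex, assumed to be laid out as
// 16 anchors x 4 regression terms. anchors: 16 (w, h) priors.
// (grid_w, grid_h) is the (90, 50) feature map size.
std::vector<BBox2D> DecodeVertex(const float* raw,
                                 const std::vector<BBox2D>& anchors,
                                 int vx, int vy, int grid_w, int grid_h) {
  std::vector<BBox2D> boxes;
  for (int a = 0; a < 16; ++a) {
    const float* t = raw + a * 4;  // (tx, ty, th, tw) for anchor a
    BBox2D b;
    b.x = (vx + Sigmoid(t[0])) / grid_w;  // offset inside the vertex's cell
    b.y = (vy + Sigmoid(t[1])) / grid_h;
    b.h = anchors[a].h * std::exp(t[2]);  // scale the anchor prior
    b.w = anchors[a].w * std::exp(t[3]);
    boxes.push_back(b);
  }
  return boxes;
}
```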

alexiskovi commented 5 years ago

@KaWaiTsoiBaidu, thank you so much for making this clear!

alexiskovi commented 5 years ago

At each vertex of the net we have 16 bboxes. So, how do we decide which of the (50*90*16) bboxes should be drawn? Could you possibly point at the line that extracts the bbox significance for each vertex? Maybe there is another layer with shape (50, 90, 16) that tells which of them correspond to objects and which of them correspond to noise.

Thanks in advance

KaWaiTsoiBaidu commented 5 years ago

@alexiskovi This output specifies the objectness (shape (50,90,16)) of each bounding box.
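
In other words, each of the 16 anchors at each vertex gets a single objectness score, and a box is kept only when that score clears a threshold. A minimal sketch, assuming a flat row-major layout and an illustrative 0.5 threshold; in practice non-maximum suppression is applied to the survivors as well:

```cpp
#include <vector>

struct Detection {
  int vy, vx, anchor;  // vertex coordinates and anchor index
  float score;         // objectness
};

// obj: the (50, 90, 16) objectness scores, assumed flattened row-major.
std::vector<Detection> FilterByObjectness(const float* obj,
                                          float threshold = 0.5f) {
  std::vector<Detection> kept;
  for (int vy = 0; vy < 50; ++vy)
    for (int vx = 0; vx < 90; ++vx)
      for (int a = 0; a < 16; ++a) {
        const float s = obj[(vy * 90 + vx) * 16 + a];
        if (s >= threshold) kept.push_back({vy, vx, a, s});
      }
  return kept;
}
```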

RezaMehrabian commented 4 years ago

@alexiskovi, Thank you for using Apollo! We added the 3D bbox and classification information at the deconvolution layer. We recommend using the model in Docker, since there are multiple dependencies. There is unit test code (apollo/modules/perception/camera/test/camera_lib_obstacle_detector_yolo_region_output_test.cc) that you can use for a standalone test in Docker.

How can I run this test?

RezaMehrabian commented 4 years ago

I use Apollo 5.0 on Ubuntu 16.04 and I have built it.

Actually, my question is a general one. I know there are a lot of test files covering parts of Apollo. However, I do not know how to run them. I tried the command below, but it is not working:

./apollo.sh /modules/perception/camera/test:camera_lib_obstacle_detector_yolo_region_output_test.cc
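
For what it's worth, `apollo.sh` expects its own subcommands rather than a test target, and Bazel targets are named without the `.cc` suffix. Assuming the standard Bazel workflow inside the Apollo 5.0 dev container (the target name is inferred from the file path, as in the sketch earlier in this thread):

```bash
# Inside the dev container, from the /apollo workspace root (assumed target name).
bazel test //modules/perception/camera/test:camera_lib_obstacle_detector_yolo_region_output_test
```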