yukitsuji / 3D_CNN_tensorflow

KITTI data processing and 3D CNN for Vehicle Detection
MIT License

Is there any specific reason for not using Deconv layer for objectness map and bounding box map? #13

Closed: ansabsheikh9 closed this issue 6 years ago

ansabsheikh9 commented 6 years ago

Hi @yukitsuji, the paper mentions using a deconv layer at the output to upsample the objectness and bounding box maps. Could you comment on that? Thanks

yukitsuji commented 6 years ago

Only because of memory problems.
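
To make the memory point concrete, here is a rough sketch of how the output size grows when a 3D map is upsampled. The feature-map shape, stride, and channel count are hypothetical and not taken from this repository. The sketch assumes "same"-style padding, so each spatial dimension is multiplied by the stride.

```python
def upsampled_elements(shape, stride, channels):
    """Number of output values after upsampling a 3D map by `stride` per axis."""
    depth, height, width = shape
    return (depth * stride) * (height * stride) * (width * stride) * channels


# Hypothetical coarse output map: 100 x 100 x 10 cells, 1 objectness channel.
coarse_shape = (100, 100, 10)
without_deconv = upsampled_elements(coarse_shape, stride=1, channels=1)
with_deconv = upsampled_elements(coarse_shape, stride=2, channels=1)

# In 3D, a stride of 2 multiplies the number of output values by 2**3.
growth = with_deconv // without_deconv
print(without_deconv, with_deconv, growth)
```

The same factor applies to every channel of the bounding box map and to the activations stored for backpropagation, so the cost adds up quickly in 3D.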

ansabsheikh9 commented 6 years ago

@yukitsuji Is it possible to detect multiple classes with this network, for example cars and pedestrians? Thanks

yukitsuji commented 6 years ago

Yes, if you build the network like SSD or YOLO, you can detect multi-class objects. However, it is difficult to detect pedestrians with lidar because lidar data is sparse. If you want to learn 3D object detection, I recommend the links below. https://arxiv.org/abs/1711.06396 https://arxiv.org/abs/1711.08488

MeghaMaheshwari commented 6 years ago

But to detect multiple classes, I think we would also need to modify the voxel representation, since you are currently using a binary voxel, aren't you? I have another question: when loading the labels you load both Car and Van, but then you use a binary voxel. So did you merge Van and Car into a single car class, or how does it work? Also, I get multiple detections for a single object. Is that something you see as well?

ansabsheikh9 commented 6 years ago

@MeghaMaheshwari, how would you distinguish between different voxels at test time? KITTI doesn't count a van as a false detection for cars, so the two are usually considered one class. To get rid of the multiple boxes, you need to apply non-maximum suppression, which is not implemented in this code.
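
For reference, a minimal sketch of greedy non-maximum suppression is shown below. It is not part of this repository. It assumes axis-aligned 3D boxes given as `(x1, y1, z1, x2, y2, z2)` with one confidence score each; the function names and the 0.5 threshold are illustrative choices. Rotated boxes, as used in KITTI labels, would need a different overlap computation.

```python
def volume(box):
    """Volume of an axis-aligned box (x1, y1, z1, x2, y2, z2)."""
    x1, y1, z1, x2, y2, z2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1) * max(0.0, z2 - z1)


def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])
        hi = min(a[axis + 3], b[axis + 3])
        inter *= max(0.0, hi - lo)  # zero if the boxes do not overlap on this axis
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def nms_3d(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou_3d(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one object, plus one separate object.
boxes = [(0, 0, 0, 2, 2, 2), (0.1, 0, 0, 2.1, 2, 2), (5, 5, 5, 6, 6, 6)]
scores = [0.9, 0.8, 0.7]
print(nms_3d(boxes, scores))  # [0, 2]
```

The lower-scoring duplicate is dropped because its overlap with the best box exceeds the threshold, while the distant box is kept.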

ansabsheikh9 commented 6 years ago

@yukitsuji Thank you for your recommendation