The drone needs object detection to have any real utility. That means finding the 3D coordinates of as many nearby objects as possible. These detections will then be used to aid path planning: the motion of the detected obstacles can be modelled to predict their future state, which helps plan safer trajectories.
Although the bot is equipped with both a lidar and a stereo camera, we decided to go with just the stereo camera for now. This is mainly to keep the pipeline as streamlined and fast as possible. The process of recovering depth information from stereo image pairs is called stereopsis; we and all other two-eyed animals do it all the time. Check out OpenCV: Depth Map from Stereo Images.
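For intuition, depth follows from disparity as Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity. Here is a minimal OpenCV sketch of that, assuming an already rectified grayscale pair; the f and B values are placeholders, the real ones come from calibration:

```python
# Minimal stereopsis sketch with OpenCV (assumes a rectified grayscale pair).
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; numDisparities must be a multiple of 16.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

f = 718.0  # focal length in pixels (placeholder)
B = 0.54   # baseline in metres (placeholder, KITTI-like)
with np.errstate(divide="ignore"):
    depth = f * B / disparity  # Z = f * B / d; invalid where d <= 0
```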
The current plan is to apply the object detector directly on the stereo images, without calculating the disparity map or the point cloud. Again, this is done to speed things up: disparity calculation takes time (check out this Computerphile video, Stereo 3D Vision, to get a feel for it).
Still, there are existing ROS packages like stereo_image_proc - ROS Wiki that do it, if one wants to try.
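If one does try it (the wiki's example launch is `ROS_NAMESPACE=stereo rosrun stereo_image_proc stereo_image_proc`), the node publishes, among others, a `points2` topic with the reconstructed cloud. A small rospy listener sketch, with the `/stereo` namespace as an assumption:

```python
# Sketch: consume the point cloud that stereo_image_proc publishes.
# The "/stereo" namespace is an assumption; match it to the camera's namespace.
import rospy
from sensor_msgs.msg import PointCloud2
from sensor_msgs import point_cloud2

def on_cloud(msg):
    # Iterate the 3D points (x, y, z in the left camera frame); log one, throttled.
    for p in point_cloud2.read_points(msg, field_names=("x", "y", "z"), skip_nans=True):
        rospy.loginfo_throttle(5.0, "point: %.2f %.2f %.2f" % p)
        break

rospy.init_node("cloud_listener")
rospy.Subscriber("/stereo/points2", PointCloud2, on_cloud)
rospy.spin()
```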
What’s done and what needs to be:
[x] Research and find the approaches that give a good mix of accuracy and speed. Here's a non-exhaustive list: vision comparison.
The candidates under consideration are either RCNN-based (e.g. Stereo-RCNN) or YOLO-based (e.g. Complex YOLO).
RCNN-based algorithms are more accurate but slower; single-pass ones like YOLO are faster but less accurate. We are currently trying to improve on both fronts.
[ ] [WIP] Recreate/modify and implement the DL pipeline
The current network uses HRNet as the backbone and is trained on the KITTI stereo dataset; a rough sketch of the wiring appears after this checklist.
[ ] Integrate with our quadcopter
The current pipeline uses the camera calibration data from KITTI, which will be different for our model. So make the necessary changes and create a ROS node that reads data from the stereo camera and outputs the object points; see the node sketch at the end of this section.
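For a feel of how the stereo detector can be wired, here is a rough PyTorch sketch. This is not the actual network: it assumes timm for the HRNet backbone, shares the backbone weights across the two views, and stands in a toy 1x1-conv head for the real detection head.

```python
# Rough sketch of the stereo-detector wiring, NOT the actual network.
import torch
import torch.nn as nn
import timm  # assumption: timm provides the HRNet backbone

class StereoDetector(nn.Module):
    def __init__(self, num_outputs=7):  # e.g. (x, y, z, w, h, l, yaw) per cell
        super().__init__()
        # features_only=True yields multi-scale feature maps; take the coarsest.
        self.backbone = timm.create_model("hrnet_w18", pretrained=False, features_only=True)
        c = self.backbone.feature_info.channels()[-1]
        self.head = nn.Conv2d(2 * c, num_outputs, kernel_size=1)

    def forward(self, left, right):
        fl = self.backbone(left)[-1]   # shared weights across the two views
        fr = self.backbone(right)[-1]
        return self.head(torch.cat([fl, fr], dim=1))

# Smoke test with a KITTI-ish (downscaled) input size.
model = StereoDetector()
l = torch.randn(1, 3, 192, 640)
r = torch.randn(1, 3, 192, 640)
print(model(l, r).shape)  # (1, num_outputs, h, w) at the coarsest feature stride
```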
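And for the integration step, a minimal rospy node sketch: the topic names and detect_objects() are placeholders for our actual setup, and the calibration arrives via the camera_info messages instead of the hard-coded KITTI values.

```python
#!/usr/bin/env python
# Sketch of the integration node: sync the stereo pair, read calibration from
# camera_info (instead of hard-coded KITTI values), run the detector, publish
# object points. Topic names and detect_objects() are placeholders.
import rospy
import message_filters
from cv_bridge import CvBridge
from sensor_msgs.msg import Image, CameraInfo
from geometry_msgs.msg import PoseArray, Pose

bridge = CvBridge()
pub = rospy.Publisher("detections/poses", PoseArray, queue_size=1)

def detect_objects(left_img, right_img, left_info, right_info):
    """Placeholder for the trained network; returns a list of (x, y, z)."""
    return []

def callback(left, right, left_info, right_info):
    left_img = bridge.imgmsg_to_cv2(left, desired_encoding="bgr8")
    right_img = bridge.imgmsg_to_cv2(right, desired_encoding="bgr8")
    out = PoseArray()
    out.header = left.header
    for (x, y, z) in detect_objects(left_img, right_img, left_info, right_info):
        p = Pose()
        p.position.x, p.position.y, p.position.z = x, y, z
        p.orientation.w = 1.0
        out.poses.append(p)
    pub.publish(out)

rospy.init_node("stereo_object_detector")
subs = [message_filters.Subscriber(t, m) for t, m in [
    ("/stereo/left/image_rect_color", Image),
    ("/stereo/right/image_rect_color", Image),
    ("/stereo/left/camera_info", CameraInfo),
    ("/stereo/right/camera_info", CameraInfo),
]]
message_filters.ApproximateTimeSynchronizer(subs, queue_size=5, slop=0.05).registerCallback(callback)
rospy.spin()
```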