This repository contains the work carried out for my master's thesis, in which I implemented an object segmentation and 6D pose estimation system for industrial metallic parts, targeting robotic bin-picking tasks and based on the DenseFusion paper.
The ma_densefusion system segments a textureless part from its background (usually a blue box) and generates a mask, which is then fed into an iterative pose estimation network adapted from the original implementation. The two separate pipelines (i.e., segmentation and pose estimation) can be combined and run in real time with an inference time of around 0.08 s, for both single-object and multi-object scenarios. In the multi-object scenario, the mask with the maximal area is simply chosen as the final mask fed into the pose estimation network. Since the robot arm grips one part at a time in bin-picking applications, this "maximal area" strategy is sufficient.
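The "maximal area" selection described above can be sketched in plain Python (a hedged illustration; the repository's actual implementation may use OpenCV or NumPy and different function names):

```python
from collections import deque

def largest_component_mask(mask):
    """Keep only the largest 4-connected region of a binary mask.

    mask is a list of rows of 0/1 values; the return value has the same
    shape with every pixel outside the biggest component zeroed out.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = set()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # Breadth-first flood fill to collect one component.
                comp, queue = [], deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    comp.append((cy, cx))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = set(comp)
    return [[1 if (y, x) in best else 0 for x in range(w)] for y in range(h)]
```

On a mask with two blobs, only the larger one survives, which is exactly what the pose estimation stage receives in the multi-object case.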
datasets: data and models used for training and evaluation
densefusion_ros: single-node ROS package that subscribes to depth and RGB topics for segmentation and pose estimation using the pre-trained models
experiments: useful scripts for model training and evaluation
generategtfromposedata: scripts for generating ground-truth poses from the dataset for evaluation with the VSD, ADI, and recall-rate metrics
lib: network and loss implementations
man_stl: CAD models in .stl format
seg: original and pruned versions of the semantic segmentation network
tools
trained_models: pre-trained models, including a pose estimation network and a refiner network
useful: scripts that save you a lot of dirty work, such as starting Docker and the camera, republishing ROS topics in LCM format, converting datasets from LabelFusion format to our format, etc.
Paper.pdf: an example paper from previous work
The datasets used in this project can be downloaded from (deleted to open-source this repo). After downloading, move the folders named 01, 02, and 03 to the datasets/data folder and the .ply files to the datasets/models folder; then you can start training or evaluating. Note that you may need to delete the original subfolders in datasets/data (i.e., 01, 02, 03), as they are the same as their counterparts in the downloaded datasets.
LabelFusion is an awesome tool for creating your own datasets. However, it was designed to accept input messages in the LCM format matching the openni2-camera.lcm driver. An intermediate layer, such as the rgbd_ros_to_lcm ROS package, is necessary when the pipeline is used with a RealSense camera. To use this ROS package, set "rgb_topic" and "depth_topic" in the "input parameters" section of the file "lcm_republisher.launch" to "/camera/color/image_raw" and "/camera/aligned_depth_to_color/image_raw", respectively.
Some modifications to the original LabelFusion implementation are also necessary to accept different camera intrinsic parameters:
Change the camera-intrinsics function in LabelFusion/modules/labelfusion/rendertrainingimages.py to the following:
def setCameraInstrinsicsAsus(view):
    principalX = 315.2859903333336
    principalY = 244.88168334960938
    focalLength = 616.0936279296875
    setCameraIntrinsics(view, principalX, principalY, focalLength)
Then change the ElasticFusion call in LabelFusion/scripts/prepareForObjectAlignment.py to the following:
os.system(path_to_ElasticFusion_executable + " -l ./" + lcmlog_filename + " -cal ./camera.cfg")
Finally, create the camera.cfg file referenced by the -cal flag, containing the four intrinsics
fx fy cx cy
in just one line. The rest is the same as what is documented in LabelFusion's original pipeline.
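Creating camera.cfg can be scripted as well. A hedged sketch using the intrinsics shown above; the RealSense calibration here reports a single focal length, so fx == fy is assumed:

```python
def write_camera_cfg(path, fx, fy, cx, cy):
    """Write the one-line "fx fy cx cy" calibration file read via -cal."""
    with open(path, "w") as f:
        f.write("%s %s %s %s\n" % (fx, fy, cx, cy))

write_camera_cfg("camera.cfg",
                 616.0936279296875, 616.0936279296875,   # fx, fy (assumed equal)
                 315.2859903333336, 244.88168334960938)  # cx, cy
```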
Note 1: If you encounter an error message saying "Leaf size is too small for the input dataset. Integer indices would overflow" when running "run_alignment_tool", it is possibly because the coordinate values of the model mesh are in millimeters; converting them to meters (i.e., rescaling by a factor of 1/1000) should solve the problem. You can, for example, use Blender to do this.
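If you prefer scripting over Blender, the rescaling can be done on an ASCII .ply directly. This is a hedged sketch assuming each vertex line starts with the x y z coordinates; any extra per-vertex properties are preserved:

```python
def rescale_ply(lines, factor=0.001):
    """Scale the vertex coordinates of an ASCII .ply, e.g. mm -> m."""
    out, in_header, vertex_count = [], True, 0
    for line in lines:
        if in_header:
            out.append(line)
            if line.startswith("element vertex"):
                vertex_count = int(line.split()[-1])
            if line.strip() == "end_header":
                in_header = False
        elif vertex_count > 0:
            # First three fields are x y z; scale them, keep the rest.
            parts = line.split()
            parts[:3] = ["%g" % (float(v) * factor) for v in parts[:3]]
            out.append(" ".join(parts))
            vertex_count -= 1
        else:
            out.append(line)  # face lines, left untouched
    return out
```

Read the file with `open(path).read().splitlines()`, pass the lines through this function, and write them back out.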
Note 2: If you encounter a "GLXBadContext 169" or "GLXBadDrawable 171" error, it is possibly because you are using nvidia-docker2, which is deprecated. You can fix this by using nvidia-docker and following an older version of the official tutorial.
The images generated by LabelFusion with adjacent indices are usually similar to each other; moreover, the format of the original datasets differs from what DenseFusion requires. Both issues prevent using the original datasets directly, which is why "generatedataset.py" in "/useful" exists. To use this script:
1. Make a new directory called "/mydataset" wherever you like.
2. Inside "/mydataset", make a new directory called "/segmentation" and a new file called "posedata.yml".
3. Copy the images generated by LabelFusion and "generatedataset.py" to "/mydataset".
4. Inside "/segmentation", make three new folders called "/mask", "/depth" and "/rgb".
5. Change directory back to "/mydataset" and run "python generatedataset.py".
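The directory layout above can also be created in one go. A minimal sketch; "mydataset" stands in for wherever you chose to put the folder:

```python
import os

# Create the folder structure generatedataset.py expects.
BASE = "mydataset"
for sub in ("segmentation/mask", "segmentation/depth", "segmentation/rgb"):
    os.makedirs(os.path.join(BASE, sub), exist_ok=True)

# Create the (initially empty) posedata.yml described above.
open(os.path.join(BASE, "posedata.yml"), "a").close()
```

After this, copy the LabelFusion images and generatedataset.py into the folder and run the script as described.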
A point cloud in .ply format can be obtained using this Python script. If you absolutely need a .pcd point cloud, try the "pcl_converter" tool after installing the PCL library; the syntax is as follows:
pcl_converter -f ascii 0000.ply 0000.pcd
You may encounter a "Cannot read geometry" error; this is because the .ply file does not declare any face information (only vertices). A workaround is to add the following two lines after the line "property uchar alpha" in the header of the .ply file:
element face 0
property list uchar int vertex_indices
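The two-line header fix can be applied with a short script. A sketch assuming an ASCII .ply whose header contains the "property uchar alpha" line:

```python
def patch_ply_header(lines):
    """Insert an empty face declaration after "property uchar alpha"."""
    out = []
    for line in lines:
        out.append(line)
        if line.strip() == "property uchar alpha":
            out.append("element face 0")
            out.append("property list uchar int vertex_indices")
    return out
```

With the empty face element declared, pcl_converter accepts the file.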
In the /ma_densefusion folder, run:
./experiments/scripts/train_poseownnet.sh
In the /ma_densefusion folder, run:
./experiments/scripts/eval_poseownnet.sh
In the /ma_densefusion/seg/ folder, run:
python3 train.py
In the /ma_densefusion/seg/ folder, run:
python3 segeval.py
Note that you may need to adjust some lines of code to get segmentation results for just one picture or for the whole dataset, depending on your needs; see the comments in the Python script for details.
Copy the /densefusion_ros folder to your ROS catkin_ws/src, source your catkin_ws/devel/setup.bash file, change directory to catkin_ws/src/densefusion_ros/src, and then run:
rosrun densefusion_ros densefusion_ros.py --model=flansch
The "--model" option allows you to specify which object you are working with; change "--model=flansch" to "--model=schaltgabel" or "--model=stift" if you want to detect the other objects.
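For reference, the model selection could be handled with argparse along these lines (a hypothetical sketch; the node's actual argument handling may differ):

```python
import argparse

# The three objects supported above; names taken from this README.
KNOWN_MODELS = ("flansch", "schaltgabel", "stift")

def parse_model(argv):
    """Parse the --model flag, defaulting to flansch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", choices=KNOWN_MODELS, default="flansch")
    # parse_known_args tolerates extra ROS-injected arguments
    # such as __name:=... that rosrun appends to the command line.
    args, _ = parser.parse_known_args(argv)
    return args.model
```

Restricting the flag with `choices` makes an unsupported object name fail fast instead of loading a missing model file.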