A C++-based ROS package for pedestrian detection, tracking, and re-identification. (The Python package ptl_reid is deprecated; it remains in this project for reference.) The package is designed to count the total number of pedestrians in an area and to show their locations on a map.
Imagine a wheeled robot equipped with a lidar and a camera (or an RGB-D camera). You ask it to enter a building and explore it thoroughly. While exploring, the robot detects and tracks the pedestrians in its sight and reports their locations on a map. After it finishes the exploration, it tells you how many people are in the building and where it saw each of them most recently. This is what this project aims for.
Install the prerequisites
Build a workspace
mkdir -p ptl_ws/src
Clone this repository into ptl_ws/src
cd ptl_ws/src
git clone https://github.com/HoEmpire/pedestrian_tracking_and_localizaiton.git
Build the files (catkin_make must be run from the workspace root, ptl_ws)
cd ..
catkin_make
The config file of each package can be found in ${PROJECT_NAME}/config/config.yaml
ptl_detector
cam:
cam_net_type: "YOLOV4_TINY" #net type
cam_file_model_cfg: "/asset/yolov4-tiny.cfg" #config file path
cam_file_model_weights: "/asset/yolov4-tiny.weights" #weight file path
cam_inference_precison: "FP32" #float precision
cam_n_max_batch: 1 #number of batch
cam_prob_threshold: 0.5 # the threshold of the detection probability
cam_min_width: 0 # the min/max width/height of an object in detection
cam_max_width: 1440
cam_min_height: 0
cam_max_height: 1080
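As an illustration of how the threshold and size limits above could be applied, here is a minimal Python sketch. It is not the package's C++ implementation: the Detection record, the keep_detection helper, and the choice of inclusive bounds are assumptions made for this example; only the parameter names and values come from the cam section.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection record: bbox size in pixels plus a confidence score."""
    width: int
    height: int
    prob: float


# Values copied from the cam section above.
CAM_CONFIG = {
    "cam_prob_threshold": 0.5,
    "cam_min_width": 0,
    "cam_max_width": 1440,
    "cam_min_height": 0,
    "cam_max_height": 1080,
}


def keep_detection(det: Detection, cfg: dict = CAM_CONFIG) -> bool:
    """Return True if the detection passes the probability and size limits.

    Inclusive bounds are an assumption of this sketch.
    """
    if det.prob < cfg["cam_prob_threshold"]:
        return False
    if not cfg["cam_min_width"] <= det.width <= cfg["cam_max_width"]:
        return False
    return cfg["cam_min_height"] <= det.height <= cfg["cam_max_height"]
```

With the default values, only the probability threshold rejects anything on a 1440x1080 image; tightening the min/max sizes would additionally drop very small or very large boxes.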
ptl_tracker
basic:
use_compressed_image: true #using compressed image
use_lidar: true #enable using lidar
enable_pcp_vis: true #enable filtered pointcloud visualization
lidar_topic: /rslidar_points
camera_topic: /camera2/color/image_raw/compressed
map_frame: map
lidar_frame: rslidar
camera_frame: camera2_link
tracker:
track_fail_timeout_tick: 30 #if the tracker fails for track_fail_timeout_tick frames, we consider this tracker fails and remove it from the list.
bbox_overlap_ratio: 0.6
detector_update_timeout_tick: 30 #if the ticks after last update by detector is too long, we consider that we lose track of this target
detector_bbox_padding: 80 # to ensure overlap between detector and tracker, we pad the bounding box of the detector to enlarge it
reid_match_threshold: 3.0 #maximum feature distance to consider a match between a detected and a tracking object
reid_match_bbox_dis: 80 #maximum bbox center distance to consider a match between a detected and a tracking object
reid_match_bbox_size_diff: 80 #maximum bbox size distance to consider a match between a detected and a tracking object
stop_opt_timeout: 6
local_database:
height_width_ratio_min: 0.85 #only image blocks whose height/width ratio falls in the range (height_width_ratio_min, height_width_ratio_max) are added to the local database.
height_width_ratio_max: 4.0
record_interval: 0.1 # the minimum time interval between two recorded images in a local database (Unit: s).
feature_smooth_ratio: 0.7
pc_processor:
resample_size: 0.1 # point cloud resample size(Unit:m)
x_min: 0.0 # point cloud conditional filter(Unit:m)
x_max: 15.0
z_min: 0.0
z_max: 4.04
std_dev_thres: 0.1 # statistical filter param
mean_k: 20 # statistical filter param
cluster_tolerance: 0.5
cluster_size_min: 20
cluster_size_max: 10000
match_centroid_padding: 20 # padding of bbox for robust reprojection matching between 2d bbox and 3d centroids
camera_intrinsic:
fx: 613.783
fy: 612.895
cx: 416.969
cy: 240.223
kalman_filter:
q_xy: 100 # bbox center position state variance (Unit: Pixel^2)
q_wh: 25 # bbox size state variance (Unit: Pixel^2)
p_xy_pos: 100 # bbox center position initial variance (Unit: Pixel^2)
p_xy_dp: 10000 # bbox center position velocity initial variance (Unit: (Pixel/s)^2)
p_wh_size: 25 # bbox size initial variance (Unit: Pixel^2)
p_wh_ds: 25 # bbox size velocity initial variance (Unit: (Pixel/s)^2)
r_theta: 0.08 # observation variance
r_f: 0.04
r_tx: 4
r_ty: 4
residual_threshold: 16 # if the residual is higher than this param, this observation will be rejected
kalman_filter_3d:
q_factor: 100 # position state variance (Unit (m/s^2)^2)
r_factor: 0.25 # position observation variance (Unit m^2)
p_pos: 100 # position initial variance (Unit m^2)
p_vel: 4 # velocity initial variance (Unit (m/s)^2)
start_predict_only_timeout: 10 #uses the detector update tick as the timeout count; if this tick is higher than this param, only the state prediction is used to track the position, to avoid degradation caused by false 2d tracking
stop_track_timeout: 15 #using the detector update tick as the timeout count, if this tick is higher than this param, will stop 3d tracking to avoid drift
outlier_threshold: 4.0 # if the residual ratio is higher than this param, this observation will be rejected
optical_flow:
min_keypoints_to_track: 40 # minimal keypoints to track for one object in keypoints_num_factor_area (e.g. 40 keypoints in 8000 pixel^2)
keypoints_num_factor_area: 8000
corner_detector_max_num: 100 # maximum keypoints to track for one object
corner_detector_quality_level: 0.0001 # corner detection params...
corner_detector_min_distance: 3
corner_detector_block_size: 3
corner_detector_use_harris: true
corner_detector_k: 0.03
min_keypoints_to_cal_H_mat: 10 # minimal number of keypoints to calculate transformation matrix of an object. If the number of successfully tracked keypoints is less than this, will consider calculation fails
min_keypoints_for_motion_estimation: 50 # minimal number of keypoints to calculate the transformation matrix of the motion of the platform. If the number of successfully tracked keypoints is less than this, the calculation is considered failed
min_pixel_dis_square_for_scene_point: 2 # we use this param to remove scene point in the tracking bbox of an object
use_resize: true # using resize in optical flow tracking to speedup
resize_factor: 2 # resize ratio
bbox_overlap_ratio: if the overlapping area ratio of the bounding boxes (bbox) from the detector and the tracker is higher than this value, the two bounding boxes are matched, and the detector bbox is used to reinitialize the matched tracker.
The overlapping area ratio is calculated by
track_fail_timeout_tick: if a tracker fails for track_fail_timeout_tick frames, it is considered lost and removed from the list.
detector_update_timeout_tick: if the number of ticks since the last update by the detector exceeds this value, the target is considered lost.
detector_bbox_padding: to ensure overlap between detector and tracker, the bounding box of the detector is padded to enlarge it.
reid_match_threshold: maximum feature distance to consider a match between a detected and a tracked object
reid_match_bbox_dis: maximum bbox center distance to consider a match between a detected and a tracked object
reid_match_bbox_size_diff: maximum bbox size difference to consider a match between a detected and a tracked object
stop_opt_timeout: when the number of ticks since the last update by the detector is larger than this param, the tracker is no longer updated by optical flow and is updated only by its state prediction. The purpose is to prevent performance degradation when occlusion happens.
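The detector-to-tracker matching controlled by bbox_overlap_ratio and detector_bbox_padding can be sketched in Python as follows. The exact ratio used by the package is not reproduced in this document, so the sketch assumes the intersection area divided by the area of the smaller box; it also assumes the padding is added on every side of the detector bbox. The BBox type and the helper names are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned box: top-left corner (x, y), width w, height h, in pixels."""
    x: float
    y: float
    w: float
    h: float


def pad(box: BBox, padding: float) -> BBox:
    """Enlarge a box by `padding` pixels on every side (assumed convention)."""
    return BBox(box.x - padding, box.y - padding,
                box.w + 2 * padding, box.h + 2 * padding)


def overlap_ratio(a: BBox, b: BBox) -> float:
    """Intersection area over the smaller box area (assumed definition)."""
    inter_w = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    inter_h = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    smaller = min(a.w * a.h, b.w * b.h)
    return (inter_w * inter_h) / smaller if smaller > 0 else 0.0


def is_match(detector_bbox: BBox, tracker_bbox: BBox,
             bbox_overlap_ratio: float = 0.6,
             detector_bbox_padding: float = 80) -> bool:
    """Pad the detector bbox, then compare the overlap ratio to the threshold."""
    padded = pad(detector_bbox, detector_bbox_padding)
    return overlap_ratio(padded, tracker_bbox) > bbox_overlap_ratio
```

A different definition, such as intersection over union, would change which threshold values are sensible, so the default of 0.6 should be read together with whatever ratio the code actually uses.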
height_width_ratio_min/max: only image blocks whose height/width ratio falls in the range (height_width_ratio_min, height_width_ratio_max) are added to the local database.
record_interval: the minimum time interval between two recorded images in a local database (Unit: s).
feature_smooth_ratio: the current feature of a tracking object is calculated by:
resample_size: point cloud resample size (Unit: m)
x_min/x_max/z_min/z_max: point cloud conditional filter (Unit: m)
std_dev_thres/mean_k: statistical filter params
match_centroid_padding: padding of bbox for robust reprojection matching between 2d bbox and 3d centroids
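To make the reprojection matching behind match_centroid_padding concrete, the following Python sketch projects a 3D cluster centroid into the image with a pinhole model and tests it against a padded 2D bbox. It assumes the centroid has already been transformed into the camera optical frame (x right, y down, z forward) and ignores lens distortion; the function names are hypothetical, and the intrinsics are the values from the camera_intrinsic section.

```python
from typing import Optional, Tuple

# Intrinsics copied from the camera_intrinsic section of the config.
FX, FY, CX, CY = 613.783, 612.895, 416.969, 240.223


def project_to_pixel(x: float, y: float, z: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a point given in the camera optical frame.

    Returns None for points at or behind the image plane.
    """
    if z <= 0:
        return None
    return (FX * x / z + CX, FY * y / z + CY)


def centroid_in_bbox(centroid: Tuple[float, float, float],
                     bbox: Tuple[float, float, float, float],
                     padding: float = 20) -> bool:
    """Check whether the projected centroid falls inside the padded bbox.

    bbox is (left, top, width, height) in pixels.
    """
    pixel = project_to_pixel(*centroid)
    if pixel is None:
        return False
    u, v = pixel
    left, top, width, height = bbox
    return (left - padding <= u <= left + width + padding
            and top - padding <= v <= top + height + padding)
```

The padding makes the match tolerant to small calibration and synchronization errors between the lidar and the camera, at the cost of more ambiguous matches when pedestrians stand close together.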
q_xy: bbox center position state variance (Unit: Pixel^2)
q_wh: bbox size state variance (Unit: Pixel^2)
p_xy_pos: bbox center position initial variance (Unit: Pixel^2)
p_xy_dp: bbox center position velocity initial variance (Unit: (Pixel/s)^2)
p_wh_size: bbox size initial variance (Unit: Pixel^2)
p_wh_ds: bbox size velocity initial variance (Unit: (Pixel/s)^2)
r_theta/r_f/r_tx/r_ty: observation variance
residual_threshold: if the residual is higher than this param, this observation will be rejected
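The role of the observation variance and residual_threshold can be illustrated with a deliberately simplified scalar Kalman measurement update in Python. This is not the filter used by the package, whose state and observation vectors are multi-dimensional; the gated_update function and the use of the absolute residual for gating are assumptions of this sketch.

```python
from typing import Tuple


def gated_update(x: float, p: float, z: float, r: float,
                 residual_threshold: float) -> Tuple[float, float, bool]:
    """One scalar Kalman measurement update with residual gating.

    x, p: prior state estimate and its variance
    z, r: observation and its variance
    Returns (posterior x, posterior p, accepted flag).
    """
    residual = z - x
    if abs(residual) > residual_threshold:
        # Observation rejected: keep the predicted state unchanged.
        return x, p, False
    gain = p / (p + r)
    return x + gain * residual, (1.0 - gain) * p, True
```

For example, with a prior variance of 100 and an observation variance of 4, an accepted observation pulls the estimate almost all the way to the measurement, while an observation whose residual exceeds the threshold leaves the state untouched.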
q_factor: position state variance (Unit (m/s^2)^2)
r_factor: position observation variance (Unit m^2)
p_pos: position initial variance (Unit m^2)
p_vel: velocity initial variance (Unit (m/s)^2)
start_predict_only_timeout: uses the detector update tick as the timeout count; if this tick is higher than this param, only the state prediction is used to track the position, to avoid degradation caused by false 2d tracking
stop_track_timeout: using the detector update tick as the timeout count, if this tick is higher than this param, will stop 3d tracking to avoid drift
outlier_threshold: if the residual ratio is higher than this param, this observation will be rejected
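The two timeouts above describe a small state machine for 3D tracking, which can be sketched in Python as follows. The TrackingMode enum and the tracking_mode function are hypothetical names; the thresholds and the "higher than" comparisons follow the parameter descriptions, and the defaults are the values from the config.

```python
from enum import Enum


class TrackingMode(Enum):
    UPDATE = "update"              # predict and correct with 3D observations
    PREDICT_ONLY = "predict_only"  # rely on the state prediction only
    STOPPED = "stopped"            # 3D tracking stopped to avoid drift


def tracking_mode(ticks_since_detector_update: int,
                  start_predict_only_timeout: int = 10,
                  stop_track_timeout: int = 15) -> TrackingMode:
    """Pick the 3D tracking mode from the detector update tick count."""
    if ticks_since_detector_update > stop_track_timeout:
        return TrackingMode.STOPPED
    if ticks_since_detector_update > start_predict_only_timeout:
        return TrackingMode.PREDICT_ONLY
    return TrackingMode.UPDATE
```

With the default values, a target that the detector has not confirmed for 11 to 15 ticks is tracked by prediction only, and from tick 16 onward its 3D tracking stops.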
min_keypoints_to_track/keypoints_num_factor_area: minimal keypoints to track for one object in keypoints_num_factor_area (e.g. 40 keypoints in 8000 pixel^2)
corner_detector_max_num: maximum keypoints to track for one object
corner_detector_quality_level/corner_detector_min_distance/corner_detector_block_size/corner_detector_use_harris/corner_detector_k: corner detection params
min_keypoints_to_cal_H_mat: minimal number of keypoints to calculate the transformation matrix of an object. If the number of successfully tracked keypoints is less than this, the calculation is considered failed
min_keypoints_for_motion_estimation: minimal number of keypoints to calculate the transformation matrix of the motion of the platform. If the number of successfully tracked keypoints is less than this, the calculation is considered failed
min_pixel_dis_square_for_scene_point: used to remove scene points from the tracking bbox of an object
use_resize: resize images in optical flow tracking to speed it up
resize_factor: resize ratio
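One possible reading of the keypoint-count parameters above is sketched below in Python. The linear scaling of the minimum keypoint count with the bbox area is an assumption based on the "40 keypoints in 8000 pixel^2" example, and both function names are hypothetical; the package itself may compute these limits differently.

```python
import math


def min_keypoints_for_bbox(width: float, height: float,
                           min_keypoints_to_track: int = 40,
                           keypoints_num_factor_area: float = 8000) -> int:
    """Scale the required keypoint count linearly with the bbox area (assumed)."""
    area = width * height
    return math.ceil(min_keypoints_to_track * area / keypoints_num_factor_area)


def can_estimate_transform(tracked_keypoints: int,
                           min_keypoints_to_cal_H_mat: int = 10) -> bool:
    """The transform estimate is treated as failed below the minimum count."""
    return tracked_keypoints >= min_keypoints_to_cal_H_mat
```

Under this reading, an 80x100 pixel bbox requires 40 keypoints and a 40x100 pixel bbox requires 20, while an object with fewer than 10 successfully tracked keypoints gets no transformation matrix for that frame.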
ptl_reid_cpp
reid_db:
similarity_test_threshold: 1.0
same_id_threshold: 1.6
batch_ratio: 0.5
max_feat_num_one_object: 50
use_inverted_file_db_threshold: 2500
feat_dimension: 2048
find_first_k: 2
nlist_ratio: 50
sim_check_start_threshold: 5
reid_inference:
engine_file_name: "reid_engine.engine"
onnx_file_name: "reid.onnx"
inference_offline_batch_size: 1
inference_real_time_batch_size: 1
ptl_node
detect_every_k_frame: 5 # perform detection on every k-th frame to reduce GPU load
lidar_topic: "/rslidar_points"
camera_topic: "/camera2/color/image_raw/compressed"
min_offline_query_data_size: 20 # the minimal data size (feature size + image size) to query a dead tracking object. This param is to make sure that we will not query the wrong detected object
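The effect of detect_every_k_frame can be sketched with a short Python helper. The should_run_detector name is hypothetical, and starting the schedule at frame 0 is an assumption; on the remaining frames the trackers carry the targets without a detector update.

```python
def should_run_detector(frame_index: int, detect_every_k_frame: int = 5) -> bool:
    """Run the detector only on every k-th frame, starting at frame 0 (assumed)."""
    return frame_index % detect_every_k_frame == 0


# Frames on which the detector would run among the first 12 frames.
detector_frames = [i for i in range(12) if should_run_detector(i)]
```

A larger value lowers the GPU load but lengthens the interval during which new pedestrians go undetected and existing trackers are not corrected.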
Copy the .weights file of YOLO to ptl_ws/src/pedestrian_tracking_and_localizaiton/src/ptl_detector/asset. Copy the .onnx file of the re-identification model (which can be obtained from the fast-reid model zoo; you can also train your own model using fast-reid) to ptl_ws/src/pedestrian_tracking_and_localizaiton/src/ptl_reid_cpp/asset
Launch the node
cd ptl_ws
source devel/setup.zsh
roslaunch ptl_node ptl_node
Visualize the result
rosrun rviz rviz -d full_vis_2.rviz