Open RobotLearner2022 opened 5 days ago
1. The img_backbone, img_neck and depth_branch are the same: ResNet50, FPN and DenseDepthNet.
2. SparseDrive's head consists of 3 parts: det_head, map_head and motion_plan_head.
det_head: Sparse4DHead is the same as the Sparse4D object detection head. The only difference is that SparseDrive uses FlashAttention in place of MultiheadAttention. FlashAttention uses HBM-aware memory optimization, so it runs faster than MultiheadAttention. For deployment, you can use MultiheadAttention instead of the FlashAttention ops.
map_head: Sparse4DHead is almost the same as the Sparse4D object detection head, but the task output of the head is different:
a) The output task category of the former: dynamic traffic elements
class_names = [
"car",
"truck",
"construction_vehicle",
"bus",
"trailer",
"barrier",
"motorcycle",
"bicycle",
"pedestrian",
"traffic_cone",
]
b) The output task category of the latter: static traffic elements
map_class_names = [
'ped_crossing',
'divider',
'boundary',
]
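The FlashAttention-for-MultiheadAttention swap mentioned above is safe because both compute the same attention math; FlashAttention only changes how the computation is tiled across HBM. A minimal NumPy sketch of that shared math (plain scaled dot-product attention, not the actual FlashAttention kernel):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Standard attention: softmax(Q K^T / sqrt(d)) V.
    # FlashAttention produces the same result with tiled, HBM-aware
    # kernels, so swapping it for MultiheadAttention at deployment
    # changes speed/memory, not the output.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = np.random.randn(4, 8)   # 4 queries, dim 8
k = np.random.randn(6, 8)   # 6 keys
v = np.random.randn(6, 8)   # 6 values
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```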
motion_plan_head is a unique component of SparseDrive. It's designed for end-to-end autonomous driving, so it needs an additional motion and planning module: MotionPlanningHead. Actually, the operators in MotionPlanningHead are also quite familiar to us:
operation_order=(
    [
        "temp_gnn",
        "gnn",
        "norm",
        "cross_gnn",
        "norm",
        "ffn",
        "norm",
    ] * 3 +
    [
        "refine",
    ]
),
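A minimal sketch of how an operation_order list typically drives layer construction in Sparse4D-style heads (the op_map entries below are placeholders, not the real MotionPlanningHead modules):

```python
# The config repeats a 7-op transformer-style block 3 times,
# then appends a single "refine" stage.
operation_order = (
    ["temp_gnn", "gnn", "norm", "cross_gnn", "norm", "ffn", "norm"] * 3
    + ["refine"]
)

def build_layers(order, op_map):
    # One module instance per entry, in the listed order.
    return [op_map[name]() for name in order]

# Placeholder factories standing in for the real nn.Module builders.
op_map = {name: (lambda n=name: f"<{n} layer>") for name in set(operation_order)}

layers = build_layers(operation_order, op_map)
print(len(layers))  # 22 = 7 * 3 + 1
```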
We need to implement the following additional modules: map_head and motion_plan_head. Personally, I think 90% of them is similar, so there are no plans to refactor SparseDrive in the near future.
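Since the det and map heads share the Sparse4DHead architecture, most of the config can be reused. A hypothetical sketch (the dict keys and helper below are illustrative, not the actual SparseDrive config API), showing that the two heads differ mainly in their class lists:

```python
class_names = [
    "car", "truck", "construction_vehicle", "bus", "trailer",
    "barrier", "motorcycle", "bicycle", "pedestrian", "traffic_cone",
]
map_class_names = ["ped_crossing", "divider", "boundary"]

def make_head_cfg(classes):
    # Same head type for both tasks; only the task-specific
    # class list (and hence num_classes) changes.
    return dict(type="Sparse4DHead",
                class_names=classes,
                num_classes=len(classes))

det_head_cfg = make_head_cfg(class_names)       # dynamic traffic elements
map_head_cfg = make_head_cfg(map_class_names)   # static traffic elements
print(det_head_cfg["num_classes"], map_head_cfg["num_classes"])  # 10 3
```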
Thank you for sharing the great work. Very helpful. I wonder if you have any plans to simplify SparseDrive (https://github.com/swc-17/SparseDrive)? Which part of the released code can be re-utilized? Thanks again!