This PR aims to reproduce CC-3DT training on the nuScenes dataset.
Features
Data
MultiViewDataset
[x] Support sorting function.
[x] Fix UniformViewSampler with nonconsecutive frame ids.
[x] Support retry and SequentialViewSampler.
CBGSDataset
[x] Support CBGS (class-balanced grouping and sampling) to address class imbalance.
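For reviewers unfamiliar with CBGS, the resampling idea can be sketched roughly as below. This is only an illustration under simplifying assumptions, not the `CBGSDataset` code in this PR: `cbgs_indices` and its arguments are hypothetical names, and every class is simply drawn the same number of times with replacement.

```python
import random
from collections import defaultdict


def cbgs_indices(sample_classes, seed=0):
    """Return resampled dataset indices with an equal number of draws per class.

    sample_classes[i] lists the class names present in sample i
    (hypothetical input format, chosen for this sketch).
    """
    rng = random.Random(seed)

    # Group sample indices by every class they contain.
    class_to_samples = defaultdict(list)
    for idx, classes in enumerate(sample_classes):
        for cls in set(classes):
            class_to_samples[cls].append(idx)

    if not class_to_samples:
        return []

    # Total number of (class, sample) pairs, split evenly across classes.
    total = sum(len(v) for v in class_to_samples.values())
    per_class = total // len(class_to_samples)

    # Draw the same number of samples (with replacement) for each class.
    indices = []
    for cls in sorted(class_to_samples):
        indices.extend(rng.choices(class_to_samples[cls], k=per_class))
    return indices


# Eight samples containing only "car" and two containing only "bicycle".
example = [["car"]] * 8 + [["bicycle"]] * 2
resampled = cbgs_indices(example)
```

In this toy input the rare class is oversampled until it contributes as many draws as the frequent one.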
ResampleDataset
[x] Support index resampling so that the number of samples per epoch matches other codebases (MMEngine and Detectron2).
Datasets
[x] Refactor nuScenes Monocular datasets.
[x] Refactor VideoDataset. Replace video_to_indices with VideoMapping, which contains video_to_indices and video_to_frame_ids, to handle nonconsecutive frames.
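A rough sketch of what such a mapping could look like is given below. The two key names come from this PR; the value types, the `build_video_mapping` helper, and the `(video_name, frame_id)` input format are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import TypedDict


class VideoMapping(TypedDict):
    """Per-video lookup tables (value types are assumed for this sketch)."""

    video_to_indices: dict[str, list[int]]  # dataset indices of each video
    video_to_frame_ids: dict[str, list[int]]  # frame ids of each video


def build_video_mapping(samples: list[tuple[str, int]]) -> VideoMapping:
    """Build the mapping from (video_name, frame_id) pairs in dataset order."""
    video_to_indices = defaultdict(list)
    video_to_frame_ids = defaultdict(list)
    for idx, (video_name, frame_id) in enumerate(samples):
        video_to_indices[video_name].append(idx)
        video_to_frame_ids[video_name].append(frame_id)
    return VideoMapping(
        video_to_indices=dict(video_to_indices),
        video_to_frame_ids=dict(video_to_frame_ids),
    )


# Video "a" has nonconsecutive frame ids 0, 2, 7 at dataset indices 0, 1, 3.
mapping = build_video_mapping([("a", 0), ("a", 2), ("b", 5), ("a", 7)])
```

Keeping the frame ids next to the dataset indices lets a view sampler reason about temporal gaps instead of assuming that neighboring indices are neighboring frames.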
Engine
LossModule
[x] Support loss weight in dictionary format.
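The idea of dictionary-style loss weights can be illustrated as follows. This is a minimal sketch with plain floats, not the `LossModule` API: `apply_loss_weights` is a hypothetical helper, and defaulting missing keys to a weight of 1.0 is an assumption.

```python
def apply_loss_weights(losses, weights):
    """Scale each named loss by its weight; unlisted losses keep weight 1.0."""
    return {name: value * weights.get(name, 1.0) for name, value in losses.items()}


# Hypothetical loss names, used only for this example.
raw_losses = {"loss_cls": 2.0, "loss_bbox": 3.0}
weighted = apply_loss_weights(raw_losses, {"loss_cls": 0.5})
total_loss = sum(weighted.values())
```

A per-key dictionary makes it possible to reweight a single term of a multi-output loss without touching the others.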
Eval
NuScenes Evaluator
[x] Split 3D Detection & 3D Tracking evaluators.
[x] Refactor inputs to align with vis4d.data.const.CommonKeys.
[x] Support 3D Detection evaluation; the 3D Tracking evaluator only saves predictions due to dependency issues.
Model
QDTrack
[x] Split QDTrack into QDTrackHead (vis4d.op.track) and QDTrackGraph (vis4d.state.track) to make them easy to reuse.
CC-3DT
[x] Split CC3DTrack into QDTrackHead and CC3DTrackGraph.
[x] Create CC3DTrackGraph in vis4d.state.track3d, which updates the track memory using association and a motion model.
[x] Refactor QD3DTBBox3DHead to a single functor.
[x] Refactor Box3DUncertaintyLoss and move it to vis4d.op.
[x] Refactor Track3DOut to align with vis4d.data.const.CommonKeys.
[x] Add training configs.
[x] Add model zoo.
[x] Reproduce R50 performance.
[x] Reproduce R101 performance.
Op
Weight initialization
[x] Support weight initialization functions.
Bug Fixes
[x] Rename some modules (detect_3d to detect3d / track_3d to track3d / vis4d.op.detect_3d.filter to vis4d.op.detect3d.util).
[x] Fix the in-place operation of the shared convolution in RCNNHead.
[x] Fix missing weight init in RCNNHead and QD3DTBBox3DHead.
[x] Resolve the ordering issue of visualizer callback and COCO evaluator.
[x] Fix get_rank function on local machine.
[x] Fix loss logging.
[x] Fix the learning rate scheduler for OneCycleLR.