The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020)
Features:
- Two model architectures: resnet_18 and fpn_resnet_18 (set --arch fpn_resnet_18 to use the FPN variant)
- Training with images from both the left and right cameras (the probability of using the left camera is set via the use_left_cam_prob argument)

Some modifications from the paper (a minimal sketch of modifications 1 and 2 follows this list):
1. Formula (3): the log operator can't take a negative input, so please don't normalize dim as mentioned in the paper, because the normalized dim values may be less than 0. Hence, I've directly regressed to absolute dimension values in meters.
2. An L1 loss is used for depth estimation (applying the sigmoid activation to the depth output first).
3. Formula (5): I haven't taken the absolute values of the ground truth; I've used the relative values instead. The code is here.
4. Formula (7): argmin instead of argmax.
5. Heatmaps for the centers and vertexes of objects are generated as in the CenterNet paper [1]. If you want to use the strategy from the RTM3D paper, pass the --dynamic-sigma argument to the train.py script.
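The snippet below is a minimal sketch of modifications 1 and 2 above: dimensions are regressed directly in meters with an L1 loss, and an L1 loss is applied to a sigmoid-activated depth output. It is illustrative only; the 1/sigmoid(x) - 1 transform is CenterNet's depth parameterization and is an assumption here, and the function names are hypothetical.

```python
import torch
import torch.nn.functional as F

def dimension_loss(pred_dim, gt_dim):
    # pred_dim, gt_dim: (N, 3) absolute object dimensions (h, w, l) in meters.
    # No log / normalization of dim, per modification 1 above.
    return F.l1_loss(pred_dim, gt_dim, reduction='mean')

def depth_loss(pred_depth_raw, gt_depth):
    # Apply the sigmoid activation to the raw depth output first, then map
    # the (0, 1) value to a positive depth in meters (assumed transform).
    depth = 1.0 / torch.sigmoid(pred_depth_raw) - 1.0
    return F.l1_loss(depth, gt_depth, reduction='mean')
```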
Install the required packages:

pip install -U -r requirements.txt
Download the 3D KITTI detection dataset from here.
The downloaded data includes:
- Left color images of the object data set (image_2)
- Right color images of the object data set (image_3)
- Camera calibration matrices of the object data set
- Training labels of the object data set
Please make sure that you construct the source code and dataset directory structure as shown in the folder structure section below.
The model takes only RGB images as input and outputs the main center heatmap, the vertexes heatmap, and the vertexes coordinates as the base module to estimate the 3D bounding box.
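As background, the heatmap targets are built by splatting a 2D Gaussian peak at each keypoint location. The snippet below is a minimal CenterNet-style sketch (illustrative, not this repo's exact code; the sigma rule shown is one common fixed choice, and the dynamic-sigma training option changes how the spread is computed):

```python
import numpy as np

def gaussian2d(radius, sigma):
    # (2*radius+1) x (2*radius+1) Gaussian kernel centered at the middle cell.
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    g = np.exp(-(x * x + y * y) / (2 * sigma * sigma))
    g[g < np.finfo(g.dtype).eps * g.max()] = 0
    return g

def draw_keypoint(heatmap, center, radius):
    # Splat a Gaussian peak for one keypoint onto a (H, W) target heatmap.
    sigma = (2 * radius + 1) / 6.0  # common fixed rule; dynamic-sigma replaces this
    gaussian = gaussian2d(radius, sigma)
    cx, cy = int(center[0]), int(center[1])
    h, w = heatmap.shape
    left, right = min(cx, radius), min(w - cx, radius + 1)
    top, bottom = min(cy, radius), min(h - cy, radius + 1)
    region = heatmap[cy - top:cy + bottom, cx - left:cx + right]
    patch = gaussian[radius - top:radius + bottom, radius - left:radius + right]
    np.maximum(region, patch, out=region)  # keep the max where objects overlap
    return heatmap
```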
To visualize the dataset, execute:

cd src/data_process
python kitti_dataset.py

Then press n to see the next sample, or press Esc to quit.
Download the trained model from here (will be released), put it in ${ROOT}/checkpoints/, and execute:
python test.py --gpu_idx 0 --arch resnet_18 --pretrained_path ../checkpoints/rtm3d_resnet_18.pth
To evaluate the trained model, execute:

python evaluate.py --gpu_idx 0 --arch resnet_18 --pretrained_path <PATH>
To train on a single machine with a single GPU, execute:

python train.py --gpu_idx 0 --arch <ARCH> --batch_size <N> --num_workers <N>...
We should always use the nccl backend for multi-processing distributed training since it currently provides the best distributed training performance.

Single machine, multiple GPUs:

python train.py --dist-url 'tcp://127.0.0.1:29500' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
First machine
python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 0
Second machine
python train.py --dist-url 'tcp://IP_OF_NODE2:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1
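For orientation, the flags above roughly correspond to the following torch.distributed setup: --multiprocessing-distributed spawns one worker per GPU, and every worker joins a single NCCL process group reachable at --dist-url. This is a hedged sketch with illustrative names (main_worker, args.node_rank, args.total_gpus), not the actual code of train.py:

```python
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def main_worker(gpu, ngpus_per_node, args):
    # Global rank = node rank * GPUs per node + local GPU index.
    global_rank = args.node_rank * ngpus_per_node + gpu
    dist.init_process_group(backend='nccl',
                            init_method=args.dist_url,   # e.g. 'tcp://IP_OF_NODE1:FREEPORT'
                            world_size=args.total_gpus,  # total GPUs across all nodes
                            rank=global_rank)
    torch.cuda.set_device(gpu)
    # Wrap the model so gradients are all-reduced across workers:
    # model = torch.nn.parallel.DistributedDataParallel(model.cuda(gpu), device_ids=[gpu])

# Spawn one process per local GPU:
# ngpus = torch.cuda.device_count()
# mp.spawn(main_worker, nprocs=ngpus, args=(ngpus, args))
```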
To reproduce the results, you can run the bash shell script:

./train.sh
To track the training progress, go to the logs/ folder and launch TensorBoard:

cd logs/<saved_fn>/tensorboard/
tensorboard --logdir=./
If you think this work is useful, please give me a star! If you find any errors or have any suggestions, please contact me (email: nguyenmaudung93.kstn@gmail.com). Thank you!
@inproceedings{RTM3D,
  author = {Peixuan Li and Huaici Zhao and Pengfei Liu and Feidao Cao},
  title = {RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year = {2020},
}
@misc{RTM3D-PyTorch,
  author = {Nguyen Mau Dung},
  title = {{RTM3D-PyTorch: PyTorch Implementation of the RTM3D paper}},
  howpublished = {\url{https://github.com/maudzung/RTM3D-PyTorch}},
  year = {2020},
}
[1] CenterNet: Objects as Points paper, PyTorch Implementation
${ROOT}
├── checkpoints/
│   ├── rtm3d_resnet_18.pth
│   └── rtm3d_fpn_resnet_18.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ (left color camera)
│       │   ├── image_3/ (right color camera)
│       │   ├── calib/
│       │   └── label_2/
│       ├── testing/
│       │   ├── image_2/ (left color camera)
│       │   ├── image_3/ (right color camera)
│       │   └── calib/
│       └── classes_names.txt
├── src/
│   ├── config/
│   │   ├── train_config.py
│   │   └── kitti_config.py
│   ├── data_process/
│   │   ├── kitti_dataloader.py
│   │   ├── kitti_dataset.py
│   │   └── kitti_data_utils.py
│   ├── models/
│   │   ├── fpn_resnet.py
│   │   ├── resnet.py
│   │   └── model_utils.py
│   ├── utils/
│   │   ├── evaluation_utils.py
│   │   ├── logger.py
│   │   ├── misc.py
│   │   ├── torch_utils.py
│   │   └── train_utils.py
│   ├── evaluate.py
│   ├── test.py
│   ├── train.py
│   └── train.sh
├── README.md
└── requirements.txt
usage: train.py [-h] [--seed SEED] [--saved_fn FN] [--root-dir PATH]
[--arch ARCH] [--pretrained_path PATH] [--head_conv HEAD_CONV]
[--hflip_prob HFLIP_PROB]
[--use_left_cam_prob USE_LEFT_CAM_PROB] [--dynamic-sigma]
[--no-val] [--num_samples NUM_SAMPLES]
[--num_workers NUM_WORKERS] [--batch_size BATCH_SIZE]
[--print_freq N] [--tensorboard_freq N] [--checkpoint_freq N]
[--start_epoch N] [--num_epochs N] [--lr_type LR_TYPE]
[--lr LR] [--minimum_lr MIN_LR] [--momentum M] [-wd WD]
[--optimizer_type OPTIMIZER] [--steps [STEPS [STEPS ...]]]
[--world-size N] [--rank N] [--dist-url DIST_URL]
[--dist-backend DIST_BACKEND] [--gpu_idx GPU_IDX] [--no_cuda]
[--multiprocessing-distributed] [--evaluate]
[--resume_path PATH] [--K K]
The Implementation of RTM3D using PyTorch
optional arguments:
-h, --help show this help message and exit
--seed SEED random seed used to reproduce the results
--saved_fn FN The name using for saving logs, models,...
--root-dir PATH The ROOT working directory
--arch ARCH The name of the model architecture
--pretrained_path PATH
the path of the pretrained checkpoint
--head_conv HEAD_CONV
conv layer channels for the output head: 0 for no conv
layer, -1 for the default setting (64 for resnets and
256 for dla)
--hflip_prob HFLIP_PROB
The probability of horizontal flip
--use_left_cam_prob USE_LEFT_CAM_PROB
The probability of using the left camera
--dynamic-sigma If true, compute sigma based on Amax and Amin, then
generate the heatmap; if false, compute the radius as
CenterNet did
--no-val If true, don't evaluate the model on the val set
--num_samples NUM_SAMPLES
Take a subset of the dataset to run and debug
--num_workers NUM_WORKERS
Number of threads for loading data
--batch_size BATCH_SIZE
mini-batch size (default: 16); this is the total batch
size of all GPUs on the current node when using Data
Parallel or Distributed Data Parallel
--print_freq N print frequency (default: 50)
--tensorboard_freq N frequency of saving tensorboard (default: 50)
--checkpoint_freq N frequency of saving checkpoints (default: 5)
--start_epoch N the starting epoch
--num_epochs N number of total epochs to run
--lr_type LR_TYPE the type of learning rate scheduler (cosin or
multi_step)
--lr LR initial learning rate
--minimum_lr MIN_LR minimum learning rate during training
--momentum M momentum
-wd WD, --weight_decay WD
weight decay (default: 1e-6)
--optimizer_type OPTIMIZER
the type of optimizer, it can be sgd or adam
--steps [STEPS [STEPS ...]]
number of burn-in steps
--world-size N number of nodes for distributed training
--rank N node rank for distributed training
--dist-url DIST_URL url used to set up distributed training
--dist-backend DIST_BACKEND
distributed backend
--gpu_idx GPU_IDX GPU index to use.
--no_cuda If true, cuda is not used.
--multiprocessing-distributed
Use multi-processing distributed training to launch N
processes per node, which has N GPUs. This is the
fastest way to use PyTorch for either single node or
multi node data parallel training
--evaluate only evaluate the model, without training
--resume_path PATH the path of the resumed checkpoint
--K K the number of top K detections to keep
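A note on --K: when decoding the predicted heatmaps, only the K highest-scoring peaks are kept. Below is a minimal CenterNet-style sketch of that step (topk_from_heatmap is a hypothetical helper, not this repo's API):

```python
import torch
import torch.nn.functional as F

def topk_from_heatmap(heatmap, K=50):
    # heatmap: (B, C, H, W) with scores in [0, 1].
    # A 3x3 max-pool keeps only local maxima (CenterNet-style pseudo-NMS).
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    heatmap = heatmap * (pooled == heatmap).float()
    batch, cat, height, width = heatmap.shape
    scores, inds = torch.topk(heatmap.view(batch, cat, height * width), K)
    ys = torch.div(inds, width, rounding_mode='floor').float()  # peak rows
    xs = (inds % width).float()                                 # peak columns
    return scores, xs, ys
```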