Overview of D3T: Our D3T model consists of two stages. Burn-in Stage: we initialize the object detector by training on labeled data from the RGB domain. Zigzag Learning Stage: two distinct, interleaved training components, one for the thermal domain and one for the RGB domain. At each training step, the student model trains on images from a single domain but leverages knowledge from both teachers for more effective learning, and only the teacher model corresponding to the trained domain is updated.
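To make the zigzag schedule concrete, here is a minimal PyTorch-style sketch. It assumes hypothetical module interfaces (a student that returns a scalar loss from images and fused pseudo-labels) and a standard EMA teacher update; it illustrates the alternation described above, not the exact D3T implementation:

import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999) -> None:
    # Exponential moving average: teacher <- alpha * teacher + (1 - alpha) * student.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1 - alpha)

def zigzag_stage(student, teacher_rgb, teacher_thermal,
                 rgb_loader, thermal_loader, optimizer, steps):
    for step in range(steps):
        thermal_turn = (step % 2 == 0)
        images = next(thermal_loader if thermal_turn else rgb_loader)
        with torch.no_grad():
            # The student learns from pseudo-labels produced by BOTH teachers.
            pseudo = (teacher_rgb(images), teacher_thermal(images))
        loss = student(images, pseudo)  # hypothetical: student returns a scalar loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Only the teacher of the currently trained domain is EMA-updated.
        ema_update(teacher_thermal if thermal_turn else teacher_rgb, student)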
# Prepare environments via conda
conda create -n D3T python=3.8.5
conda activate D3T
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
# install cvpods
git clone https://github.com/EdwardDo69/D3T.git
cd D3T
python3 -m pip install -e cvpods
# recommend wandb for visualizing the training
pip install wandb
pip install imgaug
# Install specific package versions
pip install numpy==1.20.3
pip install setuptools==59.5.0
pip install Pillow==9.2.0
pip install scikit-learn
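After installing, an optional sanity check verifies the pinned versions and that CUDA is visible:

import torch, torchvision, numpy
print(torch.__version__)          # expect 1.10.1
print(torchvision.__version__)    # expect 0.11.2
print(numpy.__version__)          # expect 1.20.3
print(torch.cuda.is_available())  # should print True with cudatoolkit 11.3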
All data arrangements follow the PASCAL_VOC format. The dataset files go in cvpods/data/, and the config paths are defined in cvpods/cvpods/data/datasets/paths_route.py. Please refer to cvpods for details. The expected layout is:
[data]
└── FLIR_ICIP2020_aligned
    ├── AnnotatedImages
    ├── Annotations
    ├── ImageSets
    └── JPEGImages
Move Annotations, JPEGImages, and AnnotatedImages to ./cvpods/data, following the layout above.
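Optionally, a short script (path assumed from the layout above) can confirm the folders are in place:

import os

root = "cvpods/data/FLIR_ICIP2020_aligned"  # assumed from the layout above
for sub in ("AnnotatedImages", "Annotations", "ImageSets", "JPEGImages"):
    path = os.path.join(root, sub)
    print(path, "OK" if os.path.isdir(path) else "MISSING")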
We use VGG16 as the backbone; the pretrained model can be downloaded from this link. Then MODEL.WEIGHTS in config.py should be updated accordingly.
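cvpods-style experiment configs define such fields in a Python dict; the following is only a hedged sketch of the relevant fragment (the real config.py in this repo has many more fields, and the weights path is a placeholder):

_config_dict = dict(
    MODEL=dict(
        # Placeholder path: point this at the downloaded VGG16 weights.
        WEIGHTS="/absolute/path/to/vgg16_pretrained.pth",
    ),
)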
To train on the FLIR benchmark:
cd experiment/flir_rgb2thermal/
CUDA_VISIBLE_DEVICES=0,1,2,3 pods_train --dir . --dist-url "tcp://127.0.0.1:29007" --num-gpus 4 OUTPUT_DIR 'outputs/thermal'
To visualize the training with wandb, specify your wandb account in runner.py and then append WANDB True to the training command.
To test a trained model:
CUDA_VISIBLE_DEVICES=0 pods_test --dir . --num-gpus 1 MODEL.WEIGHTS $model_path
For example:
CUDA_VISIBLE_DEVICES=1 pods_test --num-gpus 1 --dir . --dist-url "tcp://127.0.0.1:29055" \
MODEL.WEIGHTS D3T/experiment/flir_rgb2thermal/outputs/thermal/best.pth \
OUTPUT_DIR D3T/experiment/flir_rgb2thermal/outputs/test
Note that if you provide a relative model path, $model_path is resolved relative to cvpods. It is recommended to use an absolute path to make sure the right model is loaded.
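One way to avoid the relative-path pitfall is to resolve the checkpoint to an absolute path first, e.g. with a small Python one-off (the checkpoint location below is just an example):

import os

# Example checkpoint location; adjust to wherever your model was saved.
print(os.path.abspath("experiment/flir_rgb2thermal/outputs/thermal/best.pth"))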
To facilitate verification of our results, we provide our checkpoints for the FLIR and KAIST datasets. Please download them from the following link.
This repo is developed based on Harmonious Teacher and cvpods; please check those projects for more details and features.
@inproceedings{do2024d3t,
  title={D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection},
  author={Do, Dinh Phat and Kim, Taehoon and Na, Jaemin and Kim, Jiwon and Lee, Keonho and Cho, Kyunghwan and Hwang, Wonjun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23313--23322},
  year={2024}
}
This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.
For inquiries, please contact: phatai@ajou.ac.kr