SysCV / shift-detection-tta

This repository implements continuous test-time adaptation algorithms for object detection on the SHIFT dataset.
MIT License

training cost #1

Closed twangnh closed 1 year ago

twangnh commented 1 year ago

@mattiasegu Hi Mattia, thanks for sharing the detection code for SHIFT. Could you please give some information on the training cost, e.g. how long training takes and how many images are in the training and validation/test sets?

mattiasegu commented 1 year ago

Hi @twangnh! Thanks for your interest in our codebase.

Training time depends on the chosen object detector and the available computational resources. Here I'll report details for a YOLOX detector.


Source training (with 4 NVIDIA A40 GPUs)

We trained the source model on the clear-daytime discrete subset (see config file). The number of training images is ~10400. Training for 24 epochs took 4:30 hours.

The training log for the source domain is available in 20230621_184939.log.
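As a rough sanity check on these numbers, the following sketch converts them into per-epoch time and throughput. The function name `training_throughput` is hypothetical and not part of the repository; the image count is the approximate figure quoted above, and 4:30 hours is taken as 4.5 hours.

```python
def training_throughput(num_images, epochs, hours, num_gpus):
    """Derive simple cost figures from a reported training run.

    Hypothetical helper for illustration only; it is not part of the codebase.
    """
    images_processed = num_images * epochs  # total images seen during training
    total_seconds = hours * 3600
    images_per_second = images_processed / total_seconds
    return {
        "minutes_per_epoch": hours * 60 / epochs,
        "images_per_second": images_per_second,
        "images_per_second_per_gpu": images_per_second / num_gpus,
    }


# Figures reported above: ~10400 images, 24 epochs, 4:30 hours, 4 A40 GPUs.
stats = training_throughput(num_images=10400, epochs=24, hours=4.5, num_gpus=4)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

These figures only describe this particular run. On different hardware, or with a different detector or number of epochs, the wall-clock time will not necessarily scale linearly.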


Continuous adaptation on the val set (with 1 NVIDIA GeForce RTX 3090 GPU)

For continuous test-time adaptation, we adapt and test the pre-trained source model on the set of sequences with continuous domain shift starting from clear-daytime conditions.

Testing the no-adaptation baseline on the validation set takes less than 5 minutes. See the log file 20230625_200221.log.

Adaptation with the mean-teacher baseline takes roughly 1:30 hours on the validation set. See the log file 20230622_205656.log. Please note that this baseline's hyperparameters were not tuned, and performance can improve with hyperparameter tuning. Adaptation time also depends on the number of optimizer iterations.


Continuous adaptation on the test set (with 1 NVIDIA GeForce RTX 3090 GPU)

Submitting to the challenge requires continuous adaptation on the test set. The test set is about 3.3x larger than the val set (8000 vs 2400 images), so testing/adaptation takes <4x longer than on the validation set.
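To make the scaling concrete, here is a small sketch that extrapolates the validation timings to the test set. It assumes runtime grows linearly with the number of images, which is only an approximation; `estimate_test_hours` is a hypothetical helper, not something provided by the repository.

```python
VAL_IMAGES = 2400   # validation set size quoted above
TEST_IMAGES = 8000  # test set size quoted above


def estimate_test_hours(val_hours, val_images=VAL_IMAGES, test_images=TEST_IMAGES):
    """Linearly extrapolate a validation-set runtime to the test set.

    Assumption: runtime is proportional to the number of images.
    """
    return val_hours * test_images / val_images


size_ratio = TEST_IMAGES / VAL_IMAGES

# Mean-teacher baseline: roughly 1:30 hours on the validation set.
mean_teacher_test_hours = estimate_test_hours(1.5)

# No-adaptation baseline: under 5 minutes on the validation set (upper bound).
no_adapt_test_minutes = estimate_test_hours(5 / 60) * 60

print(f"test/val size ratio: {size_ratio:.2f}x")
print(f"mean-teacher on test set: ~{mean_teacher_test_hours:.1f} hours")
print(f"no adaptation on test set: under ~{no_adapt_test_minutes:.0f} minutes")
```

Under this assumption, the untuned mean-teacher baseline would need about 5 hours on a single RTX 3090, which is consistent with the <4x bound (6 hours).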

To sum up, we chose the clear-daytime subset to make the challenge computationally feasible for most people. Hope this helps!