Hi @Layjins, I'm afraid that the limit cannot be lifted as this would be unfair to other participants.
One easy way to circumvent the GPU RAM limit while using distillation (like LwF) is to pre-record the output of the previous model for the current inputs (even including replay ones).
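A minimal sketch of that idea (PyTorch), assuming a frozen previous model `old_model` and a dataset of current + replay samples; the names are illustrative, not from the challenge codebase. The previous model is run once offline, its logits are cached, and only the cached tensors are needed during training of the new model:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

@torch.no_grad()
def prerecord_teacher_logits(old_model, dataset, device="cuda", batch_size=32):
    """Run the frozen previous model once over the current (and replay) data
    and cache its logits, so it never occupies GPU memory during training."""
    old_model.eval().to(device)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    cached = []
    for images, _ in loader:
        cached.append(old_model(images.to(device)).cpu())
    old_model.to("cpu")            # free the GPU before the new model trains
    return torch.cat(cached)       # [num_samples, num_old_classes]

def lwf_distillation_loss(new_logits, recorded_logits, T=2.0):
    """LwF-style KD loss against the pre-recorded soft targets."""
    log_p_new = F.log_softmax(new_logits[:, :recorded_logits.size(1)] / T, dim=1)
    p_old = F.softmax(recorded_logits / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
```

During training, the cached rows would be looked up by sample index (e.g., by having the dataset return indices), and the recorded logits correspond to the un-augmented inputs, which is exactly the limitation raised in the reply below.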
That would be a strict limitation for online knowledge distillation methods. Besides, pre-recording the outputs of the previous model also restricts the use of online data augmentations when training the current model.
For incremental object detection, we think 16 GB of GPU memory is not enough: (1) the regular knowledge distillation strategy is adopted; (2) it strictly limits the batch size; (3) in object detection, an 8-GPU training setup is more common.