Thank you for your interest in our work. DAFormer was trained on an NVIDIA RTX 2080 Ti with 11 GB memory. According to the training logs, 9.7 GB of GPU memory was utilized.
I used an RTX 2080 Ti with 11 GB memory to run this code with the default parameters and got an error: RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 10.76 GiB total capacity; 9.01 GiB already allocated; 156.00 MiB free; 9.15 GiB reserved in total by PyTorch). This machine only runs this one program, and the monitor occupies about 270 MB of GPU memory, so I am puzzled why "CUDA out of memory" appears. When I set batch_size=1, it trains normally and takes 8.5 hours for 40,000 iterations on the GTA→Cityscapes task.
I used a machine without display output; maybe that makes the difference.
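If it helps to pin down where the memory goes, here is a minimal sketch (plain PyTorch utilities, not part of DAFormer) that prints how much memory the training process itself holds versus what the card offers; calling it once per iteration shows whether the desktop's ~270 MB plus allocator overhead explains the OOM:

import torch

def print_gpu_memory(device=0):
    gib = 1024 ** 3
    # Memory currently occupied by live tensors in this process
    allocated = torch.cuda.memory_allocated(device) / gib
    # Memory reserved by PyTorch's caching allocator (>= allocated)
    reserved = torch.cuda.memory_reserved(device) / gib
    # Peak allocation since start-up (or the last reset_peak_memory_stats)
    peak = torch.cuda.max_memory_allocated(device) / gib
    # Total memory of the device
    total = torch.cuda.get_device_properties(device).total_memory / gib
    print(f'allocated {allocated:.2f} GiB | reserved {reserved:.2f} GiB | '
          f'peak {peak:.2f} GiB | total {total:.2f} GiB')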
To reduce GPU memory consumption, you can try sharing the backward pass of the source loss and the ImageNet feature distance (FD) loss:
# Train on source images
clean_losses = self.get_model().forward_train(
    img, img_metas, gt_semantic_seg)
clean_loss, clean_log_vars = self._parse_losses(clean_losses)
log_vars.update(clean_log_vars)

# ImageNet feature distance
# (src_feat holds the source features obtained earlier in the method)
if self.enable_fdist:
    feat_loss, feat_log = self.calc_feat_dist(img, gt_semantic_seg, src_feat)
    log_vars.update(add_prefix(feat_log, 'src'))
    # Add the FD loss so that a single backward pass covers both losses
    clean_loss = clean_loss + feat_loss

# Shared source backward
clean_loss.backward()
del clean_loss
if self.enable_fdist:
    del feat_loss
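For context on why sharing the backward helps: if the source loss and the FD loss are backpropagated separately, the first backward() must keep the computation graph alive (retain_graph=True) so that the second one can still run, which keeps the source activations in memory for longer. A generic PyTorch sketch of the two variants (hypothetical toy model and losses, not the DAFormer code):

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
x = torch.randn(8, 512)
target = torch.randint(0, 10, (8,))

# Variant A: two backward passes over the same forward graph.
# The first call has to retain the graph, so the intermediate
# activations stay in memory until the second backward finishes.
out = model(x)
main_loss = F.cross_entropy(out, target)
aux_loss = out.pow(2).mean()  # stand-in for an auxiliary loss such as FD
main_loss.backward(retain_graph=True)
aux_loss.backward()

model.zero_grad()

# Variant B (the snippet above): sum the losses and call backward once,
# so the graph can be freed immediately afterwards.
out = model(x)
loss = F.cross_entropy(out, target) + out.pow(2).mean()
loss.backward()

The accumulated gradients are identical in both variants; only the peak activation memory differs.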
Thank you very much for your help; I have solved the problem with your guidance. Thanks to your team for their work as well!
I'm working on some related research. I would appreciate it if you could let me know how much GPU memory is needed for training.