xziyh opened this issue 2 years ago (Open)
hello author, I am using a 3080 Ti (12 GB) to train Conditional DETR on the full COCO 2017 dataset, but the program reports CUDA out of memory. I used MSI Afterburner to monitor memory usage, and it shows a peak usage of only 2520 MB. I set the batch size to 1.

Hi,
Can you give us the exact training script, including all the training arguments? Then we can check whether memory is really insufficient or there is some other reason.
Hi, thanks for your reply. Here are the arguments:
lr=0.0001, lr_backbone=1e-05, batch_size=1, weight_decay=0.0001, epochs=1, lr_drop=40, clip_max_norm=0.1, frozen_weights=None, backbone='resnet50', dilation=False, position_embedding='sine', enc_layers=6, dec_layers=6, dim_feedforward=2048, hidden_dim=256, dropout=0.1, nheads=8, num_queries=300, pre_norm=False, masks=False, aux_loss=True, set_cost_class=2, set_cost_bbox=5, set_cost_giou=2, mask_loss_coef=1, dice_loss_coef=1, cls_loss_coef=2, bbox_loss_coef=5, giou_loss_coef=2, focal_alpha=0.25, dataset_file='coco', coco_path='coco', coco_panoptic_path=None, remove_difficult=False, output_dir='results', device='cuda', seed=42, resume='', start_epoch=0, eval=False, num_workers=2, world_size=1, dist_url='env://', distributed=False
That is strange. From your arguments, you are using the ResNet-50 backbone without dilation; in this setting, 12 GB of memory should be more than enough for batch size 1. I do not have a clue. Maybe restart your computer to make sure all background programs that might consume GPU memory are killed.
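A note on the 2520 MB reading: external monitors like MSI Afterburner sample at intervals and can miss a short allocation spike right before the OOM, and PyTorch's caching allocator reserves more memory than it has allocated at any moment. Below is a minimal sketch for logging memory from inside the training process, using standard `torch.cuda` utilities; `log_gpu_memory` and its call sites are hypothetical helpers, not part of the Conditional DETR code:

```python
import torch

def log_gpu_memory(tag: str, device: int = 0) -> None:
    """Print PyTorch allocator stats plus device-wide free memory."""
    mib = 2 ** 20
    allocated = torch.cuda.memory_allocated(device) / mib  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved(device) / mib    # bytes held by the caching allocator
    peak = torch.cuda.max_memory_allocated(device) / mib   # high-water mark since last reset
    free, total = (v / mib for v in torch.cuda.mem_get_info(device))  # device-wide, all processes
    print(f"[{tag}] allocated={allocated:.0f} MiB  reserved={reserved:.0f} MiB  "
          f"peak={peak:.0f} MiB  free={free:.0f}/{total:.0f} MiB")

if __name__ == "__main__":
    # Hypothetical call site; also call inside the training loop, e.g. every N iterations.
    log_gpu_memory("before training")
```

Calling this once before training starts shows whether another process is already holding part of the 12 GB; calling it every few iterations (with `torch.cuda.reset_peak_memory_stats()` at the start of each epoch) catches the transient peak that sampling tools can miss.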