FoundationVision / GenerateU

[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection

torch.cuda.OutOfMemoryError #10

Open · liang315 opened this issue 1 month ago

liang315 commented 1 month ago

File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward return F.linear(input, self.weight, self.bias) torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 496.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 235.06 MiB is free. Process 20398 has 14.52 GiB memory in use. Of the allocated memory 14.06 GiB is allocated by PyTorch, and 330.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

liang315 commented 1 month ago

WARNING [05/17 10:24:18 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model:
  head.{bias, weight}
  layers.0.blocks.0.attn.relative_position_bias_table
  layers.0.blocks.1.attn.relative_position_bias_table
  layers.0.blocks.1.attn_mask
  layers.1.blocks.0.attn.relative_position_bias_table
  layers.1.blocks.1.attn.relative_position_bias_table
  layers.1.blocks.1.attn_mask
  layers.2.blocks.0.attn.relative_position_bias_table
  layers.2.blocks.1.attn.relative_position_bias_table
  layers.2.blocks.1.attn_mask
  layers.2.blocks.2.attn.relative_position_bias_table
  layers.2.blocks.3.attn.relative_position_bias_table
  layers.2.blocks.3.attn_mask
  layers.2.blocks.4.attn.relative_position_bias_table
  layers.2.blocks.5.attn.relative_position_bias_table
  layers.2.blocks.5.attn_mask
  layers.3.blocks.0.attn.relative_position_bias_table
  layers.3.blocks.1.attn.relative_position_bias_table
  norm.{bias, weight}
[05/17 10:24:18 d2.engine.train_loop]: Starting training from iteration 0
/opt/conda/lib/python3.10/site-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
ERROR [05/17 10:24:21 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/kaggle/working/GenerateU/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/kaggle/working/GenerateU/detectron2/engine/defaults.py", line 498, in run_step
    self._trainer.run_step()
  File "/kaggle/working/GenerateU/detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/ddetrs_vl_uni.py", line 197, in forward
    output, loss_dict = self.detr.forward(images, targets, self.criterion, train=True, clip_object_descriptions_features=clip_object_descriptions_features, dataset_source=dataset_source, ann_type=ann_type)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/segmentation_condInst_new_encodfpn.py", line 143, in forward
    self.detr.transformer(srcs, masks, poses, query_embeds, mask_on=True)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 153, in forward
    memory = self.encoder(src_flatten, spatial_shapes, level_start_index, valid_ratios, lvl_pos_embed_flatten, mask_flatten)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 259, in forward
    output = layer(output, pos, reference_points, spatial_shapes, level_start_index, padding_mask)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 229, in forward
    src = self.forward_ffn(src)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 217, in forward_ffn
    src2 = self.linear2(self.dropout2(self.activation(self.linear1(src))))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 496.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 235.06 MiB is free. Process 20398 has 14.52 GiB memory in use. Of the allocated memory 14.06 GiB is allocated by PyTorch, and 330.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
[05/17 10:24:21 d2.engine.hooks]: Total training time: 0:00:02 (0:00:00 on hooks)
[05/17 10:24:21 d2.utils.events]: iter: 0 lr: N/A max_mem: 14396M
Traceback (most recent call last):
  File "/kaggle/working/GenerateU/projects/DDETRS/train_net.py", line 249, in <module>
    launch(
  File "/kaggle/working/GenerateU/detectron2/engine/launch.py", line 82, in launch
    main_func(*args)
  File "/kaggle/working/GenerateU/projects/DDETRS/train_net.py", line 233, in main
    trainer.train()
  File "/kaggle/working/GenerateU/detectron2/engine/defaults.py", line 488, in train
    super().train(self.start_iter, self.max_iter)
  File "/kaggle/working/GenerateU/detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/kaggle/working/GenerateU/detectron2/engine/defaults.py", line 498, in run_step
    self._trainer.run_step()
  File "/kaggle/working/GenerateU/detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/ddetrs_vl_uni.py", line 197, in forward
    output, loss_dict = self.detr.forward(images, targets, self.criterion, train=True, clip_object_descriptions_features=clip_object_descriptions_features, dataset_source=dataset_source, ann_type=ann_type)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/segmentation_condInst_new_encodfpn.py", line 143, in forward
    self.detr.transformer(srcs, masks, poses, query_embeds, mask_on=True)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 153, in forward
    memory = self.encoder(src_flatten, spatial_shapes, level_start_index, valid_ratios, lvl_pos_embed_flatten, mask_flatten)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 259, in forward
    output = layer(output, pos, reference_points, spatial_shapes, level_start_index, padding_mask)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 229, in forward
    src = self.forward_ffn(src)
  File "/kaggle/working/GenerateU/projects/DDETRS/ddetrs/models/deformable_detr/deformable_transformer.py", line 217, in forward_ffn
    src2 = self.linear2(self.dropout2(self.activation(self.linear1(src))))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 496.00 MiB. GPU 0 has a total capacty of 14.75 GiB of which 235.06 MiB is free. Process 20398 has 14.52 GiB memory in use. Of the allocated memory 14.06 GiB is allocated by PyTorch, and 330.42 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
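
Two further knobs follow directly from this log: the DataLoader warning says this machine supports at most 4 worker processes (the run asks for 8), and the job runs out of memory at iteration 0 on a ~14.75 GiB GPU, which usually means the per-GPU batch size has to come down as well. Below is a rough sketch using stock detectron2 config keys; whether the DDETRS configs in GenerateU keep exactly these keys (DATALOADER.NUM_WORKERS, SOLVER.IMS_PER_BATCH) and where the project YAML lives are assumptions to check against the repo:

```python
# Hedged sketch: assumes projects/DDETRS/train_net.py follows the usual
# detectron2 pattern of get_cfg() + merge_from_file() + merge_from_list(opts),
# and that the stock keys below are still honoured by the project config.
from detectron2.config import get_cfg

cfg = get_cfg()
# cfg.merge_from_file("projects/DDETRS/configs/<your_config>.yaml")  # illustrative path
cfg.merge_from_list([
    "DATALOADER.NUM_WORKERS", "4",  # the UserWarning above suggests at most 4 on this machine
    "SOLVER.IMS_PER_BATCH", "1",    # smallest per-step batch; raise only if memory allows
])
print(cfg.DATALOADER.NUM_WORKERS, cfg.SOLVER.IMS_PER_BATCH)
```

If train_net.py exposes detectron2's standard opts mechanism, the same overrides can usually be appended to the launch command as trailing KEY VALUE pairs instead of editing code.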