dyabel / detpro

Apache License 2.0
feature dimensions do not match #13

Closed cailk closed 2 years ago

cailk commented 2 years ago

Hi, I'm trying to run the following command from prepare.sh:

CUDA_VISIBLE_DEVICES=6,7 ./tools/dist_train.sh configs/lvis/prompt_save_train_reuse.py 2 --work-dir workdirs/prompt_save_train

but it fails with the following error:

Traceback (most recent call last):
  File "./tools/train.py", line 199, in <module>
    main()
  File "./tools/train.py", line 188, in main
    train_detector(
  File "/home/ubuntu/work/code/detpro/mmdet/apis/train.py", line 151, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 46, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/ubuntu/work/code/detpro/mmdet/models/detectors/base.py", line 246, in train_step
    losses = self(**data)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/home/ubuntu/work/code/detpro/mmdet/models/detectors/base.py", line 180, in forward
    return self.forward_train(img,img_no_normalize, img_metas, **kwargs)
  File "/home/ubuntu/work/code/detpro/mmdet/models/detectors/mask_rcnn.py", line 83, in forward_train
    roi_losses = self.roi_head.forward_train(x, img, img_no_normalize, img_metas, proposal_list,proposals,
  File "/home/ubuntu/work/code/detpro/mmdet/models/roi_heads/standard_roi_head_collect_reuse.py", line 269, in forward_train
    bbox_results = self._bbox_forward_train(x,img,sampling_results,proposals_pre_computed,
  File "/home/ubuntu/work/code/detpro/mmdet/models/roi_heads/standard_roi_head_collect_reuse.py", line 364, in _bbox_forward_train
    bbox_results, region_embeddings = self._bbox_forward(x, rois)
  File "/home/ubuntu/work/code/detpro/mmdet/models/roi_heads/standard_roi_head_collect_reuse.py", line 305, in _bbox_forward
    bbox_pred = self.bbox_head(bbox_feats)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/work/code/detpro/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py", line 214, in forward
    bbox_pred = self.fc_reg(x_reg) if self.with_reg else None
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/ubuntu/anaconda3/envs/detpro/lib/python3.8/site-packages/torch/nn/functional.py", line 1690, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: mat1 dim 1 must match mat2 dim 0

It looks like the feature dimension does not match the weight dimension in the BBoxHead module. My guess is that this happens because the shared layers are commented out in the forward pass of the 'ConvFCBBoxHead' module. Could you please check the code?
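To illustrate the suspected cause, here is a minimal sketch in plain Python of the shape rule that `F.linear` enforces. The channel count, RoI size, FC width, RoI count, and regression output size are assumptions for illustration (256, 7, and 1024 are common mmdet defaults); they are not confirmed for this config, and `check_linear` is a hypothetical helper, not part of detpro or PyTorch.

```python
# Illustrative shape check only; no PyTorch required.
# Assumed values (common mmdet defaults, not confirmed for this config).
ROI_CHANNELS = 256
ROI_SIZE = 7
FC_OUT_CHANNELS = 1024


def check_linear(input_shape, weight_shape):
    """Mimic the shape rule of F.linear(input, weight).

    weight has shape (out_features, in_features) and input has shape
    (batch, in_features). Returns the output shape, or raises ValueError
    when the inner dimensions disagree.
    """
    batch, in_dim = input_shape
    out_features, in_features = weight_shape
    if in_dim != in_features:
        raise ValueError(
            f"mat1 dim 1 ({in_dim}) must match mat2 dim 0 ({in_features})"
        )
    return (batch, out_features)


num_rois = 512  # hypothetical number of sampled RoIs
flat_dim = ROI_CHANNELS * ROI_SIZE * ROI_SIZE  # flattened RoI feature size
fc_reg_weight = (4, FC_OUT_CHANNELS)  # hypothetical regression layer weight

# Shared FC layers applied: features are already (num_rois, 1024).
ok_shape = check_linear((num_rois, FC_OUT_CHANNELS), fc_reg_weight)

# Shared FC layers skipped: flattened RoI features reach fc_reg directly.
try:
    check_linear((num_rois, flat_dim), fc_reg_weight)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

print(ok_shape)
print(mismatch)
```

Under these assumptions, the flattened RoI feature (256 * 7 * 7 = 12544 values) only fits `fc_reg` after the shared FC layers have reduced it to 1024, which would explain the error if those layers are skipped.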

dyabel commented 2 years ago

I have fixed it. Please check.

cailk commented 2 years ago

OK, I'll check it. By the way, I have already run prepare.sh, and it generated the 'lvis_clip_image_proposal_embedding' folder shown below. Is this layout correct? And will this folder be used if I only want to reproduce the ViLD results?

lvis_clip_image_proposal_embedding
├── train
│   └── train2017
└── val
    ├── train2017
    └── val2017

dyabel commented 2 years ago

1) Yes, the folder layout is correct. 2) Yes, the zip file of this folder will be used.
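For anyone who wants to confirm the layout before zipping, a small sketch like the following could check that the three subfolders from the tree above exist. `missing_subdirs` is a hypothetical helper written for this thread, not part of detpro, and it assumes the folder sits in the current working directory.

```python
from pathlib import Path

# Subfolders taken from the directory tree posted above.
EXPECTED_SUBDIRS = ("train/train2017", "val/train2017", "val/val2017")


def missing_subdirs(root):
    """Return the expected subfolders that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # Assumes the folder is in the current working directory.
    missing = missing_subdirs("lvis_clip_image_proposal_embedding")
    print("missing:", missing if missing else "none")
```

An empty result means all three subfolders are present. It does not verify the embedding files inside them.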