zhihou7 / BatchFormer

CVPR 2022, BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning, https://arxiv.org/abs/2203.01522

Error when adding BatchFormerV2 to other DETR-class models #26

Open cbn3 opened 10 months ago

cbn3 commented 10 months ago

Sorry to bother you. The error message is as follows. How can I resolve it?

    root@i-r5mjznu9:/workspace/cbn/DINO#
    /opt/conda/lib/python3.8/site-packages/torch/tensor.py:559: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:335.)
      return torch.floor_divide(self, other)
    Traceback (most recent call last):
      File "main.py", line 423, in <module>
        main(args)
      File "main.py", line 309, in main
        train_stats = train_one_epoch(
      File "/workspace/cbn/DINO/engine.py", line 48, in train_one_epoch
        outputs = model(samples, targets)
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 707, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/cbn/DINO/models/dino/dino.py", line 270, in forward
        hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(srcs, masks, input_query_bbox, poss, input_query_label, attn_mask)
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/cbn/DINO/models/dino/deformable_transformer.py", line 343, in forward
        output_memory, output_proposals = gen_encoder_output_proposals(memory, mask_flatten, spatial_shapes, input_hw)
      File "/workspace/cbn/DINO/models/dino/utils.py", line 31, in gen_encoder_output_proposals
        mask_flatten_ = memory_padding_mask[:, _cur:(_cur + H_ * W_)].view(N_, H_, W_, 1)
    RuntimeError: shape '[8, 92, 92, 1]' is invalid for input of size 33856

A second, otherwise identical warning and traceback ends with:

    RuntimeError: shape '[8, 139, 96, 1]' is invalid for input of size 53376

zhihou7 commented 10 months ago

Do you set the value of N to 8? It might be 4. (From the error, the padding mask has 33856 = 4 × 92 × 92 elements, so its batch dimension is still 4 while the view assumes 8.)
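
A minimal sketch of the mismatch, using the shapes from the traceback above (the tensors here are illustrative, not the repo's exact code):

    import torch

    # Shapes from the first traceback: the flattened padding mask holds
    # 4 * 92 * 92 = 33856 elements, so its batch dimension is 4, not 8.
    N_, H_, W_ = 4, 92, 92
    memory_padding_mask = torch.zeros(N_, H_ * W_, dtype=torch.bool)

    # Reproduces the error: shape '[8, 92, 92, 1]' is invalid for input of size 33856
    # memory_padding_mask.view(8, H_, W_, 1)

    # Deriving the batch size from the mask itself avoids hard-coding it:
    mask_flatten_ = memory_padding_mask.view(memory_padding_mask.shape[0], H_, W_, 1)
    print(mask_flatten_.shape)  # torch.Size([4, 92, 92, 1])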

cbn3 commented 10 months ago

> Do you set the value of N to 8? It might be 4.

Thank you for your prompt response. After trying your suggestion, the following error was reported. How can I resolve it?

    Traceback (most recent call last):
      File "main.py", line 424, in <module>
        main(args)
      File "main.py", line 233, in main
        model_without_ddp.load_state_dict(checkpoint['model'])
      File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for DINO:
        Missing key(s) in state_dict: "transformer.bf.0.self_attn.in_proj_weight", "transformer.bf.0.self_attn.in_proj_bias", "transformer.bf.0.self_attn.out_proj.weight", "transformer.bf.0.self_attn.out_proj.bias", "transformer.bf.0.linear1.weight", "transformer.bf.0.linear1.bias", "transformer.bf.0.linear2.weight", "transformer.bf.0.linear2.bias", "transformer.bf.0.norm1.weight", "transformer.bf.0.norm1.bias", "transformer.bf.0.norm2.weight", "transformer.bf.0.norm2.bias", "transformer.encoder.bf.0.self_attn.in_proj_weight", "transformer.encoder.bf.0.self_attn.in_proj_bias", "transformer.encoder.bf.0.self_attn.out_proj.weight", "transformer.encoder.bf.0.self_attn.out_proj.bias", "transformer.encoder.bf.0.linear1.weight", "transformer.encoder.bf.0.linear1.bias", "transformer.encoder.bf.0.linear2.weight", "transformer.encoder.bf.0.linear2.bias", "transformer.encoder.bf.0.norm1.weight", "transformer.encoder.bf.0.norm1.bias", "transformer.encoder.bf.0.norm2.weight", "transformer.encoder.bf.0.norm2.bias".
        size mismatch for transformer.encoder.layers.0.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.0.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.0.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.0.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.encoder.layers.1.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.1.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.1.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.1.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.encoder.layers.2.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.2.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.2.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.2.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.encoder.layers.3.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.3.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.3.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.3.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.encoder.layers.4.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.4.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.4.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.4.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.encoder.layers.5.self_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.encoder.layers.5.self_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.encoder.layers.5.self_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.encoder.layers.5.self_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.0.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.0.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.0.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.0.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.1.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.1.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.1.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.1.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.2.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.2.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.2.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.2.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.3.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.3.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.3.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.3.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.4.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.4.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.4.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.4.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
        size mismatch for transformer.decoder.layers.5.cross_attn.sampling_offsets.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([128, 256]).
        size mismatch for transformer.decoder.layers.5.cross_attn.sampling_offsets.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([128]).
        size mismatch for transformer.decoder.layers.5.cross_attn.attention_weights.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 256]).
        size mismatch for transformer.decoder.layers.5.cross_attn.attention_weights.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).

(The second process prints an identical traceback.)
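
For reference: the missing keys are expected when the new BatchFormer (bf.*) modules are added on top of a pretrained DINO checkpoint, while the sampling_offsets/attention_weights mismatches suggest the deformable-attention configuration (e.g. the number of heads or sampling points) no longer matches the checkpoint. A common workaround is to load only the parameters whose shapes still match; a hedged sketch (not the repo's code):

    import torch

    def load_matching(model, ckpt_path):
        # Keep only checkpoint entries whose shapes match the current model, so
        # the new bf.* modules (and any mismatched layers) keep their fresh init.
        checkpoint = torch.load(ckpt_path, map_location='cpu')
        model_state = model.state_dict()
        matched = {k: v for k, v in checkpoint['model'].items()
                   if k in model_state and v.shape == model_state[k].shape}
        model_state.update(matched)
        model.load_state_dict(model_state)
        print(f'loaded {len(matched)}/{len(model_state)} tensors from the checkpoint')

Note that this only hides the shape mismatches; if the checkpoint was trained with a different deformable-attention setup, aligning the model config is the real fix.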

zhihou7 commented 10 months ago

The N in this line:

    mask_flatten_ = memory_padding_mask[:, _cur:(_cur + H_ * W_)].view(N_, H_, W_, 1)
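
For context, BatchFormerV2 keeps both the original and the batch-transformed stream during training, so the feature batch can be twice the batch of the padding mask unless the mask (and targets) are duplicated as well. A rough sketch of the idea, simplified from the paper's pseudocode (the module here is an illustrative stand-in, not the repo's implementation):

    import torch
    import torch.nn as nn

    # Illustrative stand-in for the inserted BatchFormer block.
    bf = nn.TransformerEncoderLayer(d_model=256, nhead=8)

    def batchformer_v2(x, training=True):
        # x: (sequence_length, batch, channels), as in DETR-style transformers
        if not training:
            return x
        orig_x = x
        # Attend across the batch dimension at each sequence position.
        x = bf(x.transpose(0, 1)).transpose(0, 1)
        # Keep both streams: the batch dimension doubles (e.g. 4 -> 8), which is
        # why a hard-coded N of 8 can disagree with a mask whose batch is still 4.
        return torch.cat([orig_x, x], dim=1)

    feats = torch.randn(100, 4, 256)
    print(batchformer_v2(feats).shape)  # torch.Size([100, 8, 256])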
cbn3 commented 10 months ago

> The N in this line:
>
>     mask_flatten_ = memory_padding_mask[:, _cur:(_cur + H_ * W_)].view(N_, H_, W_, 1)

[Screenshot 2023-09-07 16:42:01]

Sorry to bother you again. When I give the insert_idx argument a default of [] and run main.py without setting insert_idx to 0, the original error is no longer reported. But in that case, has BatchFormerV2 actually been added to the model?
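
One way to verify, as a hedged sketch (the bf naming follows the missing keys printed earlier in this thread, and model_without_ddp is the model object from the traceback):

    # If insert_idx stays [], no BatchFormer block is built, so this list should
    # be empty; with insert_idx=[0] it should list transformer.bf.0 and friends.
    bf_modules = [name for name, _ in model_without_ddp.named_modules()
                  if name.endswith('.bf') or '.bf.' in name]
    print(bf_modules or 'no BatchFormer modules registered')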

zhihou7 commented 7 months ago

Hi, sorry for the late reply. According to the code, for DETR I set the default value here.
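
The exact line is not quoted in this thread; as a purely hypothetical sketch, the argument under discussion presumably looks something like the following, where an empty default means no BatchFormerV2 block is inserted:

    import argparse

    parser = argparse.ArgumentParser()
    # Hypothetical signature only: the real default is set in the repo's DETR code.
    parser.add_argument('--insert_idx', nargs='+', type=int, default=[],
                        help='encoder layer indices at which BatchFormerV2 is inserted')

    args = parser.parse_args([])  # with default=[], nothing is inserted
    print(args.insert_idx)        # []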