I tried to run DenseCL pretraining. The original paper says that "num_grid" in "DenseCLNeck" can be set to 1, 3, 5, or 7.
However, when I ran the command:
"
python tools/train.py configs/selfsup/densecl/densecl_resnet50_8xb32-coslr-200e_in1k.py
"
after modifying the neck in densecl.py as below:
"
neck=dict(
type='DenseCLNeck',
in_channels=2048,
hid_channels=2048,
out_channels=128,
num_grid=4),
"
it raised the following error:
"
File "/home/ls/mmselfsupv1/mmselfsup/models/algorithms/base.py", line 127, in forward
return self.loss(inputs, data_samples)
File "/home/ls/mmselfsupv1/mmselfsup/models/algorithms/densecl.py", line 199, in loss
densecl_sim_q = (q_grid * indexed_k_grid).sum(1) # NxS^2
RuntimeError: The size of tensor a (16) must match the size of tensor b (49) at non-singleton dimension 2
"
I then found that the file "mmselfsup/models/algorithms/densecl.py" contains this line:
"backbone_sim_matrix = torch.matmul(q_b.permute(0,2,1), k_b)"
The original paper says: "The F1 and F2 are first downsampled to have the spatial shape of S × S by an adaptive average pooling, and then used to calculate the cosine similarity matrix." So perhaps the code should not use the original q_b and k_b to calculate "backbone_sim_matrix".
They should first be "downsampled to have the spatial shape of S × S". With num_grid=4, the neck produces 4 × 4 = 16 grid features, while a 7 × 7 backbone feature map has 49 locations, which would explain the 16 vs. 49 mismatch in the error.
I am not sure this reading is correct, and I apologize if it is not. If the code is right, how can I set the grid size to a different number without triggering the error above?
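To make the suggestion concrete, here is a minimal sketch of what I mean by pooling the backbone features to S × S before computing the similarity matrix. It is not the MMSelfSup implementation: the function name `pooled_backbone_match` is hypothetical, the 7 × 7 input shape assumes a ResNet-50 backbone on 224 × 224 images, and the neck outputs are replaced by random tensors.

```python
import torch
import torch.nn.functional as F


def pooled_backbone_match(q_feat, k_feat, num_grid):
    """Match each query grid cell to its most similar key grid cell.

    Hypothetical helper, not part of MMSelfSup.
    q_feat, k_feat: backbone feature maps of shape (N, C, H, W).
    Returns indices of shape (N, num_grid ** 2).
    """
    # Downsample both maps to S x S so they line up with the neck's grid output.
    q_b = F.adaptive_avg_pool2d(q_feat, num_grid).flatten(2)  # (N, C, S^2)
    k_b = F.adaptive_avg_pool2d(k_feat, num_grid).flatten(2)  # (N, C, S^2)
    # L2-normalize over channels so the matmul yields cosine similarity.
    q_b = F.normalize(q_b, dim=1)
    k_b = F.normalize(k_b, dim=1)
    backbone_sim_matrix = torch.matmul(q_b.permute(0, 2, 1), k_b)  # (N, S^2, S^2)
    return backbone_sim_matrix.max(dim=2)[1]


N, C, S = 2, 2048, 4
q_feat = torch.randn(N, C, 7, 7)  # assumed backbone output shape
k_feat = torch.randn(N, C, 7, 7)
match_idx = pooled_backbone_match(q_feat, k_feat, S)

# Random stand-ins for the neck's dense outputs with out_channels=128.
q_grid = torch.randn(N, 128, S * S)
k_grid = torch.randn(N, 128, S * S)
indexed_k_grid = torch.gather(
    k_grid, 2, match_idx.unsqueeze(1).expand(-1, 128, -1))
densecl_sim_q = (q_grid * indexed_k_grid).sum(1)  # (N, S^2), no size mismatch
print(densecl_sim_q.shape)
```

With the pooled features, the match indices range over S^2 positions, so the gathered key grid has the same last dimension as "q_grid" for any "num_grid".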
Branch
1.x branch (1.x version, such as v1.0.0rc2, or dev-1.x branch)
Prerequisite
Environment
TorchVision: 0.6.1+cu111
OpenCV: 4.6.0
MMEngine: 0.5.0
MMCV: 2.0.0rc2
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMSelfSup: 1.0.0rc6+48aba52
Thanks for your help.
Reproduces the problem - code sample
No response
Reproduces the problem - command or script
No response
Reproduces the problem - error message
No response
Additional information
No response