Open Anthony0106 opened 1 week ago
I find that the test images must have the same size as the training size, e.g. 640*640. I think this is because a tensor in the forward function is used as a Python bool, so when the model is traced that decision stays fixed at the value it had with the training settings, right? And here's another question: can WTConv adapt to different input sizes, and how can I achieve that? Looking forward to your reply!
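A minimal sketch of what I suspect is happening (my own illustration, not code from WTConv or YOLO): `torch.jit.trace` records only the branch taken for the example input, so a shape-dependent Python bool is frozen at the traced size.

```python
import torch
import torch.nn.functional as F

class OddPad(torch.nn.Module):
    def forward(self, x):
        # Shape-dependent Python branch: the tracer records only the branch
        # taken for the example input, so this decision is frozen at trace time.
        if x.shape[-1] % 2 == 1:
            x = F.pad(x, (0, 1))  # pad the width to make it even
        return F.avg_pool2d(x, 2)

m = OddPad()
traced = torch.jit.trace(m, torch.randn(1, 3, 640, 640))  # even width: no pad is recorded
out = traced(torch.randn(1, 3, 640, 641))                  # odd width at test time
print(out.shape)  # the pad branch never runs in the traced graph
```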
Hi @Anthony0106, using different image sizes is possible; I have done so before and had no issues.
The only way I can think of image size being an issue is if you use too many levels for the input resolution. For example, say I have an input tensor of size [1,32,8,8], i.e. a spatial resolution of 8x8. If I use 2 levels of wavelet decomposition, the highest-level convolution operates on a 2x2 spatial resolution. With 3 levels it operates on a 1x1 area (which is meaningless for a spatial convolution), and with 4 levels problems can occur.
Other than that, I need more information to help solve the issue.
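As a rough rule of thumb, here is a small sketch (my own, not from the repo) for estimating how many wavelet levels a given feature-map resolution can support, assuming each level halves the spatial size and the deepest level should stay at least 2x2:

```python
def max_wt_levels(h, w, min_size=2):
    """Largest wt_levels such that the deepest level is still at least min_size x min_size."""
    levels = 0
    while min(h, w) // 2 >= min_size:
        h, w = (h + 1) // 2, (w + 1) // 2
        levels += 1
    return levels

print(max_wt_levels(8, 8))      # 2 -> the deepest level sees a 2x2 map
print(max_wt_levels(640, 640))  # far more headroom at training resolution
```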
```yaml
# parameters
nc: 8  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple

anchors:

backbone:
  [[-1, 1, Conv, [32, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 0-P1/2
   [-1, 1, Conv, [64, 3, 2, None, 1, nn.LeakyReLU(0.1)]],  # 1-P2/4

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, WTConv2d, [32, 3, 1]],
   [-1, 1, WTConv2d, [32, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 7

   [-1, 1, MP, []],  # 8-P3/8 80*80
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, WTConv2d, [64, 3, 1]],
   [-1, 1, WTConv2d, [64, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 14 80*80*128

   [-1, 1, MP, []],  # 15-P4/16 40*40
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, WTConv2d, [128, 3, 1]],
   [-1, 1, WTConv2d, [128, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 21

   [-1, 1, MP, []],  # 22-P5/32 20*20
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, WTConv2d, [256, 3, 1]],
   [-1, 1, WTConv2d, [256, 3, 1]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 28
  ]

head:
  [[-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 20*20*256
   [-2, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, SP, [5]],
   [-2, 1, SP, [9]],
   [-3, 1, SP, [13]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -7], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 37 ->

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [21, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P4 40*40*256 (modified)
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 47 ->

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [14, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # route backbone P3 80*80*128 (modified)
   [[-1, -2], 1, Concat, [1]],

   [-1, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [32, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [32, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 57 ->

   [-1, 1, Conv, [128, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 47], 1, Concat, [1]],

   [-1, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [64, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [64, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 65 ->

   [-1, 1, Conv, [256, 3, 2, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, 37], 1, Concat, [1]],  # 37 changed to:

   [-1, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-2, 1, Conv, [128, 1, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [-1, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[-1, -2, -3, -4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1, None, 1, nn.LeakyReLU(0.1)]],  # 73 ->

   [57, 1, Conv, [128, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [65, 1, Conv, [256, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [73, 1, Conv, [512, 3, 1, None, 1, nn.LeakyReLU(0.1)]],
   [[74, 75, 76], 1, IDetect, [nc, anchors]]  # Detect(P3, P4, P5)
  ]
```

Hi professor, this is my YAML config adapted to the YOLO-tiny model. I just replaced those CBS blocks with WTConv2d where the input and output channels are supposed to be the same. I set the parameters the same as in the original YOLO-tiny, with kernel_size=3 and stride=1; I did nothing with the wt_levels parameter, so it should be 1. The model trains smoothly (though with no better performance than the original YOLO-tiny), but it fails in test.py with the error in this issue's title. The above shows where WTConv is added to the model; the two figures above are the error information when I try to run test.py. I notice there is a warning in the definition of forward() and I wonder whether it affects the unpredictable result.
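For reference, a minimal sketch of what the replacement amounts to at the module level; the import path and the exact WTConv2d signature here are assumptions on my side, not taken from this thread:

```python
import torch
from wtconv import WTConv2d  # import path assumed; adjust to wherever WTConv2d sits in your YOLO code

# WTConv2d acts as a depthwise-style block, so input and output channels must match,
# which is why only the equal-channel CBS blocks in the YAML were replaced.
layer = WTConv2d(32, 32, kernel_size=3, stride=1, wt_levels=1)  # mirrors the [32, 3, 1] args in the YAML
x = torch.randn(1, 32, 160, 160)  # e.g. the P2/4 feature map of a 640x640 input
print(layer(x).shape)             # expected to remain [1, 32, 160, 160]
```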
Thanks for your comment, and looking forward to your reply!
Unfortunately, I am not familiar enough with the YOLO codebase to give precise answers. However, since the model trains well, the warning you showed might be the key to the problem. Did you try converting the boolean tensor to float? Does the tensor have to be boolean?
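For what it's worth, a small sketch of the cast-and-restore idea (my reading of the suggestion, not tested against the YOLO code):

```python
import torch
import torch.nn.functional as F

mask = torch.zeros(1, 1, 20, 21, dtype=torch.bool)
mask_f = mask.float()                 # run the bool-unfriendly ops in floating point
mask_f = F.pad(mask_f, (0, 1, 0, 0))  # e.g. pad the odd width 21 -> 22
mask = mask_f.bool()                  # cast back only if a boolean tensor is required downstream
print(mask.shape, mask.dtype)         # torch.Size([1, 1, 20, 22]) torch.bool
```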
@shahaffind thanks for your answer. I tried converting the tensor back, but it doesn't work. I will try other methods in the future, and if anything works I will let you know.
From the error it seems like some operations don't work well when converting to (or from) Boolean. Another option is that the error comes from the padding: when either spatial dimension is odd (and therefore cannot be divided by 2), we add zero padding of size 1 to that dimension, and this padding operation might not work on a Boolean tensor.
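To make the padding remark concrete, a small sketch (my own illustration, not the exact WTConv code) of how an odd spatial dimension is padded to an even size, and why a 21-vs-20 style mismatch can appear if the padded size is never cropped back or meets an unpadded branch in a Concat:

```python
import torch
import torch.nn.functional as F

def pad_to_even(x):
    # If H or W is odd, add zero padding of size 1 so it is divisible by 2.
    pad_w = x.shape[-1] % 2
    pad_h = x.shape[-2] % 2
    return F.pad(x, (0, pad_w, 0, pad_h)), (pad_h, pad_w)

x = torch.randn(1, 256, 20, 21)        # odd width, e.g. from a non-640x640 test image
x_padded, (ph, pw) = pad_to_even(x)    # -> [1, 256, 20, 22]

# Cropping back restores the original size; skipping this step (or concatenating a
# padded branch with an unpadded one) produces errors of the form
# "size of tensor a (21) must match the size of tensor b (20)".
x_restored = x_padded[..., :x_padded.shape[-2] - ph, :x_padded.shape[-1] - pw]
print(x_restored.shape)                # back to [1, 256, 20, 21]
```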
Dear professor, I'm sorry to bother you, but I have a problem when using your WTConv2d to replace some basic Conv structures in YOLOv7. The problem is: when I replace the particular layers with WTConv2d, I can train on the COCO dataset normally, but when I run test.py to see the training results, the program raises an error saying the size of tensor a (21) must match the size of tensor b (20) at non-singleton dimension 3. Looking forward to your reply. The structure of my network and the error information are as below: