CaptainEven / MCMOT

Real time one-stage multi-class & multi-object tracking based on anchor-free detection and ReID
MIT License

2 out of 8 backbones throw size errors #47

Closed · austinmw closed this issue 3 years ago

austinmw commented 3 years ago

Hi, apologies for creating multiple issues, but really liking this model!

At the default resolution of 1280x736 I'm able to train the other six backbones without error. However, resdcn_50 and hrnet_32 both throw sizing errors:

resdcn_50:

RuntimeError: The size of tensor a (216) must match the size of tensor b (215) at non-singleton dimension 3

hrnet_32:

RuntimeError: Given transposed=1, weight of size 36 36 2 2, expected input[4, 64, 68, 120] to have 36 channels, but got 64 channels instead

Do you happen to recognize what the problem might be for either of these? I'm not sure whether it's simply a disallowed input size or whether network modifications are required.


Also, I'm curious whether you have a backbone recommendation for detecting very small objects (areas of ~50 pixels) that appear in high counts and are tightly clustered together (tinyperson crowd counting)? And is there value in upsampling my 1280x720 videos to something like 1920x1088 to help small-object detection?

Really appreciate any advice you have time to offer!

CaptainEven commented 3 years ago

@austinmw I think a higher resolution will help detect small objects, but make sure the resolution is integer-divisible by 4, which is the preset down-sampling factor of the feature map used for ReID.
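
For reference, here is a minimal sketch of how one might round a target resolution to satisfy such a divisibility constraint. This is not from the MCMOT codebase: round_to_stride is a hypothetical helper, and the stride=32 default is an assumption based on the typical total down-sampling of ResNet/HRNet-style backbones (any size divisible by 32 is also divisible by the ReID stride of 4 mentioned above), not a value taken from this repo.

```python
# Hypothetical helper (not part of MCMOT): round a requested training
# resolution to the nearest size divisible by the network stride.
def round_to_stride(width: int, height: int, stride: int = 32) -> tuple[int, int]:
    """Round (width, height) to the nearest multiples of `stride`.

    The author's comment states the ReID feature map uses a down-sampling
    factor of 4; stride=32 is an assumed, more conservative choice that
    also covers the deeper down-sampling stages of typical backbones.
    """
    w = max(stride, round(width / stride) * stride)
    h = max(stride, round(height / stride) * stride)
    return w, h

if __name__ == "__main__":
    # 1280x736 already passes: 1280/32 = 40 and 736/32 = 23.
    print(round_to_stride(1280, 736))   # -> (1280, 736)
    # Naively upsampling 720p to 1920x1080 would break divisibility
    # (1080/32 = 33.75); rounding suggests 1920x1088 instead.
    print(round_to_stride(1920, 1080))  # -> (1920, 1088)
```

Under that assumption, the 1920x1088 target mentioned above is exactly the divisible rounding of 1080p, which may be why it reads as a safer choice than 1920x1080.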