ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
https://www.birefnet.top
MIT License

Can't train smaller models, bb = 0, 1, 2 #70

Open SolicTous opened 3 weeks ago

SolicTous commented 3 weeks ago

I tried setting the backbone to one of 'vgg16', 'vgg16bn', or 'resnet50' by changing self.bb to 0, 1, or 2, but I get RuntimeError: Given groups=1, weight of size [64, 3712, 3, 3], expected input[1, 1856, 128, 128] to have 3712 channels, but got 1856 channels instead. The error persists even when I change the input size from 1024 to 512, etc.

Setting self.bb to 3 or 6 works as normal.

ZhengPeng7 commented 3 weeks ago

Hi, I'm sorry, but only Swin and PVT are currently supported as backbones (ref). The channel numbers are not something we can define ourselves; they are determined by the backbone network, whether it's VGG16 or Swin Transformer.
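
To make the mismatch concrete, here is a minimal pure-Python sketch of the channel check that the error message describes. The shapes are copied from the RuntimeError in the report; the helper name conv_channel_mismatch is hypothetical and is not part of BiRefNet or PyTorch. It only illustrates why changing the input resolution does not help: the check involves the channel dimension, not height or width.

```python
# Shapes copied from the RuntimeError in the report above.
weight_shape = (64, 3712, 3, 3)    # (out_channels, in_channels / groups, kH, kW)
input_shape = (1, 1856, 128, 128)  # (batch, channels, height, width)
groups = 1


def conv_channel_mismatch(weight_shape, input_shape, groups=1):
    """Hypothetical helper: return (expected, got) channels if they differ, else None.

    Mirrors the consistency check a 2D convolution applies: the input must
    have weight_shape[1] * groups channels.
    """
    expected = weight_shape[1] * groups
    got = input_shape[1]
    return None if expected == got else (expected, got)


mismatch = conv_channel_mismatch(weight_shape, input_shape, groups)
if mismatch is not None:
    expected, got = mismatch
    print(f"layer expects {expected} channels, features have {got}")
    print(f"ratio expected/got = {expected / got}")

# Halving the spatial size (an assumed 64x64 feature map here) changes
# nothing, because only the channel dimension is compared.
smaller_input = (1, 1856, 64, 64)
print(conv_channel_mismatch(weight_shape, smaller_input, groups))
```

The expected count is exactly twice the received count. That may hint that the decoder was built for a feature map twice as wide as the one these backbones produce, but this is only a guess from the numbers; the thread does not confirm the cause.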