mberkay0 / pretrained-backbones-unet

A PyTorch-based Python library with UNet architecture and multiple backbones for Image Semantic Segmentation.
MIT License

Unable to train on small patches (64x48) #4

Open · bc-bytes opened this issue 1 year ago

bc-bytes commented 1 year ago

I am trying to train with the efficientnet_b0 backbone on images of size 64x48, but I get the following error:

  File "/home/seg/bbunet/train.py", line 85, in <module>
    trainer.fit(train_loader, val_loader)
  File "/home/seg/bbunet/backbones_unet/utils/trainer.py", line 82, in fit
    self._train_one_epoch(train_loader, epoch)
  File "/home/seg/bbunet/backbones_unet/utils/trainer.py", line 111, in _train_one_epoch
    preds = self.model(images)
  File "/home/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/seg/bbunet/backbones_unet/model/unet.py", line 174, in forward
    x = self.decoder(x)
  File "/home/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/seg/bbunet/backbones_unet/model/unet.py", line 334, in forward
    x = b(x, skip)
  File "/home/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/seg/bbunet/backbones_unet/model/unet.py", line 276, in forward
    x = torch.cat([x, skip], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 4 but got size 3 for tensor number 1 in the list.

I tried with and without a pretrained model but got the same error each time.

mberkay0 commented 1 year ago

Greetings @bc-bytes, the image size you passed (64x48) is quite small for the encoder. Please consider experimenting with input dimensions of 224x224 or 256x256. Additionally, make sure that your data follows the [batch_size, channels, height, width] format.
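For context, the failure is likely a divisibility issue rather than raw size alone: the encoder halves the spatial resolution five times (a total factor of 32), and 48 is not a multiple of 32, so an upsampled decoder feature (size 4) no longer lines up with its skip connection (size 3) at the torch.cat call in the traceback. If resizing is undesirable, one possible workaround is to pad each batch up to the next multiple of 32 before the forward pass and crop the prediction back afterwards. A minimal sketch (the tensor shapes and padding scheme are illustrative assumptions, not part of this library's API):

```python
import torch
import torch.nn.functional as F

# Hypothetical 64x48 batch in [batch_size, channels, height, width] order.
images = torch.randn(8, 3, 48, 64)

# Pad height and width up to the next multiple of 32 (the encoder's total stride).
pad_h = (32 - images.shape[-2] % 32) % 32  # 48 -> pad by 16
pad_w = (32 - images.shape[-1] % 32) % 32  # 64 -> already a multiple, pad 0
padded = F.pad(images, (0, pad_w, 0, pad_h))  # (left, right, top, bottom)

print(padded.shape)  # torch.Size([8, 3, 64, 64])
# After model(padded), crop the logits back to the original size: preds[..., :48, :64]
```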

bc-bytes commented 1 year ago

I need to train the models on patches extracted from full images, and these patches are 64x48 pixels.

mberkay0 commented 1 year ago

Sorry! To support those dimensions you will need to change the architecture of the network. 64x48 is too small for the pretrained encoders, which halve the spatial resolution five times. Segmenting patches this small is better served by a simpler convolutional network with fewer downsampling stages: UNet-style, but significantly shallower. You will need to build such a model yourself for this task; a minimal sketch follows.
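Here is one possible sketch of such a shallow UNet (the three-level depth and layer widths are illustrative assumptions, not code from this library). With three pooling stages the total downsampling factor is 8, which divides both 64 and 48 evenly, so every skip connection aligns:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with BatchNorm and ReLU, as in the classic UNet.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class SmallUnet(nn.Module):
    # Hypothetical 3-level UNet: downsampling factor 8, so any input whose
    # height and width are multiples of 8 works (48 and 64 both qualify).
    def __init__(self, in_channels=3, num_classes=1):
        super().__init__()
        self.enc1 = conv_block(in_channels, 32)
        self.enc2 = conv_block(32, 64)
        self.enc3 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(128, 256)
        self.up3 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec3 = conv_block(256, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                   # 48x64
        e2 = self.enc2(self.pool(e1))       # 24x32
        e3 = self.enc3(self.pool(e2))       # 12x16
        b = self.bottleneck(self.pool(e3))  # 6x8
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)

model = SmallUnet(in_channels=3, num_classes=1)
out = model(torch.randn(2, 3, 48, 64))  # [batch, channels, height, width]
print(out.shape)                        # torch.Size([2, 1, 48, 64])
```

A model like this trains from scratch, so you lose the pretrained backbone, but at 64x48 the receptive field of a shallow network is usually sufficient.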