IanTaehoonYoo / semantic-segmentation-pytorch

PyTorch implementation of FCN, UNet, PSPNet, and various encoder models.
MIT License
86 stars, 20 forks

Scaling error when training on images with different aspect ratios and batch_size > 1 #8

Open leocd91 opened 4 years ago

leocd91 commented 4 years ago

Error in Trainer.train():

RuntimeError: stack expects each tensor to be equal size, but got [3, 200, 257] at entry 0 and [3, 200, 341] at entry 2

This happens when the training set contains images with different aspect ratios (for example, 4:3 mixed with 16:9).

Running with batch_size = 1 is fine, but any larger batch size triggers this error.

Sorry for troubling you, haha. It's not really a bug, I guess. I can work around it by resizing the training images myself; I'm just letting you know in case you want to fix it.
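For anyone hitting the same error, the workaround mentioned above (resizing every sample to one fixed size before batching) can be sketched in plain PyTorch. This is only an illustration, not the repository's own rescale code: `resize_pair` and `collate_fixed_size` are hypothetical names, and the sketch assumes each sample is an `(image, mask)` pair with the image shaped `[C, H, W]` and the mask shaped `[H, W]`.

```python
import torch
import torch.nn.functional as F


def resize_pair(image, mask, size=(200, 200)):
    """Resize one (image, mask) pair to a fixed (height, width).

    image: float tensor of shape [C, H, W]
    mask:  integer tensor of shape [H, W] holding class indices
    """
    # Bilinear interpolation is fine for the image itself.
    image = F.interpolate(image.unsqueeze(0), size=size,
                          mode="bilinear", align_corners=False).squeeze(0)
    # Nearest-neighbour keeps every mask value a valid class index.
    mask = F.interpolate(mask[None, None].float(), size=size,
                         mode="nearest").squeeze(0).squeeze(0).long()
    return image, mask


def collate_fixed_size(batch, size=(200, 200)):
    """Hypothetical collate function: resize every sample, then stack."""
    images, masks = zip(*(resize_pair(img, m, size) for img, m in batch))
    return torch.stack(images), torch.stack(masks)


# Two samples with the image shapes from the error message above.
batch = [
    (torch.rand(3, 200, 257), torch.randint(0, 2, (200, 257))),
    (torch.rand(3, 200, 341), torch.randint(0, 2, (200, 341))),
]
images, masks = collate_fixed_size(batch)
print(images.shape, masks.shape)
```

If you build your own `torch.utils.data.DataLoader`, a function like this can be passed as its `collate_fn` argument. Note that forcing a fixed size distorts the aspect ratio; padding to a common size is an alternative if that matters for your data.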

IanTaehoonYoo commented 4 years ago

Oh... I guess the rescale function is not working perfectly... I will fix it. Thank you for letting me know.

By the way, you used an image size of (200, 200) with PSPNet. I don't know your dataset, but PSPNet needs a larger input size than the other networks because of its pyramid structure. With a small input, there is some chance that small features will vanish. If you haven't checked this yet, test it :)
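A rough back-of-the-envelope sketch of why the input size matters here. It assumes the encoder downsamples by a factor of 8 and that the pyramid pooling uses 1, 2, 3, and 6 bins per side, as in the PSPNet paper; the encoders in this repository may use a different stride, so treat the numbers as illustrative only.

```python
import math

OUTPUT_STRIDE = 8          # assumption: encoder downsamples by 8
POOL_BINS = (1, 2, 3, 6)   # bins per side, as in the PSPNet paper


def feature_map_side(input_side, stride=OUTPUT_STRIDE):
    """Approximate side length of the encoder output for a square input."""
    return math.ceil(input_side / stride)


def cells_per_bin(input_side, bins=POOL_BINS, stride=OUTPUT_STRIDE):
    """Feature-map cells (per side) that are averaged into each pyramid bin."""
    side = feature_map_side(input_side, stride)
    return {b: side / b for b in bins}


for side in (200, 500):
    print(side, feature_map_side(side), cells_per_bin(side))
```

Under these assumptions a 200x200 input leaves the pyramid module only about 25x25 cells to work with, compared with about 63x63 for a 500x500 input, so thin structures have far fewer cells to survive in.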

leocd91 commented 4 years ago

> By the way, you used image size(200, 200) using PSPnet. I don't know your dataset, but PSPnet needs a bigger size than other networks because the network has a pyramid structure.

Ah, thank you! I guess that's why it's not working well in some of my cases with small labeled regions (e.g., detecting cracks), while doing well in cases with large labeled areas (roads, houses, etc.). I was just using the default parameter (scale width to 200). How about FCN and UNet? Do you have any other tips?

IanTaehoonYoo commented 4 years ago

Well, FCN is a simple network and requires fewer GPU resources than the other networks. You can use it when your hardware is limited, for example on a mobile phone. UNet focuses on finding small features, so it is often used for medical images such as cell images. So, here are my tips.

  1. If you want to label both large and small areas, consider using PSPNet (I recommend an input size above 500x500).
  2. If you want to label small areas and fine detail, consider using UNet.
  3. If you want a lightweight model, consider using FCN, but its accuracy is not high.
  4. The input size is important. Think about the smallest area you want to label, then try different sizes and test the training.
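As a starting point for tip 4, here is a minimal sketch that estimates how large the resized input needs to be for the smallest labeled object to still cover a given number of feature-map cells. The function name, the stride of 8, and the example numbers are all assumptions for illustration; the estimate only narrows down which sizes to try and does not replace test training.

```python
import math


def min_input_side(original_side, smallest_object_px,
                   min_cells=1.0, output_stride=8):
    """Smallest resized side so the object still spans `min_cells` cells.

    original_side:      side length of the original image in pixels
    smallest_object_px: width of the smallest labeled object in the original
    output_stride:      assumed downsampling factor of the encoder
    """
    # Pixels the object must cover after resizing.
    required_object_px = min_cells * output_stride
    return math.ceil(original_side * required_object_px / smallest_object_px)


# Hypothetical numbers: 1000 px wide photos, cracks about 10 px wide.
print(min_input_side(1000, 10))
print(min_input_side(1000, 10, min_cells=0.5))
```

If the result is larger than the original image, upsampling will not add real detail, so a model that keeps more resolution (see tip 2) may be the better choice.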

I hope it can help you.

Thanks.