qianyu-dlut / MVANet

MIT License

Arbitrary input size and einops error #4

Closed andodet closed 1 month ago

andodet commented 2 months ago

First of all thanks a lot for pushing this repository :raised_hands:.

I am having trouble processing inputs of arbitrary size: when processing an image of size [1, 3, 864, 1280], the model throws the following error:

```
einops.EinopsError:  Error while processing rearrange-reduction pattern "b c (hg h) (wg w) -> (hg wg b) c h w".
 Input tensor shape: torch.Size([1, 128, 27, 40]). Additional info: {'hg': 2, 'wg': 2}.
 Shape mismatch, can't divide axis of length 27 in chunks of 2
```

This seems to be caused by this line: https://github.com/qianyu-dlut/MVANet/blob/ff270a6682c9b5bf3ff73c588b0ef0291de49fed/model/MVANet.py#L319
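The shape math checks out: at that stage the 864x1280 input has been downsampled 32x to a 27x40 feature map, and 27 can't be split into a 2x2 grid of patches. A small standalone reproduction of the shape check (just an illustration, not code from the repo):

```python
import torch
from einops import rearrange

# 864x1280 input downsampled 32x -> 27x40 feature map: 27 is odd,
# so splitting the height into hg=2 groups fails.
feat = torch.zeros(1, 128, 27, 40)
try:
    rearrange(feat, 'b c (hg h) (wg w) -> (hg wg b) c h w', hg=2, wg=2)
except Exception as e:
    print(e)  # Shape mismatch, can't divide axis of length 27 in chunks of 2

# 1024x1024 input downsampled 32x -> 32x32: both dims divide evenly into 2x2.
feat = torch.zeros(1, 128, 32, 32)
out = rearrange(feat, 'b c (hg h) (wg w) -> (hg wg b) c h w', hg=2, wg=2)
print(out.shape)  # torch.Size([4, 128, 16, 16])
```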

I've noticed that in predict.py all inputs are resized to 1024x1024, I assume exactly for this reason. Is resizing inputs to a standard size the correct strategy here?

qianyu-dlut commented 1 month ago

Hi! Yes, we trained the model with an input size of 1024x1024 and tested it with the same dimensions. If you use an arbitrary input size, it's possible to get odd sizes in deeper layers of the network, which could cause errors when dividing into patches.
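For reference, a minimal pre-/post-resize wrapper in the spirit of predict.py would look something like this (untested sketch; the assumption that the model returns a single [B, 1, H, W] mask at inference may need adapting to your setup):

```python
import torch
import torch.nn.functional as F

def predict_fixed_size(model, image, size=1024):
    # image: [B, 3, H, W] tensor of arbitrary spatial size
    _, _, h, w = image.shape
    # resize to the resolution the model was trained/tested with
    x = F.interpolate(image, size=(size, size), mode='bilinear', align_corners=False)
    with torch.no_grad():
        pred = model(x)  # assumed to return a [B, 1, size, size] mask
    # resize the prediction back to the original resolution
    return F.interpolate(pred, size=(h, w), mode='bilinear', align_corners=False)
```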

andodet commented 1 month ago

Thanks a lot for the reply, much appreciated :ok_hand:!