qianyu-dlut / MVANet

MIT License

Arbitrary input size and einops error #4

Closed andodet closed 1 month ago

andodet commented 2 months ago

First of all thanks a lot for pushing this repository :raised_hands:.

I am having trouble processing inputs of arbitrary size: when processing an image of size [1, 3, 864, 1280], the model throws the following error:

```
einops.EinopsError:  Error while processing rearrange-reduction pattern "b c (hg h) (wg w) -> (hg wg b) c h w".
 Input tensor shape: torch.Size([1, 128, 27, 40]). Additional info: {'hg': 2, 'wg': 2}.
 Shape mismatch, can't divide axis of length 27 in chunks of 2
```

This seems to be caused by this line: https://github.com/qianyu-dlut/MVANet/blob/ff270a6682c9b5bf3ff73c588b0ef0291de49fed/model/MVANet.py#L319
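The shape math checks out: at that stage the 864x1280 input has been downsampled 32x to a 27x40 feature map, and 27 can't be split into a 2x2 grid of patches. A small standalone reproduction of the shape check (just an illustration, not code from the repo):

```python
import torch
from einops import rearrange

# 864x1280 input downsampled 32x -> 27x40 feature map: 27 is odd,
# so splitting the height into hg=2 groups fails.
feat = torch.zeros(1, 128, 27, 40)
try:
    rearrange(feat, 'b c (hg h) (wg w) -> (hg wg b) c h w', hg=2, wg=2)
except Exception as e:
    print(e)  # Shape mismatch, can't divide axis of length 27 in chunks of 2

# 1024x1024 input downsampled 32x -> 32x32: both dims divide evenly into 2x2.
feat = torch.zeros(1, 128, 32, 32)
out = rearrange(feat, 'b c (hg h) (wg w) -> (hg wg b) c h w', hg=2, wg=2)
print(out.shape)  # torch.Size([4, 128, 16, 16])
```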

I've noticed that in predict.py all inputs are resized to 1024x1024, I assume exactly for this reason. Is resizing inputs to a standard size the correct strategy here?

qianyu-dlut commented 1 month ago

Hi! Yes, we trained the model with an input size of 1024x1024 and tested it with the same dimensions. If you use an arbitrary input size, it's possible to get odd sizes in deeper layers of the network, which could cause errors when dividing into patches.
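For reference, a minimal pre-/post-resize wrapper in the spirit of predict.py would look something like this (untested sketch; the assumption that the model returns a single [B, 1, H, W] mask at inference may need adapting to your setup):

```python
import torch
import torch.nn.functional as F

def predict_fixed_size(model, image, size=1024):
    # image: [B, 3, H, W] tensor of arbitrary spatial size
    _, _, h, w = image.shape
    # resize to the resolution the model was trained/tested with
    x = F.interpolate(image, size=(size, size), mode='bilinear', align_corners=False)
    with torch.no_grad():
        pred = model(x)  # assumed to return a [B, 1, size, size] mask
    # resize the prediction back to the original resolution
    return F.interpolate(pred, size=(h, w), mode='bilinear', align_corners=False)
```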

andodet commented 1 month ago

Thanks a lot for the reply, much appreciated :ok_hand:!