IBM / terratorch

a Python toolkit for fine-tuning Geospatial Foundation Models (GFMs).
Apache License 2.0

Prithvi ViT behaviour when input size not divisible by patch size #172

Open CarlosGomes98 opened 1 month ago

CarlosGomes98 commented 1 month ago

Describe the issue

With prithvi_vit, when the input spatial dimensions are not divisible by the patch size, part of the input is ignored.

To Reproduce (optional, but appreciated)

Steps to reproduce the behavior:

  1. Create a prithvi vit model
  2. Pass to it an input of size not divisible by the patch size
  3. No error is thrown
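To make "part of the input is ignored" concrete, here is a small arithmetic sketch. It assumes the patch embedding keeps only whole patches (as a strided convolution with kernel size equal to stride and no padding would), which this issue does not confirm. The helper `patch_coverage` is hypothetical, and the 220 x 230 input with patch size 16 is taken from the traceback later in this thread.

```python
def patch_coverage(height: int, width: int, patch_size: int) -> dict:
    """Report how much of an image is covered when only whole patches are kept.

    Assumption: the patch embedding floors the number of patches per axis,
    so any remainder on the bottom/right edge never reaches the model.
    """
    n_h, n_w = height // patch_size, width // patch_size
    return {
        "patches": (n_h, n_w),
        "covered": (n_h * patch_size, n_w * patch_size),
        "ignored": (height - n_h * patch_size, width - n_w * patch_size),
    }


coverage = patch_coverage(220, 230, 16)
print(coverage)
```

Under that assumption, 12 rows and 6 columns of the input would be dropped without any warning.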

Expected behavior (optional)

Either we should pad the input to a size divisible by the patch size, or throw an error.
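The two options could look roughly like the sketch below. It is not terratorch code: `padding_for` and `check_divisible` are hypothetical helper names, and padding on the bottom/right edge only is an assumption.

```python
def padding_for(height: int, width: int, patch_size: int) -> tuple[int, int]:
    """Option 1: extra rows/columns needed to reach the next multiple of patch_size."""
    return (-height) % patch_size, (-width) % patch_size


def check_divisible(height: int, width: int, patch_size: int) -> None:
    """Option 2: fail loudly instead of silently ignoring part of the input."""
    if height % patch_size or width % patch_size:
        raise ValueError(
            f"Input size ({height}, {width}) is not divisible by "
            f"patch size {patch_size}."
        )


pad_h, pad_w = padding_for(220, 230, 16)
print(pad_h, pad_w)  # padded size would be (220 + pad_h, 230 + pad_w)

try:
    check_divisible(220, 230, 16)
except ValueError as err:
    print(err)
```

With a PyTorch tensor `x` whose last two dimensions are height and width, the padding could then be applied with `torch.nn.functional.pad(x, (0, pad_w, 0, pad_h))`, which pads the last dimension first. Whether zero padding is appropriate for the model's inputs is a separate question.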

Joao-L-S-Almeida commented 1 week ago

That's strange. When running the test tests/test_backbones.py::test_vit_models_non_divisible_input (from the branch associated with this issue), I got:

>           raise EinopsError(message + "\n {}".format(e))
E           einops.EinopsError:  Error while processing rearrange-reduction pattern "b c (t tub) (h p) (w q) -> b (t h w) (tub p q c)".
E            Input tensor shape: torch.Size([1, 6, 4, 220, 230]). Additional info: {'tub': 1, 'p': 16, 'q': 16}.
E            Shape mismatch, can't divide axis of length 220 in chunks of 16

Isn't that the expected behaviour?