Lydorn / mapalignment

Aligning and Updating Cadaster Maps with Remote Sensing Images

Prediction with small tiles #8

Closed: cthnguyen closed this issue 4 years ago

cthnguyen commented 4 years ago

Hello,

I am trying to run predictions with main.py from your GitHub repository. I noticed that it works great with a big GeoTIFF image (for example a 4000*4000 image), but once the image size decreases, there is a warning (WARNING: image patch should have spatial shape (220, 220) instead of (152, 152). Adapt patch size accordingly) and the prediction is less precise.

Is there a minimum image size needed? Or do I need to modify a downsampling function somewhere?

Thanks a lot !

MathisMohand commented 4 years ago

Hi,

The neural network resizes the image to 220x220, so bigger images are not an issue, but I think your images should be at least that size.

Lydorn commented 4 years ago

Hi,

There is indeed a minimum image size. The network was trained on 220x220px patches and so expects inputs of that size. But you must also take into account that the network is applied at different resolutions: the input image is downsized by factors of 8, 4, 2 and 1. As such, the input image should be at least 220*8 = 1760px so that the network's input size is still 220px when downsizing by factor 8. If your image is smaller than that, you can try padding it to at least 1760px. I'm not sure how the network will perform though, because it never saw padding during training and that might throw it off...
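To make the size constraint concrete, here is a minimal zero-padding sketch, not from the repo: it uses NumPy, and pad_to_min_size is an illustrative name.

    import numpy as np

    def pad_to_min_size(image, min_size=1760):
        # Zero-pad an (H, W, C) image so both spatial dimensions reach min_size.
        pad_h = max(0, min_size - image.shape[0])
        pad_w = max(0, min_size - image.shape[1])
        pads = ((pad_h // 2, pad_h - pad_h // 2),
                (pad_w // 2, pad_w - pad_w // 2),
                (0, 0))
        return np.pad(image, pads, mode="constant")

    image = np.zeros((256, 256, 3), dtype=np.uint8)
    padded = pad_to_min_size(image)
    print(padded.shape)  # (1760, 1760, 3)

If you do pad, remember to shift any polygon coordinates by the same top/left offsets.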

cthnguyen commented 4 years ago

Hello, thank you all for your answers. Actually, I found the answer: it also depends on the pixel size. In the function downsample_data in multires_pipeline.py, the scale factor also depends on the pixel_size:

def downsample_data(image, metadata, polygons, factor, reference_pixel_size):
    # Correct the nominal factor by the ratio of the reference pixel size
    # to the image's actual pixel size: finer-resolution images get
    # downsampled more.
    corrected_factor = factor * reference_pixel_size / metadata["pixelsize"]
    scale = 1 / corrected_factor
    downsampled_image, downsampled_polygons = rescale_data(image, polygons, scale)
    return downsampled_image, downsampled_polygons

And I tried with smaller images: if the image is not at least 220*220 at a given resolution, it does not predict for that resolution. I didn't try with padding though.
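To see this numerically, the downsampled size at each factor follows directly from the corrected_factor formula above; a small sketch (the reference_pixel_size of 0.3 is an illustrative assumption, use the value from your configuration):

    def downsampled_sizes(image_size, pixel_size, reference_pixel_size=0.3,
                          factors=(8, 4, 2, 1)):
        # Mirrors downsample_data: image_size / corrected_factor per factor.
        return {factor: image_size * pixel_size / (factor * reference_pixel_size)
                for factor in factors}

    print(downsampled_sizes(4000, 0.2))  # {8: 333.3, 4: 666.7, ...}: all >= 220
    print(downsampled_sizes(256, 0.2))   # {8: 21.3, ..., 1: 170.7}: all < 220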

Lydorn commented 4 years ago

Ah yes, you are right, I forgot about the pixel size :-) You can then try changing the pixel size to avoid padding. The networks should be able to generalize to different pixel sizes, as long as they are not widely different. What is the image size you want to use, and what is its pixel size?
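For reference, the declared pixel size needed to keep the factor-8 input at 220px can be solved from the same formula (again with an assumed reference_pixel_size of 0.3):

    def min_pixel_size(image_size, factor=8, reference_pixel_size=0.3, min_patch=220):
        # Invert: downsampled size = image_size * pixel_size / (factor * reference_pixel_size)
        return min_patch * factor * reference_pixel_size / image_size

    print(min_pixel_size(256))  # ~2.06: a 256px tile needs a very coarse declared pixel size

A value that far from the true pixel size may be too big a stretch for the networks to generalize, per the caveat above.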

cthnguyen commented 4 years ago

I have 256*256 image tiles and my pixel size is 0.2. But most likely I need bigger images to be able to train at all resolutions :)

Lydorn commented 4 years ago

Well, it depends on the maximum misalignment you want your final model (which includes the multi-resolution approach with multiple networks to train) to handle. If your maximum misalignment is below 8px, you can try training a single model with no downsampling. I'm not sure it would work for higher displacements, but you can always try. Also, since you are training the models yourself, you can change the input size of the models by changing the relevant parameters in the config.json file. For example, when using downsampling factor 8 you can train a model with input size 32; then with downsampling factor 4 use input size 64, etc. The different models in the multi-resolution approach do not need to have the same input size. I didn't test the inference code for that case, but if for some reason it does not work, it would be easily fixable.
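The per-factor input sizes then follow from a simple division of the tile size; a quick sketch for the 256px tiles mentioned above (the actual parameter names live in config.json and are not reproduced here):

    def input_sizes(tile_size=256, factors=(8, 4, 2, 1)):
        # Each model's input covers the whole tile after downsampling:
        # factor 8 -> 32, factor 4 -> 64, factor 2 -> 128, factor 1 -> 256.
        return {factor: tile_size // factor for factor in factors}

    print(input_sizes())  # {8: 32, 4: 64, 2: 128, 1: 256}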