Is the image input of depth network fixed?

yzfzzz commented 1 year ago

Is the image input of the depth network fixed? What happens if I use the pretrained weights you provided with an input image size of 1024x512?

ClementPinard commented 1 year ago

Indeed, it is fixed, to both a particular image size and a particular focal length.

If those are not met, the network's result will be unpredictable.

However, during training there was some augmentation where the image focal length would change a little, see here: https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/custom_transforms.py#L62

Note that the image size never changes, though.
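To illustrate what this kind of augmentation does to the camera intrinsics, here is a minimal sketch of a random zoom followed by a crop back to the original size. It is not copied from the linked file: the function name, the 1.15 maximum zoom and the example intrinsics are assumptions made for illustration, and only the intrinsics are tracked (no image is actually resized).

```python
import random


def random_scale_crop_intrinsics(fx, fy, cx, cy, width, height,
                                 max_zoom=1.15, rng=random):
    """Return (fx, fy, cx, cy) after a random zoom and a crop back to (width, height).

    max_zoom is an assumed upper bound, chosen for illustration.
    """
    # Independent zoom factor per axis, always >= 1 so a crop is possible.
    x_scale = rng.uniform(1.0, max_zoom)
    y_scale = rng.uniform(1.0, max_zoom)
    scaled_w = int(width * x_scale)
    scaled_h = int(height * y_scale)

    # Zooming multiplies the focal lengths and the principal point.
    fx, cx = fx * x_scale, cx * x_scale
    fy, cy = fy * y_scale, cy * y_scale

    # Cropping back to the original size only shifts the principal point.
    offset_x = rng.randint(0, scaled_w - width)
    offset_y = rng.randint(0, scaled_h - height)
    return fx, fy, cx - offset_x, cy - offset_y


# Hypothetical intrinsics for a 416x128 image.
rng = random.Random(0)
new_fx, new_fy, new_cx, new_cy = random_scale_crop_intrinsics(
    241.0, 246.0, 208.0, 64.0, 416, 128, rng=rng)
```

The output image keeps its size while the focal length grows by at most the zoom factor, so the network only ever sees a narrow range of focal lengths around the original one.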

So even though the network itself is fully convolutional and thus would not crash when looking at a bigger image, the details in the image will not be the right size and thus the network will have a hard time getting depth right.
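A rough way to see why the size of details matters is the pinhole projection: an object of a given physical size appears larger in pixels when the focal length (in pixels) is larger. The sketch below is a deliberately simplified model, not a description of what the network actually computes, and the focal lengths and object size are hypothetical values.

```python
def apparent_size_px(focal_px, object_size_m, depth_m):
    """Pinhole model: size in pixels of an object seen at a given depth."""
    return focal_px * object_size_m / depth_m


def depth_from_apparent_size(focal_px, object_size_m, size_px):
    """Inverse of the above: depth implied by an apparent size in pixels."""
    return focal_px * object_size_m / size_px


train_focal = 240.0  # hypothetical focal length seen during training
test_focal = 480.0   # hypothetical: same scene at twice the resolution

# A 1.5 m wide object at 20 m, seen with the test camera.
size_px = apparent_size_px(test_focal, 1.5, 20.0)

# Depth implied if that pixel size is interpreted with the training focal.
implied_depth = depth_from_apparent_size(train_focal, 1.5, size_px)
print(size_px, implied_depth)
```

Under this simplified model, doubling the focal length in pixels makes the object look as if it were at half its real distance.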

Even worse, I ran an experiment where I simply flipped the image vertically (so the sky is at the bottom of the image), and the depth prediction was very bad as well.

In conclusion, your best bet is either to resize your image to the training size (much smaller) or to retrain the network on bigger images, using the data module of this repo.
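If you go the resizing route, it is worth checking what the resize does to your intrinsics, since the focal length has to match as well. Here is a small sketch of that check. The training size of 416x128 is an assumption here (please verify it against the scripts in the repo), and the 1024x512 camera intrinsics are made up for the example.

```python
def resized_intrinsics(fx, fy, cx, cy, src_size, dst_size):
    """Scale pinhole intrinsics when an image is resized.

    src_size and dst_size are (width, height) tuples.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    return fx * sx, fy * sy, cx * sx, cy * sy


# Hypothetical intrinsics for a 1024x512 image, resized to an assumed 416x128.
fx, fy, cx, cy = resized_intrinsics(
    700.0, 700.0, 512.0, 256.0, (1024, 512), (416, 128))
print(fx, fy, cx, cy)
```

Note that 1024x512 and 416x128 do not have the same aspect ratio, so a plain resize gives different horizontal and vertical focal lengths. Cropping to the target aspect ratio before resizing avoids that distortion.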

Hope it helped,

Clément

yzfzzz commented 1 year ago

OK, thanks!

ClementPinard / SfmLearner-Pytorch

Is the image input of depth network fixed? #149