truskovskiyk / nima.pytorch

NIMA: Neural IMage Assessment
MIT License

Using adaptive pooling to support different input image dimensions #25

Open nitinsurya opened 5 years ago

nitinsurya commented 5 years ago

Currently, the transformations resize every image to 224x224, which can discard a lot of image detail. Maybe the model should be given the option to preserve that information.

To do this, after the final convolutional layer of the base/pretrained model, an adaptive pooling layer would reduce the feature map to the fixed shape the dense layer expects, regardless of the input size.

However, this might require some fine-tuning of the base model, after the final fc layer is trained, to improve the results further.
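
A minimal sketch of this idea, assuming a VGG16 backbone and NIMA's 10-bucket score distribution (the class name `NIMAAdaptive` and the 7x7 pooled size are illustrative choices, not code from this repo):

```python
import torch
import torch.nn as nn
import torchvision.models as models


class NIMAAdaptive(nn.Module):
    """Hypothetical NIMA-style head with adaptive pooling for variable input sizes."""

    def __init__(self, num_buckets: int = 10):
        super().__init__()
        base = models.vgg16(pretrained=True)
        self.features = base.features                # convolutional layers only
        self.pool = nn.AdaptiveAvgPool2d((7, 7))     # fixed 7x7 output for any input size
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.75),
            nn.Linear(512 * 7 * 7, num_buckets),
            nn.Softmax(dim=1),                       # NIMA predicts a distribution over scores
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.pool(self.features(x)))


# Any spatial size above the backbone's minimum now works:
model = NIMAAdaptive()
out = model(torch.randn(1, 3, 384, 512))             # -> torch.Size([1, 10])
```

Since VGG16's own classifier expects a 7x7 feature map, pooling to 7x7 keeps the pretrained fc weights usable as a starting point for the fine-tuning mentioned above.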

Mozen commented 4 years ago

@nitinsurya I use batch size 64, and if I don't resize images to 224x224 it fails even with an adaptive pooling layer:

```
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 224 and 263 in dimension 2 at /opt/conda/conda-bld/pytorch_1532581333611/work/aten/src/TH/generic/THTensorMath.cpp:3616
```

How did you do it?

nitinsurya commented 4 years ago

The sizes within a batch must match, but they can differ across batches. I wrote my own batchifier that basically separates the images by aspect ratio; within a set of images sharing the same aspect ratio, it reshapes all of them to the same dimensions and runs the batch.
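
One possible shape for such a batchifier, written as a PyTorch batch sampler (a sketch under my own naming; `AspectRatioBatchSampler` and the one-decimal bucketing are assumptions, not the actual code described above):

```python
from collections import defaultdict
from typing import Iterator, List, Sequence

from torch.utils.data import Sampler


class AspectRatioBatchSampler(Sampler):
    """Groups sample indices by (rounded) aspect ratio so every batch
    can be resized to one common shape and stacked without errors."""

    def __init__(self, aspect_ratios: Sequence[float], batch_size: int):
        buckets = defaultdict(list)
        for idx, ratio in enumerate(aspect_ratios):
            buckets[round(ratio, 1)].append(idx)  # bucket key: ratio to 1 decimal
        # Split each bucket into fixed-size batches (the last one may be smaller).
        self.batches: List[List[int]] = [
            idxs[i:i + batch_size]
            for idxs in buckets.values()
            for i in range(0, len(idxs), batch_size)
        ]

    def __iter__(self) -> Iterator[List[int]]:
        yield from self.batches

    def __len__(self) -> int:
        return len(self.batches)
```

You would pass this to `DataLoader` via `batch_sampler=`, together with a `collate_fn` that resizes every image in the batch to that batch's common size before stacking, which avoids the size-mismatch `RuntimeError` above.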