hyf015 / egocentric-gaze-prediction

Code for the paper "Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition"
62 stars 18 forks

The number of convolution layers is not enough in models/model_SP.py #10

Closed: kazucmpt closed this issue 5 years ago

kazucmpt commented 5 years ago

In your paper you wrote, "The SP module is a set of 5 convolution layer groups following the inverse order of VGG16 while changing all max-pooling layers into upsampling layers." In VGG16, the final convolution layer group has three convolution layers. The first convolution layer group of the decoder mirrors that group, so it should also have three layers, which means one more layer has to be added there.

I mean that models/model_SP.py should be modified. You wrote:

    self.decoder = nn.Sequential(
        nn.Conv2d(512, 512, kernel_size=3, padding = 1),
        nn.ReLU(inplace=True),
        nn.Conv2d(512, 512, kernel_size=3, padding = 1),
        nn.ReLU(inplace=True),
        nn.Upsample(scale_factor=2),

But I guess it should be:

    self.decoder = nn.Sequential(
        nn.Conv2d(512, 512, kernel_size=3, padding = 1),
        nn.ReLU(inplace=True),
        nn.Conv2d(512, 512, kernel_size=3, padding = 1),
        nn.ReLU(inplace=True),
        nn.Conv2d(512, 512, kernel_size=3, padding = 1),
        nn.ReLU(inplace=True),
        nn.Upsample(scale_factor=2),
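The layer count behind this suggestion can be checked with a small sketch in plain Python. It assumes the standard VGG16 layout (the configuration with 13 convolution layers in 5 groups). The helper names `conv_groups` and `mirrored_decoder_layout` are hypothetical and are not part of models/model_SP.py. The sketch only counts layers per group. It does not reproduce the repository's actual channel transitions.

```python
# Standard VGG16 convolutional layout: integers are conv output channels,
# "M" marks a max-pooling layer that ends a group.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def conv_groups(cfg):
    """Split a VGG-style config into lists of conv widths, one list per group."""
    groups, current = [], []
    for item in cfg:
        if item == "M":
            groups.append(current)
            current = []
        else:
            current.append(item)
    if current:  # keep a trailing group that is not followed by pooling
        groups.append(current)
    return groups


def mirrored_decoder_layout(cfg):
    """Hypothetical helper: reverse the encoder groups and place an
    upsampling marker "U" after each group instead of max-pooling."""
    layout = []
    for group in reversed(conv_groups(cfg)):
        layout.extend(reversed(group))
        layout.append("U")
    return layout


encoder_group_sizes = [len(g) for g in conv_groups(VGG16_CFG)]
decoder_group_sizes = list(reversed(encoder_group_sizes))

print("encoder convs per group:", encoder_group_sizes)
print("decoder convs per group:", decoder_group_sizes)
print("first decoder group:", mirrored_decoder_layout(VGG16_CFG)[:4])
```

Under that assumption, the mirrored decoder starts with a group of three convolutions followed by upsampling, which is the shape proposed above.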

Thank you.

hyf015 commented 5 years ago

I think either is OK. It wouldn't affect the result too much.

kazucmpt commented 5 years ago

Why don't you modify it?

hyf015 commented 5 years ago

I'll modify it soon.