jlindsey15 / RAM

Recurrent Visual Attention Model
135 stars, 52 forks

Why do pad_to_bounding_box in ram.py? #15

Open igo312 opened 6 years ago

igo312 commented 6 years ago

As the title says, I found that ram.py calls tf.image.pad_to_bounding_box to zero-pad the image. However, I could not find this step in the paper. What is it for? I would also like to know the meaning of max_radius. Thank you.

qihongl commented 6 years ago

Thanks for the question!

Hmmm... I don't remember the details, but I think zero-padding makes sense: we should allow RAM to look at the corners of the image. And max radius is probably the size of the glimpse.
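A minimal sketch of why the padding helps, in plain Python rather than the TensorFlow code in ram.py. The helpers `zero_pad` and `crop_glimpse` are hypothetical names made up for this illustration, and the padding width (one `max_radius` on each side) is an assumption, not necessarily what ram.py uses. The point is only that a glimpse centered on a corner pixel would run off the image unless the image is zero-padded first.

```python
def zero_pad(img, offset, target):
    """Place a square 2D list at (offset, offset) inside a target x target grid of zeros.

    This only mimics the idea of tf.image.pad_to_bounding_box; it is not that function.
    """
    size = len(img)
    canvas = [[0] * target for _ in range(target)]
    for r in range(size):
        for c in range(size):
            canvas[offset + r][offset + c] = img[r][c]
    return canvas


def crop_glimpse(img, center_row, center_col, radius, max_radius):
    """Crop a (2*radius) x (2*radius) window centered on a pixel of the original image."""
    size = len(img)
    # Assumption: pad by max_radius on every side so the largest glimpse always fits.
    padded = zero_pad(img, max_radius, size + 2 * max_radius)
    # Shift the center into padded coordinates, then step back by the radius.
    top = center_row + max_radius - radius
    left = center_col + max_radius - radius
    return [row[left:left + 2 * radius] for row in padded[top:top + 2 * radius]]


image = [[1] * 4 for _ in range(4)]  # toy 4x4 image of ones
corner = crop_glimpse(image, 0, 0, radius=2, max_radius=2)
for row in corner:
    print(row)
```

With the toy image above, the glimpse at the top-left corner has a full 4x4 shape; the part that falls outside the image is filled with zeros instead of causing an out-of-range crop.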

jlindsey15 commented 6 years ago

To clarify further -- the glimpses are taken at three separate spatial scales ("zoom" levels), with radii separated by factors of 2. Max radius refers to the largest of these.
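As a rough illustration of the scales described above (a sketch, not the code from ram.py): the function name `glimpse_radii` and the values `min_radius=4` and `depth=3` are assumptions chosen for the example.

```python
def glimpse_radii(min_radius, depth):
    """Return one radius per scale, each twice the previous one."""
    return [min_radius * 2 ** i for i in range(depth)]


# Hypothetical values: smallest radius 4, three zoom levels.
radii = glimpse_radii(min_radius=4, depth=3)
max_radius = radii[-1]  # the largest of the three radii
print(radii, max_radius)
```

Under these assumptions, the largest radius determines how far a glimpse can extend past the image border, which is presumably why it is the value that feeds into the padding.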

igo312 commented 6 years ago

@jlindsey15 As far as I can see, max_radius is only used as an input to pad_to_bounding_box; it does not seem to actually check the shape of the scales.

And I don't understand why that is done. It seems we already have the translated image after convertTranslated().

As a digression, does anyone know whether a CNN can capture information about location? I think a CNN can only learn the content of images.

My question was put "on hold" after I asked it on Stack Overflow :(