dgyoo / pa3

Recent image representation as PA3 of the computer vision class.
question about multi-scale dense activations #11

Open 20155004 opened 8 years ago

20155004 commented 8 years ago

What is the data format of imSize in exMultiScaleDenseActivation.m?

For example, rid2tlbr is a [4 x numRegion] matrix.

Dong-JinKim commented 8 years ago

I am 20153080.

It depends on how you want to use imSize in encodefisher.m.

If you want to keep it simple, a [1 x 2] vector containing the x and y size of the whole image is enough.

Alternatively, imSize as a [2 x numRegion] matrix (the size of every region) is also OK.

The important thing is how you are going to use imSize in later functions.
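The two imSize layouts described above can be sketched as follows. This is only an illustration in Python, not the assignment's MATLAB code. It assumes the rows of rid2tlbr are top, left, bottom, right with inclusive coordinates, and that sizes are stored as (height, width). The helper name `region_sizes` is hypothetical.

```python
def region_sizes(rid2tlbr):
    """Return a 2 x numRegion list [heights, widths].

    rid2tlbr is assumed to hold 4 rows (top, left, bottom, right),
    one column per region, with inclusive pixel coordinates.
    """
    tops, lefts, bottoms, rights = rid2tlbr
    heights = [b - t + 1 for t, b in zip(tops, bottoms)]
    widths = [r - l + 1 for l, r in zip(lefts, rights)]
    return [heights, widths]


# Three example regions: the whole 224x224 image and two 112x112 crops.
rid2tlbr = [
    [1, 1, 113],      # top
    [1, 113, 1],      # left
    [224, 112, 224],  # bottom
    [224, 224, 112],  # right
]

# Option 1: a single 1 x 2 vector with the size of the whole image.
im_size_simple = [224, 224]

# Option 2: a 2 x numRegion matrix with the size of every region.
im_size_per_region = region_sizes(rid2tlbr)
```

Either layout works as long as the code that later reads imSize expects the same one.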

20155004 commented 8 years ago

Thank you.

20155004 commented 8 years ago

I have another question. The inputs patchSide, stride, and numScale are all integer values, right? For example, patchSide = 224 and numScale = 6?

Dong-JinKim commented 8 years ago

Again, 20153080.

numScale is 6 by default, and it "must" be an integer.

patchSide and stride are values that you have to obtain in getNetProperties(), and they are usually integers.

20155004 commented 8 years ago

Thank you again.

But I do not fully understand getNetProperties(). (use vl_simplenn() if you determine the patch side and stride empirically)

We use a 224x224x3 image as the input to the pretrained CNN (vl_simplenn). So we extract a number of 224x224x3 patches from each of the 6 scaled images and use them as input to the pretrained CNN (vl_simplenn).

So my question is how to use vl_simplenn() to determine the patch side and stride, and whether this is necessary. Can't we just extract 224x224x3 patches using an arbitrary stride?
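One way to read "determine the patch side and stride empirically" is to feed the network inputs of increasing size and watch only the spatial size of the output. The sketch below illustrates that idea in Python with a toy stack of layers. It does not call vl_simplenn, and the layer list `TOY_LAYERS` is hypothetical, not the network used in this assignment. The function `out_size` stands in for "run the network and read the output size".

```python
def out_size(n, layers):
    """Spatial output side for a square input of side n (0 if too small).

    layers is a list of (kernel, stride, pad) tuples for conv/pool layers.
    """
    for kernel, stride, pad in layers:
        if n + 2 * pad < kernel:
            return 0
        n = (n + 2 * pad - kernel) // stride + 1
    return n


def probe_net(layers, max_side=1000):
    """Estimate (patch_side, stride) by probing output sizes only."""
    # Patch side: the smallest input that produces one output cell.
    patch_side = next(n for n in range(1, max_side)
                      if out_size(n, layers) >= 1)
    # Stride: extra input pixels needed before a second output cell appears.
    second = next(n for n in range(patch_side, max_side)
                  if out_size(n, layers) >= 2)
    return patch_side, second - patch_side


# Hypothetical toy network: (kernel, stride, pad) per layer.
TOY_LAYERS = [(11, 4, 0), (3, 2, 0), (5, 1, 0), (3, 2, 0)]

patch_side, stride = probe_net(TOY_LAYERS)

# Dense output on a 224-pixel input vs. explicit sliding windows.
dense_cells = out_size(224, TOY_LAYERS)
window_count = (224 - patch_side) // stride + 1
```

For this toy stack, one dense pass over a 224-pixel input yields as many output cells per side as sliding a patch_side window with that stride. Under these assumptions, that is why the stride is a property of the network rather than a free choice when activations are computed densely. Patches can still be cropped at any stride if each one is passed through the network separately, at a higher computational cost.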