Closed haikuoyao closed 7 years ago
Top-1.
@taineleau Thank you.
May I ask another question?
In the paper, you compute the average (absolute) weight for the Feature Reuse experiment.
What exactly is this weight, and how is it computed?
Thanks a lot.
@haikuoyao Let me try to answer your question. @liuzhuang13 Please correct me if I am wrong. Basically, the weight of a convolution layer has shape (n_input_planes, n_output_planes, filter_width, filter_height). If we are looking at a specific layer l, say l = 3, this layer accepts features from layers 0, 1, and 2, which are concatenated to form the input to layer 3.
Suppose we store the weight in w (a Python variable). (Note that the number of channels of layer 0 is 24, and the growth rate k is 12.)
Hence, the weight for the grid cell (s=0, l=3) is w[:24, :, :, :];
for (s=1, l=3), it is w[24:36, :, :, :];
for (s=2, l=3), it is w[36:48, :, :, :].
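To make the slicing above concrete, here is a minimal NumPy sketch of how the average absolute weight per source layer could be computed. The tensor shape and channel ranges follow the convention described above; the random values and the variable names (w, ranges) are stand-ins, not the paper's actual code.

```python
import numpy as np

# Hypothetical weight tensor for layer l = 3 of the first dense block:
# shape (n_input_planes, n_output_planes, filter_width, filter_height).
# 48 input channels = 24 (layer 0) + 2 * 12 (growth rate k = 12).
rng = np.random.default_rng(0)
w = rng.standard_normal((48, 12, 3, 3))

# Input-channel ranges contributed by each source layer s.
ranges = {0: (0, 24), 1: (24, 36), 2: (36, 48)}

# Average absolute weight per source layer: one value per (s, l) cell
# of the feature-reuse heat map.
avg_abs = {s: np.abs(w[a:b]).mean() for s, (a, b) in ranges.items()}

for s in sorted(avg_abs):
    print(f"(s={s}, l=3): {avg_abs[s]:.4f}")
```

Each printed value is the mean of |w| over the slice of input channels that layer s contributed, which is what one cell of the heat map in the paper shows.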
Thanks a lot @taineleau.
Sorry to ask another question.
How long did it take to classify a single image with the densenet-161.t7 model using classify.lua in your experiments?
It takes around 2 s here on a GPU. That seems too long, doesn't it?
Oh... when I classify more than one image, only the first image takes a while and the rest are quite fast. Seems it's okay. Sorry to bother you.
@haikuoyao You're right. For the first batch, the GPU needs some extra time to warm up (memory allocation and similar one-time setup). I believe if you look at the data time, it accounts for most of the time spent forwarding the first batch.
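The warm-up effect is easy to see if you time each image separately. A minimal Python sketch, where fake_classify is a hypothetical stand-in for the real forward pass and the one-time sleep mimics first-batch setup cost:

```python
import time

def time_each(classify_fn, images):
    # Time each image individually and return the per-image latencies.
    latencies = []
    for img in images:
        start = time.perf_counter()
        classify_fn(img)
        latencies.append(time.perf_counter() - start)
    return latencies

state = {"warm": False}

def fake_classify(img):
    # Simulate a one-time setup cost on the first forward pass.
    if not state["warm"]:
        time.sleep(0.05)
        state["warm"] = True

latencies = time_each(fake_classify, range(3))
print(latencies)  # the first entry is much larger than the rest
```

The same pattern shows up with the real model: the first forward pass is dominated by setup, and steady-state latency is what should be reported.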
@taineleau Thanks a lot. I'm gonna close this issue. :)
Hi there,
I'm trying to reproduce the experimental results.
Sorry if this is a silly question: are the error rates in this table top-1 or top-5?
Thanks.