Hi, thanks for your interest.
Yes, there is some misinterpretation of the code. During testing, the current code supports only the widths in this list.
The `if ... else ...` branch mainly supports training, so it does not mean BN statistics are shared among all other widths.
Simply put, if you want to test any width configuration, you should first add it to the width list in the config file.
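For concreteness, the relevant config entry looks something like this (the key name `width_mult_list` is from this repo; the specific values below are just an example):

```yaml
# Any width you intend to test must appear in this list.
width_mult_list: [0.25, 0.5, 0.75, 1.0]
```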
Aha, thank you for your answer. I have a couple of follow-up questions just to make it absolutely clear:
1. If BN is used, US-Net is trainable at arbitrary widths, but testable only within the predefined width list, right?
2. If BN is NOT used, US-Net is both trainable and testable at arbitrary widths, is this correct?
Again, thank you very much :) It was a joy reading your papers!
No. US-Net is both trainable and testable at arbitrary widths in both cases. I would encourage you to run US-Net using our released weights, set different `width_mult_list` values in the config file, and see which code lines are executed.
Please also read the code in the main function.
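For example, an invocation along the lines of the repo's README (the config path here is a placeholder, not a file guaranteed to exist in your checkout):

```bash
# Run with a chosen config; edit width_mult_list in the yml between runs.
python train.py app:apps/your_config.yml
```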
I think I didn't make myself very clear about my use of the word "arbitrary". What I meant was that I can't sample a random integer (e.g., `np.random.randint(k0, N)`) at runtime and test on that number of channels.
If I want to test on an arbitrary width, shouldn't I first define that width in the .yaml file? At least that's what I understood when you said:

> During testing, the current code supports only the widths in this list. ... Simply put, if you want to test any width configuration, you should first add it to the width list in the config file.
Sorry to bug you again, and I'll make sure I give the code a run when I get to work :)
@SPark9625 Hey, no need to be sorry. If you have questions, just ask here. :)
You can sample any width_mult and do the test by:

```yaml
width_mult_list: [${width_random}]
```
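For instance, a minimal sketch of what that substitution could look like (the file name `config.yml` and the sampling range are assumptions, not the repo's exact workflow):

```python
import numpy as np
import yaml

# Sample a random width multiplier and write it into the config
# before launching the test script.
width_random = float(np.random.uniform(0.25, 1.0))
with open("config.yml") as f:
    cfg = yaml.safe_load(f)
cfg["width_mult_list"] = [width_random]
with open("config.yml", "w") as f:
    yaml.dump(cfg, f)
```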
@SPark9625 We just updated our code to the beta version, which now supports BN calibration. Please check it out.
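(For readers landing here later: BN calibration re-estimates the BN running statistics at a chosen width by forwarding a few batches without gradient updates. A minimal PyTorch sketch of the idea, not the repo's exact code:)

```python
import torch

def calibrate_bn(model, loader, num_batches=50, device="cuda"):
    """Assumed helper: re-estimate BN running statistics at the
    model's current width by forwarding calibration batches in
    train mode, with no parameter updates."""
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None  # None => cumulative moving average
    model.train()
    with torch.no_grad():
        for i, (images, _) in enumerate(loader):
            if i >= num_batches:
                break
            model(images.to(device))
    model.eval()
```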
First of all, thank you for your wonderful work!
I have a question regarding the `USBatchNorm` operation. I thought that sharing batch norm between different channel widths was the main reason for the test performance deterioration shown in your slimmable networks paper (and thus the introduction of switchable BN).
But in your `USBatchNorm2d` implementation, it seems like you're privatizing BNs only for the widths defined in `width_mult_list` and sharing for the rest. In the code snippet below, you use a privatized BN if `self.width_mult in FLAGS.width_mult_list` (L179), but if not, you use `self.running_mean` and `self.running_var` (L193-194), which I'm assuming are shared between all the other widths.

https://github.com/JiahuiYu/slimmable_networks/blob/21abdd278efc4a3a548aa633e887dd0f81d9cf3f/models/slimmable_ops.py#L174-L200
I'm curious if I'm misinterpreting your code or if you are indeed sharing BNs for widths not defined in the list.
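For reference, here is my reading of that branching as a simplified, self-contained sketch (not the repo's exact code; the affine handling and width bookkeeping are abbreviated):

```python
import torch.nn as nn
import torch.nn.functional as F

class USBatchNorm2dSketch(nn.BatchNorm2d):
    """Sketch of the branching described above: private BN statistics
    for widths in width_mult_list, shared fallback statistics
    (self.running_mean / self.running_var) for any other width."""

    def __init__(self, num_features, width_mult_list):
        super().__init__(num_features)
        self.width_mult_list = width_mult_list
        self.width_mult = max(width_mult_list)  # set externally per forward
        # One private (stats-only) BN per predefined width.
        self.bn = nn.ModuleList(
            nn.BatchNorm2d(int(num_features * w), affine=False)
            for w in width_mult_list)

    def forward(self, x):
        c = x.size(1)  # current (possibly reduced) channel count
        weight, bias = self.weight[:c], self.bias[:c]
        if self.width_mult in self.width_mult_list:
            # Predefined width: use its privatized statistics.
            bn = self.bn[self.width_mult_list.index(self.width_mult)]
            return F.batch_norm(x, bn.running_mean, bn.running_var,
                                weight, bias, self.training,
                                self.momentum, self.eps)
        # Any other width: fall back to the shared statistics, sliced.
        return F.batch_norm(x, self.running_mean[:c], self.running_var[:c],
                            weight, bias, self.training,
                            self.momentum, self.eps)
```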