JiahuiYu / slimmable_networks

Slimmable Networks, AutoSlim, and Beyond, ICLR 2019, and ICCV 2019

US-Net BatchNorm #22

Closed · semin-park closed this issue 5 years ago

semin-park commented 5 years ago

First of all, thank you for your wonderful work!

I have a question regarding the USBatchNorm operation.

I thought that sharing batch norm between different channel widths was the main reason for the test-performance deterioration shown in your slimmable networks paper (and thus the reason for introducing switchable BN).

But in your `USBatchNorm2d` implementation, it seems like you're privatizing BNs only for the widths defined in `width_mult_list` and sharing them for the rest. In the code snippet below, you use a privatized BN if `self.width_mult in FLAGS.width_mult_list:` (L179), but otherwise you fall back to `self.running_mean` and `self.running_var` (L193-194), which I'm assuming are shared among all the other widths.

https://github.com/JiahuiYu/slimmable_networks/blob/21abdd278efc4a3a548aa633e887dd0f81d9cf3f/models/slimmable_ops.py#L174-L200

I'm curious if I'm misinterpreting your code or if you are indeed sharing BNs for widths not defined in the list.
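For readers following along, the mechanism under discussion can be sketched roughly as below. This is a minimal illustration, not the repo's exact code: `WIDTH_MULT_LIST` stands in for `FLAGS.width_mult_list`, and the fallback branch for widths outside the list is the one the question is about.

```python
import torch
import torch.nn as nn

# Stand-in for FLAGS.width_mult_list (hypothetical values).
WIDTH_MULT_LIST = [0.25, 0.5, 0.75, 1.0]

class USBatchNorm2d(nn.BatchNorm2d):
    """Sketch: private running stats per predefined width,
    shared affine weight/bias (sliced) for every width."""

    def __init__(self, num_features):
        super().__init__(num_features, affine=True)
        # One stats-only BN per width in the predefined list.
        self.bn = nn.ModuleList(
            nn.BatchNorm2d(int(num_features * w), affine=False)
            for w in WIDTH_MULT_LIST
        )
        self.width_mult = 1.0  # set externally before forward

    def forward(self, x):
        c = x.size(1)  # number of active channels at the current width
        weight, bias = self.weight[:c], self.bias[:c]
        if self.width_mult in WIDTH_MULT_LIST:
            # Predefined width: use its private running statistics.
            bn = self.bn[WIDTH_MULT_LIST.index(self.width_mult)]
            return nn.functional.batch_norm(
                x, bn.running_mean, bn.running_var, weight, bias,
                self.training, self.momentum, self.eps)
        # Any other width: in train mode batch statistics are used, so
        # the shared running_mean/running_var slices are effectively
        # placeholders here; in eval mode they would be stale, which is
        # why testing an unlisted width needs BN calibration first.
        return nn.functional.batch_norm(
            x, self.running_mean[:c], self.running_var[:c], weight, bias,
            self.training, self.momentum, self.eps)
```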

JiahuiYu commented 5 years ago

Hi, thanks for your interest.

Yes, there is some misinterpretation of the code. During testing, the current code supports only the widths in this list.

The `if ... else ...` branch mainly supports training. It does not mean that BN statistics are shared among all other widths.

Simply put, if you want to test any width configuration, you should first add it to the width list in the config file.
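Concretely, that means editing the `width_mult_list` entry in the experiment's `.yaml` config before testing. A sketch (the filename and width values are illustrative, only the `width_mult_list` key follows the repo's convention):

```yaml
# illustrative config fragment
width_mult_list: [0.35, 0.5, 0.62, 1.0]  # add any width you want to test
```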

semin-park commented 5 years ago

Aha, thank you for your answer. I have a couple of follow-up questions just to make it absolutely clear.

  1. If BN is used, US-Net is trainable at any arbitrary width, but testable only at widths within the predefined width list, right?

  2. If BN is NOT used, US-Net is both trainable and testable at any arbitrary width. Is this correct?

Again, thank you very much :) It was a joy reading your papers!

JiahuiYu commented 5 years ago

No. US-Net is both trainable and testable at arbitrary widths in both cases. I would encourage you to run US-Net using our released weights, set different `width_mult_list` values in the config file, and see which code lines are executed.

Please also read the code in the main function.

semin-park commented 5 years ago

I think I didn't make myself very clear about my use of the word "arbitrary". What I meant was that I can't sample a random integer (e.g., `np.random.randint(k0, N)`) at runtime and test on that number of channels.

If I want to test on an arbitrary width, shouldn't I first define that width in the .yaml file? At least I thought that's what you meant when you said

During testing, the current code supports the width in this list only. ... Simply put, if you want to test any width configuration, you should add it to width list first in the config file.

Sorry to bug you again, and I'll make sure I give the code a run when I get to work :)

JiahuiYu commented 5 years ago

@SPark9625 Hey, no need to be sorry. If you have questions, just ask here. :)

You can sample any width_mult and test it by:

  1. width_random = np.random.uniform(0, 1)
  2. Putting width_random into the .yaml file: width_mult_list: [${width_random}]
  3. Running our code. The code basically does two things: calibrates BN and then runs testing.

JiahuiYu commented 5 years ago

@SPark9625 We just updated our code to a beta version, which now supports BN calibration. Please check it out.