Thank you for this interesting work! I noticed that for the ImageNet task you only apply gating to the fully connected part of the network and leave the convolutional base unaltered. But most semantic segmentation models are fully convolutional. Between the parameter sharing and the locally connected nature of convolutional layers, I could see applying gating to them being a bit tricky, so I was wondering if you've looked at this at all?
Oh, and you seem to go with 80% for the gating parameter. Is this a finicky hyperparameter, or are the results robust to it?
NOTE: I have class during the presentation and cannot attend :(