Closed. superkevingit closed this issue 3 years ago.
Part of my confusion has been solved. The figure above shows our setting. We have two networks: the top one, which we call the controller, generates the params for the dynamic convolution; the bottom one is the network whose best structure we want NAS to find. There are three steps (as shown in the figure). First, we get the candidate conv structure from the NAS decision. Second, based on the structure in this trial, the bottom network tells the controller network how many params it should generate, and decides the channel number. Finally, the controller generates the conv params for the other network.
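To make the three steps concrete, here is a minimal sketch; all names (`Controller`, `feature_dim`, `max_params`) are illustrative, not part of any existing API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Controller(nn.Module):
    """Top network: maps temporal features to a flat vector of conv params."""
    def __init__(self, feature_dim, max_params):
        super().__init__()
        self.fc = nn.Linear(feature_dim, max_params)

    def forward(self, temporal_features, num_params):
        # Step 3: generate exactly as many params as the bottom network asked for.
        return self.fc(temporal_features)[:num_params]

# Step 1: NAS decides the candidate conv structure for this trial.
kernel_size, c_in, c_out = 3, 16, 32            # assumed NAS decision
# Step 2: the bottom network derives the number of params to request.
num_params = c_out * c_in * kernel_size ** 2
controller = Controller(feature_dim=128, max_params=8192)
weight = controller(torch.randn(128), num_params).view(c_out, c_in, kernel_size, kernel_size)
# The bottom network applies the generated weights dynamically.
out = F.conv2d(torch.randn(1, c_in, 8, 8), weight, padding=kernel_size // 2)
```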
I ran into two problems in my earlier PyTorch implementation.
The first one: how can I implement dynamic conv with an API such as `LayerChoice`?
So I rewrote the demo module for the dilated depthwise separable conv like this:
```python
import torch.nn as nn
import torch.nn.functional as F

class DilConv(nn.Module):
    """
    (Dilated) depthwise separable conv.
    ReLU - (Dilated) depthwise separable - Pointwise - BN.
    If dilation == 2, 3x3 conv => 5x5 receptive field, 5x5 conv => 9x9 receptive field.
    """
    def __init__(self, C_in, C_out, kernel_size, stride, padding, dilation, affine=True, bias=False):
        super().__init__()
        self.c_in = C_in
        self.c_out = C_out
        self.kernel_size = kernel_size
        self.bias = bias
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.activate = nn.ReLU()
        self.bn = nn.BatchNorm2d(C_out, affine=affine)

    def forward(self, x, params):
        # Slice the flat parameter vector from the controller into the two convs.
        dw_weight_params, dw_bias_params, params = self.reshape_params(
            params, self.c_in, self.c_in, self.kernel_size, self.bias, groups=self.c_in)
        pw_weight_params, pw_bias_params, _ = self.reshape_params(
            params, self.c_in, self.c_out, 1, self.bias)
        x = self.activate(x)
        x = F.conv2d(x,
                     weight=dw_weight_params,
                     bias=dw_bias_params,
                     stride=self.stride,
                     padding=self.padding,
                     dilation=self.dilation,
                     groups=self.c_in)
        x = F.conv2d(x,
                     weight=pw_weight_params,
                     bias=pw_bias_params)
        return self.bn(x)

    def reshape_params(self, params, in_channel, out_channel, filter_sz, bias, groups=1):
        # One possible implementation (left as a TODO originally): split a flat
        # parameter tensor into (weight, bias, remainder). The weight shape
        # follows F.conv2d: (out_channels, in_channels // groups, k, k).
        n = out_channel * (in_channel // groups) * filter_sz * filter_sz
        weight = params[:n].view(out_channel, in_channel // groups, filter_sz, filter_sz)
        params = params[n:]
        bias_params = None
        if bias:
            bias_params, params = params[:out_channel], params[out_channel:]
        return weight, bias_params, params
```
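For reference, a small usage sketch under the `reshape_params` layout above (with `bias=False`, the depthwise conv needs C_in * k * k params and the pointwise conv C_out * C_in):

```python
import torch

op = DilConv(C_in=16, C_out=32, kernel_size=3, stride=1, padding=2, dilation=2)
num_params = 16 * 1 * 3 * 3 + 32 * 16          # depthwise + pointwise
params = torch.randn(num_params)               # in practice, generated by the controller
out = op(torch.randn(4, 16, 32, 32), params)   # -> (4, 32, 32, 32)
```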
The second one is how to tell the controller the NAS decision for the dynamic conv in an elegant way; I am still working on that.
Hi. Thanks for asking. Could you briefly show the implementation of your "controller"? I'm not sure I have understood the idea shown in the picture. Actually, if you read the implementation of `EnasMutator`, you will see that the controller is the one who makes the decision, so it doesn't need to be "told" the NAS decision.
I'm sorry I didn't make myself clear; perhaps the network name conflicts here. Our "controller" is an auxiliary network for the bottom one. In a computer vision task, for example, the bottom network uses image-level features while the "controller" uses temporal information, and we want to use dynamic conv to combine them.
@superkevingit thanks for reporting your issue. In my understanding, there are three components: the NAS algorithm (which generates kernel size and conv type), the controller network (which generates conv weights for the classification network), and the classification network (which receives the hyper-parameters from the NAS algorithm and the conv weights from the controller).
I have several questions:
Yes, you are right!
The NAS decision is made in `LayerChoice`, so the controller network should get the feedback of the NAS algorithm from the classification network in the implementation. I think there are several components/logics in your design and several corresponding components in NNI. I propose a mapping here:

- Classification network → the base model, with the dynamic conv wrapped in a custom layer choice (e.g., `DynamicConvLayerChoice`).
- NAS algorithm → a mutator; its decision logic runs in the forward hook (e.g., `on_dynamic_conv_layer_choice_forward`, or in `reset`).
- Controller network → invoked from the same forward hook (`on_dynamic_conv_layer_choice_forward`).

By design, the mutator is just an "implementation" of the underlying computational logic of your layer choice. Related parameters (hypernet parameters) should be stored in the layer choice itself. In other words: you can surely implement the NAS algorithm and the controller as separate components that belong to the mutator, but that's your own choice.
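A rough sketch of that advice, where the layer choice owns both the candidate ops and the hypernet (class and method names here are illustrative, not actual NNI API):

```python
import torch.nn as nn

class DynamicConvLayerChoice(nn.Module):
    """Illustrative layer choice that stores the candidate ops and the
    hypernet (controller) that generates their weights."""
    def __init__(self, candidates, controller):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)  # e.g. DilConv variants
        self.controller = controller                 # hypernet params live in the choice

    def forward(self, x, temporal_features, choice_idx):
        # choice_idx stands in for the mutator's decision; in NNI the selection
        # would instead happen inside the mutator's forward hook.
        op = self.candidates[choice_idx]
        n = op.c_in * op.kernel_size ** 2 + op.c_out * op.c_in  # DilConv above, bias=False
        return op(x, self.controller(temporal_features, n))
```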
@superkevingit I'm closing this issue as it has had no updates from the user for 3 months. Please feel free to reopen it if you still see it as an active issue.
Okay, thanks!
What would you like to be added:
NAS algorithm API support for dynamic CNNs like `torch.nn.functional.conv2d()`, in which the filter's weights are generated dynamically by another network.
Why is this needed:
To be compatible with PyTorch APIs.
Without this feature, how does current nni work:
Currently, NNI only implements APIs for static network definitions.
Components that may involve changes:
I think some new APIs should be added that return the final choice before the conv is actually defined.
Brief description of your proposal if any:
I am wondering if I can use the static API to define the search space, get the current choice of network structure, then define dynamic convs based on that NAS choice and calculate gradients through the dynamic convs in the `forward` function. Here, the search space defined in the `__init__` function is just for getting the structure references.
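A sketch of this proposal; `current_choice()` is a hypothetical accessor for the NAS decision, not an existing NNI API:

```python
import torch.nn as nn
import torch.nn.functional as F

class DynamicCell(nn.Module):
    def __init__(self):
        super().__init__()
        # Search space defined statically, only as a structure reference;
        # the weights of these candidates are never used directly.
        self.kernel_choices = [3, 5]

    def current_choice(self):
        # Hypothetical: return the kernel size chosen by NAS for this trial.
        return self.kernel_choices[0]

    def forward(self, x, controller, temporal_features):
        k = self.current_choice()
        c = x.size(1)
        # Weights come from the controller; gradients flow through them.
        weight = controller(temporal_features, c * c * k * k).view(c, c, k, k)
        return F.conv2d(x, weight, padding=k // 2)
```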