I noticed a subtle problem in the initialization of the SDI layer.
The line `self.convs = nn.ModuleList([nn.Conv2d(channel, channel, kernel_size=3, stride=1, padding=1)] * 4)` creates a list holding four references to the same `nn.Conv2d` instance, because list multiplication (`[conv] * 4`) copies the reference rather than the object.
As a result, instead of four independent `nn.Conv2d` instances, all four entries in the module list point to the same object and share a single set of weights and biases.
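The aliasing itself can be reproduced without PyTorch. Here is a minimal sketch using a hypothetical stand-in class, `DummyLayer`, which is not part of the SDI code; list multiplication treats `nn.Conv2d` objects the same way.

```python
class DummyLayer:
    """Hypothetical stand-in for nn.Conv2d: holds one mutable weight list."""

    def __init__(self):
        self.weight = [0.0]


# List multiplication repeats the reference; no new objects are created.
shared = [DummyLayer()] * 4
shared_ids = {id(layer) for layer in shared}

# Updating the first entry is visible through every entry,
# because all four entries are the same object.
shared[0].weight[0] = 1.0
shared_weights = [layer.weight[0] for layer in shared]

print(len(shared_ids))  # 1
print(shared_weights)   # [1.0, 1.0, 1.0, 1.0]
```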
The same problem also appears in `self.seg_outs = nn.ModuleList([nn.Conv2d(channel, n_classes, 1, 1)] * 4)`.
The correct approach is to use a list comprehension to create separate `nn.Conv2d` instances:
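A minimal sketch of that fix, assuming PyTorch is installed. The values of `channel` and `n_classes` are placeholders chosen for illustration, not taken from the SDI code. The shared-reference version is built alongside for comparison, and the parameter tensors of each are counted.

```python
import torch.nn as nn

channel, n_classes = 32, 2  # assumed example values, not from the original code

# Pattern from the report: one Conv2d object referenced four times.
shared = nn.ModuleList(
    [nn.Conv2d(channel, channel, kernel_size=3, stride=1, padding=1)] * 4
)

# Fix: the comprehension calls the constructor once per iteration,
# so every entry is a separate module with its own weights and bias.
convs = nn.ModuleList(
    [nn.Conv2d(channel, channel, kernel_size=3, stride=1, padding=1) for _ in range(4)]
)
seg_outs = nn.ModuleList(
    [nn.Conv2d(channel, n_classes, 1, 1) for _ in range(4)]
)

shared_is_aliased = all(m is shared[0] for m in shared)
convs_are_distinct = len({id(m) for m in convs}) == 4

# parameters() skips duplicates, so the shared list exposes one weight and one bias.
n_shared_params = len(list(shared.parameters()))
n_fixed_params = len(list(convs.parameters()))

print(shared_is_aliased, convs_are_distinct)  # True True
print(n_shared_params, n_fixed_params)        # 2 8
```

Comparing the parameter counts is a quick way to check an existing model for this issue: four independent 3x3 convolutions with bias should contribute eight parameter tensors, not two.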