LeapLabTHU / MLLA

Official repository of MLLA (NeurIPS 2024)

Why set `hid_exp_ratio=4` in the stem and downsample layers? #27

Open Journey7331 opened 2 weeks ago

Journey7331 commented 2 weeks ago

Congrats on your excellent work!

As shown in your code, ratio=4 is used in the stem and downsample layers. Is this setting meant to align with mlp_ratio in MLLABlock, or is there some trick here?

class Stem(nn.Module):
    ...
    self.conv3 = nn.Sequential(
        # 3x3 stride-2 conv expands to embed_dim * 4 hidden channels
        ConvLayer(embed_dim // 2, embed_dim * 4, kernel_size=3, stride=2, padding=1, bias=False),
        # 1x1 conv projects back down to embed_dim
        ConvLayer(embed_dim * 4, embed_dim, kernel_size=1, bias=False, act_func=None)
    )

class PatchMerging(nn.Module):
    def __init__(self, input_resolution, dim, ratio=4.0):
        super().__init__()
        self.input_resolution = input_resolution
        self.dim = dim
        in_channels = dim
        out_channels = 2 * dim
        self.conv = nn.Sequential(
            # 1x1 conv expands to out_channels * ratio hidden channels
            ConvLayer(in_channels, int(out_channels * ratio), kernel_size=1, norm=None),
            # 3x3 stride-2 depthwise conv (groups == channels) downsamples spatially
            ConvLayer(int(out_channels * ratio), int(out_channels * ratio), kernel_size=3, stride=2, padding=1, groups=int(out_channels * ratio), norm=None),
            # 1x1 conv projects back down to out_channels
            ConvLayer(int(out_channels * ratio), out_channels, kernel_size=1, act_func=None)
        )
tian-qing001 commented 1 week ago

Hi @Journey7331, thanks for your question. We set ratio=4 in the stem and downsample layers to align with the mlp_ratio. Adjusting ratio may lead to better performance, but tuning it is not the focus of our paper, so we keep ratio=4 as the default.
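To make the alignment concrete: with mlp_ratio=4, the feed-forward part of each block expands channels by 4x and then projects back, which is the same expand-then-project pattern as the conv sequences above. A minimal sketch of a standard two-layer MLP (the actual Mlp inside MLLABlock may differ in details):

import torch.nn as nn

class Mlp(nn.Module):
    # Standard transformer feed-forward block: with mlp_ratio=4 the
    # hidden width is 4 * dim, matching ratio=4 in Stem / PatchMerging.
    def __init__(self, dim, mlp_ratio=4.0, act_layer=nn.GELU):
        super().__init__()
        hidden_dim = int(dim * mlp_ratio)
        self.fc1 = nn.Linear(dim, hidden_dim)  # expand: dim -> 4 * dim
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_dim, dim)  # project: 4 * dim -> dim

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))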