mhamilton723 / FeatUp

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution", ICLR 2024
MIT License

How to use FeatUp on my model #28

Open 1chenchen22 opened 5 months ago

1chenchen22 commented 5 months ago

I am working on a facial expression recognition task and want to add this module to my model:

`self.featup = JBULearnedRange(guidance_dim=3, feat_dim=out_c, key_dim=32)`

Where do you think this module should be added, and how should I set the `guidance_dim` and `key_dim` parameters? The printout below is the relevant part of my model:

```
(features): Sequential(
  (0): Conv_block(
    (conv): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (prelu): PReLU(num_parameters=64)
  )
  (1): Conv_block(
    (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=64, bias=False)
    (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (prelu): PReLU(num_parameters=64)
  )
  (2): Mix_Depth_Wise(
    (conv): Conv_block(
      (conv): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (prelu): PReLU(num_parameters=128)
    )
    (conv_dw): MDConv(
      (mixed_depthwise_conv): ModuleList(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=64, bias=False)
        (1): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2), groups=32, bias=False)
        (2): Conv2d(32, 32, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=32, bias=False)
      )
      (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (prelu): PReLU(num_parameters=128)
    )
    (CA): CoordAtt(
      (pool_h): AdaptiveAvgPool2d(output_size=(None, 1))
      (pool_w): AdaptiveAvgPool2d(output_size=(1, None))
      (conv1): Conv2d(128, 8, kernel_size=(1, 1), stride=(1, 1))
      (bn1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(8, 128, kernel_size=(1, 1), stride=(1, 1))
      (conv3): Conv2d(8, 128, kernel_size=(1, 1), stride=(1, 1))
      (relu): h_swish(
        (sigmoid): h_sigmoid(
          (relu): ReLU6(inplace=True)
        )
      )
    )
    (project): Linear_block(
      (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (3): Mix_Residual(
    (model): Sequential(
      (0): Mix_Depth_Wise(
        (conv): Conv_block(
          (conv): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (prelu): PReLU(num_parameters=128)
        )
        (conv_dw): MDConv(
          (mixed_depthwise_conv): ModuleList(
            (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96, bias=False)
            (1): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=32, bias=False)
          )
          (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (prelu): PReLU(num_parameters=128)
        )
        (CA): CoordAtt(
          (pool_h): AdaptiveAvgPool2d(output_size=(None, 1))
          (pool_w): AdaptiveAvgPool2d(output_size=(1, None))
          (conv1): Conv2d(128, 8, kernel_size=(1, 1), stride=(1, 1))
          (bn1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (conv2): Conv2d(8, 128, kernel_size=(1, 1), stride=(1, 1))
          (conv3): Conv2d(8, 128, kernel_size=(1, 1), stride=(1, 1))
          (relu): h_swish(
            (sigmoid): h_sigmoid(
              (relu): ReLU6(inplace=True)
            )
          )
        )
        (project): Linear_block(
          (conv): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): Mix_Depth_Wise(
        (conv): Conv_block(
          (conv): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (prelu): PReLU(num_parameters=128)
        )
        (conv_dw): MDConv(
          (mixed_depthwise_conv): ModuleList(
            (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96, bias=False)
            (1): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), groups=32, bias=False)
          )
          (bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (prelu): PReLU(num_parameters=128)
        )
        (CA): CoordAtt(
          (pool_h): AdaptiveAvgPool2d(output_size=(None, 1))
          (pool_w): AdaptiveAvgPool2d(output_size=(1, None))
          (conv1): Conv2d(128, 8, kernel_size=(1, 1), stride=(1, 1))
          (bn1): BatchNorm2d(8, eps=1e-05,
```
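A hedged sketch of the wiring the question asks about: in a JBU-style upsampler, `guidance_dim` is the channel count of the high-resolution guidance signal (3 when guiding with the RGB input itself), `feat_dim` must equal the channel count of the feature map being upsampled, and `key_dim` is an internal projection width. `StubJBU` and `FERNet` below are hypothetical names, and the upsampler body is replaced with plain bilinear interpolation so the snippet runs without the `featup` package; it only demonstrates where such a module would typically sit (once, on the backbone's final low-resolution features, guided by the input image) and which shapes the parameters must match:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StubJBU(nn.Module):
    """Placeholder with the call signature a JBU-style upsampler appears to
    use: forward(low_res_feats, high_res_guidance). The real FeatUp module
    learns a guided kernel; here we just interpolate so the sketch is
    self-contained."""
    def __init__(self, guidance_dim, feat_dim, key_dim):
        super().__init__()
        self.guidance_dim, self.feat_dim, self.key_dim = guidance_dim, feat_dim, key_dim

    def forward(self, feats, guidance):
        assert guidance.shape[1] == self.guidance_dim  # e.g. 3 for RGB guidance
        assert feats.shape[1] == self.feat_dim         # must match feature channels
        return F.interpolate(feats, size=guidance.shape[-2:],
                             mode="bilinear", align_corners=False)

class FERNet(nn.Module):
    """Hypothetical wrapper: backbone -> guided upsampling -> (head)."""
    def __init__(self, backbone, out_c):
        super().__init__()
        self.backbone = backbone
        # guidance_dim=3: the RGB input is used as guidance
        self.featup = StubJBU(guidance_dim=3, feat_dim=out_c, key_dim=32)

    def forward(self, x):
        feats = self.backbone(x)      # low-res features (B, out_c, h, w)
        return self.featup(feats, x)  # upsampled to the input resolution

# toy backbone with overall stride 4, mirroring the two stride-(2, 2)
# stages visible in the printout above
backbone = nn.Conv2d(3, 64, 3, stride=4, padding=1)
net = FERNet(backbone, out_c=64)
hr = net(torch.randn(2, 3, 112, 112))
print(hr.shape)  # torch.Size([2, 64, 112, 112])
```

The upsampler's job is exactly to undo the backbone's resolution loss, so the most natural place for it is after the feature extractor rather than inside every sub-block.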

1chenchen22 commented 5 months ago

Thank you, and I look forward to hearing from you.

1chenchen22 commented 5 months ago

```python
class Mix_Depth_Wise(Module):
    def __init__(self, in_c, out_c, residual=False, kernel=(3, 3), stride=(2, 2),
                 padding=(1, 1), groups=1, kernel_size=[3, 5, 7], split_out_channels=[64, 32, 32]):
        super(Mix_Depth_Wise, self).__init__()
        self.conv = Conv_block(in_c, out_c=groups, kernel=(1, 1), padding=(0, 0), stride=(1, 1))
        self.conv_dw = MDConv(channels=groups, kernel_size=kernel_size,
                              split_out_channels=split_out_channels, stride=stride)
        self.CA = CoordAtt(groups, groups)
        self.featup = JBULearnedRange(guidance_dim=3, feat_dim=out_c, key_dim=32)
        # self.featup = JBULearnedRange(guidance_dim=in_c, feat_dim=out_c, key_dim=out_c)  # placed after CoordAtt
        self.project = Linear_block(groups, out_c, kernel=(1, 1), padding=(0, 0), stride=(1, 1))
        self.residual = residual


class Mix_Residual(Module):
    def __init__(self, c, num_block, groups, kernel=(3, 3), stride=(1, 1), padding=(1, 1),
                 kernel_size=[3, 5], split_out_channels=[64, 64]):
        super(Mix_Residual, self).__init__()
        modules = []
        for _ in range(num_block):
            modules.append(Mix_Depth_Wise(c, c, residual=True, kernel=kernel, padding=padding,
                                          stride=stride, groups=groups, kernel_size=kernel_size,
                                          split_out_channels=split_out_channels))
        self.featup = JBULearnedRange(guidance_dim=3, feat_dim=c, key_dim=32)
        self.model = Sequential(*modules)


class CoordAtt(nn.Module):
    def __init__(self, inp, oup, groups=32):
        super(CoordAtt, self).__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))
        self.featup = JBULearnedRange(guidance_dim=3, feat_dim=inp, key_dim=32)
        mip = max(8, inp // groups)

        self.conv1 = nn.Conv2d(inp, mip, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(mip)
        self.conv2 = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)
        self.conv3 = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)
        self.relu = h_swish()
```
Hi, I added it into these three classes, but the results did not improve much.
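One detail worth flagging in the snippets above: `self.featup` is constructed in `__init__` of all three classes, but none of the code shown ever calls it in `forward`, and a joint-bilateral upsampler also needs a high-resolution guidance tensor at call time. With `guidance_dim=3` the guidance would have to be the RGB image, which these inner blocks never receive; the commented-out alternative with `guidance_dim=in_c` instead uses the block's own input as guidance. A minimal, hypothetical sketch of that second option (`MixDepthWiseSketch` is not the real class, and the upsampler is stubbed with bilinear interpolation so the example is self-contained):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StubJBU(nn.Module):
    # stand-in for JBULearnedRange: the real FeatUp module learns a guided
    # kernel, this stub only reproduces the call shape and output size
    def __init__(self, guidance_dim, feat_dim, key_dim):
        super().__init__()
        self.guidance_dim = guidance_dim

    def forward(self, feats, guidance):
        assert guidance.shape[1] == self.guidance_dim
        return F.interpolate(feats, size=guidance.shape[-2:],
                             mode="bilinear", align_corners=False)

class MixDepthWiseSketch(nn.Module):
    """Inside a block, the only high-res signal available is the block's
    own input, hence guidance_dim=in_c rather than 3."""
    def __init__(self, in_c, out_c):
        super().__init__()
        self.body = nn.Conv2d(in_c, out_c, 3, stride=2, padding=1)  # toy block body
        self.featup = StubJBU(guidance_dim=in_c, feat_dim=out_c, key_dim=out_c)

    def forward(self, x):
        feats = self.body(x)          # (B, out_c, H/2, W/2)
        return self.featup(feats, x)  # the module must be *called*, not just constructed

block = MixDepthWiseSketch(in_c=64, out_c=128)
y = block(torch.randn(1, 64, 56, 56))
print(y.shape)  # torch.Size([1, 128, 56, 56])
```

Note that upsampling back to the input resolution cancels the block's stride-2 downsampling, so every later layer would see a larger feature map than it was designed for; that mismatch is one plausible reason simply dropping the module into each block does not help much.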

mhamilton723 commented 5 months ago

In the next few weeks we will be building out some samples for how to use FeatUp with custom models and images. Thanks for your patience

1chenchen22 commented 5 months ago

Ok, looking forward to your examples. I'm a novice and don't know much about this yet. As shown in the image below, I added featup to each Mix_Depth_Wise and Mix_Residual; the specific code for both classes is in my comments above. The class MixedFeatureNet is the feature extraction module.

I removed the attention module from the model above and added multiple featup modules into the model. It helps, but the effect is not very significant: an improvement of about 0.004 percentage points. [image]

Anyway, thanks for your answer