facebookresearch / ToMe

A method to increase the speed and lower the memory footprint of existing vision transformers.

AttributeError: 'ToMeBlock' object has no attribute 'drop_path' #5

Closed kos94ok closed 1 year ago

kos94ok commented 1 year ago

File "/tome/tome/patch/timm.py", line 35, in forward x = x + self.drop_path_rate(x_attn) File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1208, in __getattr__ type(self).__name__, name)) AttributeError: 'ToMeBlock' object has no attribute 'drop_path_rate'

dbolya commented 1 year ago

/tome/patch/timm.py contains no mention of "drop_path_rate". Did you edit the code?

kos94ok commented 1 year ago

> /tome/patch/timm.py contains no mention of "drop_path_rate". Did you edit the code?

File "/content/tome/tome/patch/timm.py", line 33, in forward x = x + self.drop_path(x_attn) File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1208, in __getattr__ type(self).__name__, name)) AttributeError: 'ToMeBlock' object has no attribute 'drop_path'

dbolya commented 1 year ago

drop_path should exist in timm 0.4.12: https://github.com/rwightman/pytorch-image-models/blob/7096b52a613eefb4f6d8107366611c8983478b19/timm/models/vision_transformer.py#L207

Are you using the right version of timm (0.4.12) and passing in a timm model?
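
A quick sanity check for both points might look like this (an illustrative sketch, not part of ToMe; `model` stands for whatever you pass to tome.patch.timm):

import timm
from timm.models.vision_transformer import VisionTransformer

print(timm.__version__)                      # ToMe's timm patch targets 0.4.12
print(isinstance(model, VisionTransformer))  # tome.patch.timm expects a timm ViT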

dbolya commented 1 year ago

Alternatively, you can replace drop_path with an identity since that's only necessary during training.
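
A minimal sketch of that workaround (illustrative only, not code from the repo; it assumes `model` is a timm VisionTransformer whose blocks expose a `drop_path` attribute):

import torch.nn as nn

# Stochastic depth (drop_path) only has an effect during training,
# so swapping it for a no-op is safe at inference time.
for block in model.blocks:
    if hasattr(block, "drop_path"):
        block.drop_path = nn.Identity()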

kos94ok commented 1 year ago

> drop_path should exist in timm 0.4.12: https://github.com/rwightman/pytorch-image-models/blob/7096b52a613eefb4f6d8107366611c8983478b19/timm/models/vision_transformer.py#L207
>
> Are you using the right version of timm (0.4.12) and passing in a timm model?

I'm using timm==0.6.11.

My code:

...

class Encoder(VisionTransformer):

    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.,
                 qkv_bias=True, drop_rate=0., attn_drop_rate=0., drop_path_rate=0., embed_layer=PatchEmbed):
        super().__init__(img_size, patch_size, in_chans, embed_dim=embed_dim, depth=depth, num_heads=num_heads,
                         mlp_ratio=mlp_ratio, qkv_bias=qkv_bias, drop_rate=drop_rate, attn_drop_rate=attn_drop_rate,
                         drop_path_rate=drop_path_rate, embed_layer=embed_layer,
                         num_classes=0, global_pool='', class_token=False)  # these disable the classifier head

    def forward(self, x):
        # Return all tokens
        return self.forward_features(x)
...

self.encoder = Encoder(img_size, patch_size, embed_dim=embed_dim, depth=enc_depth, num_heads=enc_num_heads,
                           mlp_ratio=enc_mlp_ratio)
tome.patch.timm(self.encoder, prop_attn=False)
self.encoder.r = 16

...
dbolya commented 1 year ago

Ah, we don't yet support higher versions of timm so you'll have to install 0.4.12.

kos94ok commented 1 year ago

> Ah, we don't yet support higher versions of timm so you'll have to install 0.4.12.

I think I solved it.

timm==0.6.11

In tome/tome/patch/timm.py, I replaced

x = x + self.drop_path(x_attn)
x = x + self.drop_path(self.mlp(self.norm2(x)))

with

x = x + self.drop_path1(x_attn)
x = x + self.drop_path1(self.mlp(self.norm2(x)))
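
For reference, a version-tolerant variant of the same fix (my own sketch, not code from the repo; it assumes torch.nn is imported as nn in that file): newer timm releases split the single drop_path into drop_path1 and drop_path2, so the patched forward could fall back to whichever attribute exists.

# Illustrative only: use whichever drop-path attribute this timm version defines.
# timm 0.4.x blocks have a single `drop_path`; 0.6.x blocks have `drop_path1`/`drop_path2`.
# All of them are no-ops at inference time.
drop_path1 = getattr(self, "drop_path1", None) or getattr(self, "drop_path", nn.Identity())
drop_path2 = getattr(self, "drop_path2", None) or getattr(self, "drop_path", nn.Identity())
x = x + drop_path1(x_attn)
x = x + drop_path2(self.mlp(self.norm2(x)))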