Closed: lywang76 closed this issue 2 years ago.
In your method:

```python
@register_model
def cae_large_patch16_384(pretrained=False, **kwargs):
    model = VisionTransformer(
        img_size=384, patch_size=16, embed_dim=1024, depth=24, num_heads=16,
        mlp_ratio=4, qkv_bias=True, norm_layer=partial(nn.LayerNorm, eps=1e-6),
        **kwargs)
    model.default_cfg = _cfg()
    return model

def _cfg(url='', **kwargs):
    return {
        'url': url,
        'input_size': (3, 224, 224),
        'pool_size': None,
        'crop_pct': .9,
        'interpolation': 'bicubic',
        'mean': (0.5, 0.5, 0.5),
        'std': (0.5, 0.5, 0.5),
        **kwargs
    }
```
Therefore, even though `img_size` is 384, calling `_cfg()` with no arguments sets `default_cfg['input_size']` back to `(3, 224, 224)`.
Could this be a potential problem?
Hi, this problem can be solved by passing the argument `--input_size 384` to the program.
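As a rough sketch of how such a flag takes effect, a training script can parse `--input_size` and build the input shape from it, bypassing the stale `default_cfg` value (the argparse details here are assumptions for illustration, not the repo's actual code):

```python
# Hypothetical sketch: consuming --input_size on the command line.
# The flag name comes from the reply above; the parser is illustrative.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--input_size', default=224, type=int,
                    help='image side length fed to the model')

# Simulate invoking the program with: --input_size 384
args = parser.parse_args(['--input_size', '384'])

# The script derives the (C, H, W) shape from the flag, so the
# 224 default baked into _cfg() is never consulted.
input_size = (3, args.input_size, args.input_size)
print(input_size)  # (3, 384, 384)
```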