lxtGH / CAE

This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning".

about modeling_finetune.py #4

Closed: lywang76 closed this issue 2 years ago

lywang76 commented 2 years ago

In your model registration code:

```python
@register_model
def cae_large_patch16_384(pretrained=False, **kwargs):
    model = VisionTransformer(
        img_size=384, patch_size=16, embed_dim=1024, depth=24, num_heads=16,
        mlp_ratio=4, qkv_bias=True, norm_layer=partial(nn.LayerNorm, eps=1e-6),
        **kwargs)
    model.default_cfg = _cfg()
    return model
```

```python
def _cfg(url='', **kwargs):
    return {
        'url': url, 'input_size': (3, 224, 224), 'pool_size': None,
        'crop_pct': .9, 'interpolation': 'bicubic',
        'mean': (0.5, 0.5, 0.5), 'std': (0.5, 0.5, 0.5),
        **kwargs
    }
```

Therefore, even when the model is built with `img_size=384`, the call to `_cfg()` still sets `input_size` in `default_cfg` back to (3, 224, 224).
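
For concreteness, here is a minimal standalone sketch of the dict-merge behavior in question (names reused from the snippets above, trimmed to the relevant keys):

```python
# Keys passed via **kwargs are merged after the defaults, so they would
# win -- but cae_large_patch16_384 calls _cfg() with no arguments, so the
# 224 default survives even though the model itself is built at 384.
def _cfg(url='', **kwargs):
    return {'url': url, 'input_size': (3, 224, 224), **kwargs}

print(_cfg()['input_size'])  # (3, 224, 224), regardless of img_size=384
```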

Could this be a potential problem?

SelfSup-MIM commented 2 years ago

Hi, this problem can be solved by passing the argument `--input_size 384` to the program.
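
Presumably the fine-tuning script builds its input pipeline from `args.input_size` rather than from `default_cfg`, so the 224 entry is never consulted. If one also wants the registered `default_cfg` to report the 384 resolution, a possible one-line sketch (not necessarily the repo's fix) is to go through the `**kwargs` hook that `_cfg` already exposes:

```python
# Sketch, not the repo's code: let _cfg's **kwargs merge override the
# 224 default so default_cfg matches the model's actual resolution.
model.default_cfg = _cfg(input_size=(3, 384, 384))
```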