Closed: EmreTaha closed this issue 4 months ago
If global average pooling is used in main_linprobe.py, a LayerNorm is instantiated and used. However, this LayerNorm is not part of the model.head module, so its requires_grad flag is set to False and it is never trained: https://github.com/facebookresearch/mae/blob/efb2a8062c206524e35e47d04501ed4f544c0ae8/main_linprobe.py#L225 Since MAE pretraining does not include this module, the LayerNorm weights remain randomly initialized. Is this intentional?
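For illustration, here is a minimal sketch of the freezing pattern being described (the module names `fc_norm` and `head` here are hypothetical stand-ins, not the exact ones from the repo): when all parameters are frozen and only `model.head` is re-enabled, a norm layer created outside the head stays frozen with its random initialization.

```python
import torch
import torch.nn as nn

class Probe(nn.Module):
    """Toy model mimicking linear probing with global average pooling."""
    def __init__(self, dim=8, num_classes=4):
        super().__init__()
        # analogous to the LayerNorm added for the global-pool path
        self.fc_norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        # global average pool over tokens -> norm -> linear head
        return self.head(self.fc_norm(x.mean(dim=1)))

model = Probe()
for p in model.parameters():
    p.requires_grad = False          # freeze everything
for p in model.head.parameters():
    p.requires_grad = True           # re-enable only the head

print(model.fc_norm.weight.requires_grad)  # False: the norm stays frozen
print(model.head.weight.requires_grad)     # True: only the head trains
```

Because `fc_norm` is not inside `model.head`, it receives no gradients, and if its weights were not loaded from a pretrained checkpoint they remain at whatever they were initialized to.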