google-research / deeplab2

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.
Apache License 2.0

Wrong predict results of MOAT-4 using MOAT ImageNet-pretrained checkpoints #144

Open wyw1993121 opened 1 year ago

wyw1993121 commented 1 year ago

Thanks for sharing the MOAT-4 model. I tried to use the MOAT-4 model with the following pretrained weights to evaluate on the ImageNet-1k validation dataset:

MOAT-4 (initial_checkpoint)

However, almost all of the predictions are incorrect, and I don't know why. I would appreciate your insights on the following questions:

  1. Can the pretrained weights I downloaded be used to evaluate on the ImageNet-1k validation dataset directly?
  2. If the answer to Q1 is yes, why are almost all of my predictions wrong?

I would appreciate your help. @aquariusjay

Chenglin-Yang commented 1 year ago

Thanks for your interest!

We have just updated the checkpoints, adding the moving mean and moving variance for the BN layers in the models. They are now ready for direct evaluation on the ImageNet-1k validation dataset.

We have tested on our end, and the scores of all the models can be reproduced. Please also refer to the doc for test settings, including input resolution and input normalization.

wyw1993121 commented 1 year ago

Thanks for your reply and for updating the checkpoints!

I tried to evaluate on the ImageNet-1k validation dataset with the new pretrained weights: MOAT-4 (initial_checkpoint). I evaluated the model after each of the modifications listed below, but I still did not get the expected results:

  1. Normalizing the RGB image by the mean 127.5 and standard deviation 127.5:

        img = tf.math.subtract(img, 127.5)
        img = tf.math.divide(img, 127.5)

  2. Adding code for loading both moat.trainable_variables and moat.non_trainable_variables in 'moat.py':

        model_var_name = sorted([var.name for var in moat.trainable_variables])
        model_var_name += sorted([var.name for var in moat.non_trainable_variables])
        ckpt_var_name = list(sorted(variable_to_shape_map.keys()))
        # This for loop ensures all moat variables can be found in the checkpoint.
        model_vars = moat.trainable_variables + moat.non_trainable_variables
        for var in model_vars:
          name_to_find = var.name

  3. Setting the survival probability of drop path to 1.0 in 'moat.py':

        elif name in ['moat4_finetune_512_22k', 'moat4_finetune_512_no_pe_22k']:
          config = copy.deepcopy(moat4_config)
          config.survival_prob = 1.0

    Also, here is my configuration of the model:

        # Defined before the call below, which passes it as an argument.
        override_config = dict(
            build_classification_head_with_class_num=1_000)

        moat = moat_lib.get_model(
            'moat4_finetune_512_22k',
            input_shape=(512, 512, 3),
            window_size=[None, None,
                         [height//16, width//16],
                         [height//32, width//32]],
            override_config=override_config,
            pretrained_weights_path='....../moat4_imagenet_22k_and_1k_512_w_position_embedding/model-ckpt-0',
            global_attention_at_end_of_moat_stage=True,
            use_checkpointing_for_attention=True,
        )

I would be grateful if you could tell me what mistake I made. I am looking forward to your reply!
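
For reference, the arithmetic of the normalization in step 1 can be checked in isolation. The snippet below is only a minimal sketch in plain Python: `normalize_pixel` is a hypothetical helper, not part of deeplab2 or TensorFlow, and it only shows that a mean and standard deviation of 127.5 map the 8-bit range [0, 255] onto [-1, 1]. Whether these are the constants the released checkpoints expect is not confirmed by this thread.

```python
def normalize_pixel(value, mean=127.5, std=127.5):
    """Shift and scale one 8-bit channel value (hypothetical helper)."""
    return (value - mean) / std


# End points and midpoint of the 8-bit range.
lowest = normalize_pixel(0.0)
middle = normalize_pixel(127.5)
highest = normalize_pixel(255.0)

print(lowest, middle, highest)  # prints: -1.0 0.0 1.0
```

If the evaluation pipeline uses different constants or resizes the image differently from training, the inputs fall outside the distribution the weights were trained on, which is one possible cause of degraded accuracy.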

edwardyehuang commented 1 year ago

Same issue here. moving_mean and moving_variance are not loaded in _load_moat_pretrained_checkpoint.

Chenglin-Yang commented 1 year ago

I have three suggestions that may help you:

  1. To load the moving mean and variance, you do not need to add 'moat.non_trainable_variables'; you can instead change 'moat.trainable_variables' to 'moat.variables', which covers both trainable and non-trainable variables.
  2. For data preprocessing, please refer to EfficientNetV2's preprocessing: https://github.com/google/automl/blob/master/efficientnetv2/preprocessing.py#L58
  3. Make sure the weights of all layers are loaded correctly.

Also, please note that the checkpoints are intended to serve downstream tasks.
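
One way to act on suggestion 3 is to compare the model's variable names against the names stored in the checkpoint before evaluating. The following is a minimal sketch in plain Python, not the loading code in 'moat.py': `strip_output_suffix` and `find_missing_variables` are hypothetical helpers, the variable names are made up for illustration, and it assumes that model variable names carry a trailing `:0` while checkpoint keys do not.

```python
def strip_output_suffix(name):
    """Drop a trailing ':<digits>' suffix such as ':0', if present."""
    base, sep, index = name.rpartition(":")
    return base if sep and index.isdigit() else name


def find_missing_variables(model_var_names, ckpt_var_to_shape):
    """Return the model variable names that have no entry in the checkpoint."""
    ckpt_names = set(ckpt_var_to_shape)
    return sorted(
        name for name in model_var_names
        if strip_output_suffix(name) not in ckpt_names
    )


# Hypothetical variable names, for illustration only.
model_var_names = [
    "stem/conv/kernel:0",
    "stem/bn/gamma:0",
    "stem/bn/moving_mean:0",
    "stem/bn/moving_variance:0",
]

# A checkpoint without BN statistics, and one that includes them.
ckpt_without_stats = {"stem/conv/kernel": [3, 3, 3, 64], "stem/bn/gamma": [64]}
ckpt_with_stats = dict(ckpt_without_stats)
ckpt_with_stats["stem/bn/moving_mean"] = [64]
ckpt_with_stats["stem/bn/moving_variance"] = [64]

missing_before = find_missing_variables(model_var_names, ckpt_without_stats)
missing_after = find_missing_variables(model_var_names, ckpt_with_stats)

print(missing_before)
print(missing_after)
```

In a real run, the name list would come from 'moat.variables' and the mapping from the checkpoint's variable-to-shape map. An empty result only shows that every name was found; it does not prove that the values were assigned to the right layers.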

Etty-Cohen commented 1 year ago

@Chenglin-Yang @wyw1993121 @edwardyehuang, could you upload an example of code for fine-tuning the MOAT model?