xlang-ai / instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Bug when evaluating with a retrained model: "Some weights of the model checkpoint at /checkpoint-22000 were not used when initializing T5EncoderModel" #82

Closed qiuwenbogdut closed 6 months ago

qiuwenbogdut commented 10 months ago

Thanks for open-sourcing this model! However, I now get an error and cannot evaluate with my retrained model.

train script:

python train.py --model_name_or_path "/sentence-transformers/gtr-t5-large" --output_dir "train_output" --cache_dir "medi-data" --max_source_length 512 --num_train_epochs 2 --save_steps 500 --cl_temperature 0.01 --warmup_ratio 0.1 --learning_rate 2e-5 --per_device_train_batch_size 4 --gradient_accumulation_steps 2 --preprocessing_num_workers 20 --dataloader_num_workers 50

evaluate script:

import os
import logging
import argparse
from mteb import MTEB
from InstructorEmbedding import INSTRUCTOR

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_name', default="/checkpoint-22000", type=str)
    parser.add_argument('--output_dir', default="/outputs", type=str)
    parser.add_argument('--task_name', default="ArguAna", type=str)
    parser.add_argument('--cache_dir', default=None, type=str)
    parser.add_argument('--result_file', default="results", type=str)
    parser.add_argument('--prompt', default=None, type=str)
    parser.add_argument('--split', default='test', type=str)
    parser.add_argument('--batch_size', default=128, type=int)
    args = parser.parse_args()

    if not args.result_file.endswith('.txt') and not os.path.isdir(args.result_file):
        os.makedirs(args.result_file, exist_ok=True)

    # from tqdm import tqdm
    # from functools import partialmethod
    #
    # tqdm.__init__ = partialmethod(tqdm.__init__, disable=True)
    model = INSTRUCTOR(model_name_or_path=args.model_name, cache_folder=args.cache_dir)

    evaluation = MTEB(tasks=[args.task_name], task_langs=["en"])
    evaluation.run(model, output_folder=args.output_dir, eval_splits=[args.split], args=args)

    print("--DONE--")

output

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: /container_data/qiuwenbo/instructor-embedding/download_weight/self-train-model/checkpoint-22000
WARNING:sentence_transformers.SentenceTransformer:No sentence-transformers model found with name /container_data/qiuwenbo/instructor-embedding/download_weight/self-train-model/checkpoint-22000. Creating a new one with MEAN pooling.
Some weights of the model checkpoint at /container_data/qiuwenbo/instructor-embedding/download_weight/self-train-model/checkpoint-22000 were not used when initializing T5EncoderModel: ['0.auto_model.encoder.block.5.layer.1.DenseReluDense.wi.weight', '0.auto_model.encoder.block.1.layer.0.layer_norm.weight', '0.auto_model.encoder.block.23.layer.0.SelfAttention.v.weight', '0.auto_model.encoder.block.17.layer.0.SelfAttention.q.weight', '0.auto_model.encoder.block.12.layer.1.DenseReluDense.wo.weight', ...
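The "Creating a new one with MEAN pooling" warning suggests the directory is a raw trainer checkpoint rather than a sentence-transformers-format save. A minimal check, assuming the checkpoint path from the log above (sentence-transformers falls back to the mean-pooling path when it does not find a modules.json in the directory):

import os

ckpt_dir = '/container_data/qiuwenbo/instructor-embedding/download_weight/self-train-model/checkpoint-22000'

# A sentence-transformers-format save contains modules.json plus module
# subfolders (e.g. a pooling config); a bare HF Trainer checkpoint does not.
print('modules.json present:', os.path.exists(os.path.join(ckpt_dir, 'modules.json')))
print('files in checkpoint dir:', sorted(os.listdir(ckpt_dir)))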

hongjin-su commented 10 months ago

Could you compare the state dicts of INSTRUCTOR and your re-trained model and check how the keys differ?
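A minimal sketch of such a comparison, assuming the published hkunlp/instructor-large model as the reference, that its first module exposes the HF encoder as auto_model, and the /checkpoint-22000 path from the scripts above:

import torch
from InstructorEmbedding import INSTRUCTOR

# Keys of the HF encoder inside the published INSTRUCTOR model.
reference = INSTRUCTOR('hkunlp/instructor-large')
ref_keys = set(reference[0].auto_model.state_dict().keys())

# Keys of the raw checkpoint written by the trainer.
ckpt = torch.load('/checkpoint-22000/pytorch_model.bin', map_location='cpu')
ckpt_keys = set(ckpt.keys())

print('only in checkpoint:', sorted(ckpt_keys - ref_keys)[:5])
print('only in reference :', sorted(ref_keys - ckpt_keys)[:5])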

qiuwenbogdut commented 9 months ago

For example:
INSTRUCTOR: encoder.block.1.layer.0.layer_norm.weight
trained by myself: 0.auto_model.encoder.block.1.layer.0.layer_norm.weight

I tried the following and it solved the problem:

import torch

# Load the raw checkpoint written by the trainer.
self_weight = torch.load('pytorch_model.bin')

# Strip the "0.auto_model." prefix so the keys match what T5EncoderModel expects.
for key in list(self_weight.keys()):
    if key.startswith('0.auto_model.'):
        self_weight[key[len('0.auto_model.'):]] = self_weight[key]
        del self_weight[key]

# Overwrite the checkpoint with the renamed keys.
torch.save(self_weight, 'pytorch_model.bin')
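For context, a minimal sketch of where the 0.auto_model. prefix comes from (using the published hkunlp/instructor-large model as an example, and assuming its first module exposes the HF encoder as auto_model): an INSTRUCTOR model is a SentenceTransformer-style Sequential, so the state dict of the whole model carries the module-index prefix while the inner encoder's state dict does not.

from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR('hkunlp/instructor-large')
# Keys of the full Sequential carry the prefix, e.g. "0.auto_model.shared.weight".
print(list(model.state_dict().keys())[:3])
# Keys of the inner HF encoder do not, e.g. "shared.weight".
print(list(model[0].auto_model.state_dict().keys())[:3])

Note that the renaming snippet above overwrites pytorch_model.bin in place, so it may be worth keeping a backup of the original file.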

hongjin-su commented 9 months ago

If you have removed the prefix 0.auto_model., why does the unused weight key still contain it?

qiuwenbogdut commented 9 months ago

> If you have removed the prefix 0.auto_model., why does the unused weight key still contain it?

Once the prefix 0.auto_model. is removed, the weights load correctly and the retrained model can be evaluated.
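A quick verification sketch, reusing the /checkpoint-22000 path from above: after rewriting the keys, loading the checkpoint should no longer report unused weights, and it can be used for encoding and evaluation.

from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR('/checkpoint-22000')
# INSTRUCTOR expects [instruction, sentence] pairs as input.
emb = model.encode([['Represent the sentence for retrieval:', 'hello world']])
print(emb.shape)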

hongjin-su commented 6 months ago

Feel free to re-open the issue if you have any further questions or comments!