princeton-nlp / SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
MIT License

training based on a local model #182

Closed DanHuang752 closed 2 years ago

DanHuang752 commented 2 years ago

Hi, thanks for the great work! Does the training only support the BERT-based or RoBERTa-based models that are available on Hugging Face, or is it possible to train a local model? If yes, what are the requirements for the local model, e.g., does it need to be a subclass of PreTrainedModel? Thank you in advance.

I would like to use the script run_sup_example.sh to train a local model, so I set model_name_or_path to the path of the local model, but I ran into the following error:

[INFO|tokenization_utils_base.py:1766] 2022-06-09 07:56:05,789 >> loading file https://huggingface.co/bert-base-uncased/resolve/main/vocab.txt from cache at /root/.cache/huggingface/transformers/45c3f7a79a80e1cf0a489e5c62b43f173c15db47864303a55d623bb3c96f72a5.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99
[INFO|tokenization_utils_base.py:1766] 2022-06-09 07:56:05,790 >> loading file https://huggingface.co/bert-base-uncased/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/534479488c54aeaf9c3406f647aa2ec13648c06771ffe269edabebd4c412da1d.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4
[INFO|modeling_utils.py:1025] 2022-06-09 07:56:05,807 >> loading weights file bert-sparsemax/pytorch_model.bin
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1038, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 594, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/serialization.py", line 853, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'MyEnsemble' on <module '__main__' from 'train.py'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 585, in <module>
    main()
  File "train.py", line 368, in main
    model_args=model_args
  File "/usr/local/lib/python3.7/dist-packages/transformers/modeling_utils.py", line 1041, in from_pretrained
    f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
OSError: Unable to load weights from pytorch checkpoint file for 'bert-sparsemax' at 'bert-sparsemax/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

Here is the script I used:

python train.py \
    --model_name_or_path bert-sparsemax \
    --tokenizer_name bert-base-uncased \
    --train_file /content/drive/MyDrive/SimCSE-main/c5_ep10.csv \
    --output_dir /content/drive/MyDrive/SimCSE-main/result/c5_ep10_sparsemax \
    --num_train_epochs 3 \
    --per_device_train_batch_size 128 \
    --learning_rate 5e-5 \
    --max_seq_length 32 \
    --evaluation_strategy steps \
    --metric_for_best_model stsb_spearman \
    --load_best_model_at_end \
    --eval_steps 125 \
    --pooler_type cls \
    --overwrite_output_dir \
    --temp 0.05 \
    --do_train \
    --do_eval \
    --fp16 \
    "$@"

gaotianyu1350 commented 2 years ago

Hi,

Do you mean using a local checkpoint? You can certainly do that; loading from a local folder is supported by the transformers package. Just download your model and point --model_name_or_path to the local model folder.
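
For reference, a minimal sketch (not the thread author's setup) of how a local checkpoint can be saved in the layout that from_pretrained(), and therefore --model_name_or_path, expects: a folder containing config.json, pytorch_model.bin, and the tokenizer files. The model here is plain bert-base-uncased and the folder name "my-local-bert" is a placeholder, not the bert-sparsemax checkpoint from this thread.

# Minimal sketch: save a Hugging Face model in the standard folder layout
# so it can later be loaded with --model_name_or_path <folder>.
# Assumption: the model is a transformers PreTrainedModel (here, bert-base-uncased);
# "my-local-bert" is a hypothetical output directory.
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Writes config.json and pytorch_model.bin (plus tokenizer files) into the folder.
model.save_pretrained("my-local-bert")
tokenizer.save_pretrained("my-local-bert")

# The training script can then be pointed at the local folder, e.g.:
#   python train.py --model_name_or_path my-local-bert ...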

DanHuang752 commented 2 years ago

Thank you!