Closed by lankuohsing 7 months ago
Hi, sorry for the confusion. The code in models.py (lines 262-263) only affects the validation process during training, as shown in trainer.py. To make sure the MLP transformation is not applied when using evaluation.py, set the pooler to 'cls_before_pooler' in the evaluation script, as opposed to 'cls' in the training script.
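To illustrate the difference between the two pooler settings, here is a minimal sketch. It is not the code from evaluation.py: the function name `select_embedding` and the list-based toy inputs are assumptions made for illustration, and a real run would operate on model output tensors.

```python
def select_embedding(pooler, last_hidden, pooler_output):
    """Pick the sentence embeddings for one batch (hypothetical helper).

    last_hidden: list of sequences, each a list of token vectors ([CLS] first).
    pooler_output: list of vectors, i.e. [CLS] after the MLP ("pooler") layer.
    """
    if pooler == "cls":
        # [CLS] representation after the MLP transformation.
        return pooler_output
    if pooler == "cls_before_pooler":
        # Raw [CLS] hidden state, with no MLP applied.
        return [sequence[0] for sequence in last_hidden]
    raise ValueError(f"unsupported pooler: {pooler}")


# Toy data: one sentence with two tokens of dimension 2.
last_hidden = [[[1.0, 2.0], [3.0, 4.0]]]
pooler_output = [[0.5, -0.5]]  # stands in for MLP([CLS])

with_mlp = select_embedding("cls", last_hidden, pooler_output)
without_mlp = select_embedding("cls_before_pooler", last_hidden, pooler_output)
```

With 'cls' the MLP output is returned; with 'cls_before_pooler' the first token's hidden state is returned unchanged.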
thanks!
I noticed this suggestion in the README: --mlp_only_train: We have found that for unsupervised SimCSE, it works better to train the model with MLP layer but test the model without it. You should use this argument when training unsupervised SimCSE models.
Given pooler_type=='cls' and mlp_only_train==True, the embedding used for testing during unsupervised training will not include the MLP transformation, as indicated by the code in models.py (lines 262-263).
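The behavior described here can be sketched roughly as follows. This is not the actual code from models.py: the function `training_time_embedding`, the `sent_emb` flag, and the toy `mlp` are hypothetical names chosen to illustrate when the MLP is skipped.

```python
def training_time_embedding(cls_hidden, mlp, pooler_type, mlp_only_train, sent_emb):
    """Return the [CLS] embedding, applying the MLP only where described.

    sent_emb=True stands for the embedding used for validation during
    training; sent_emb=False stands for the training forward pass.
    """
    if pooler_type != "cls":
        # Other pooler types are assumed here not to use the MLP.
        return cls_hidden
    if sent_emb and mlp_only_train:
        # Validation with mlp_only_train: skip the MLP.
        return cls_hidden
    return mlp(cls_hidden)


def toy_mlp(vector):
    # Stand-in for the MLP layer: doubles every component.
    return [2 * x for x in vector]


cls_hidden = [1.0, -3.0]
train_emb = training_time_embedding(cls_hidden, toy_mlp, "cls", True, sent_emb=False)
valid_emb = training_time_embedding(cls_hidden, toy_mlp, "cls", True, sent_emb=True)
```

In this sketch the training pass goes through the MLP while the validation embedding is the untouched [CLS] vector.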
However, if I test my model (saved after unsupervised training and converted to a Hugging Face checkpoint with simcse_to_huggingface.py) using evaluation.py, the embedding will include the MLP transformation (given pooler_type=='cls'), as indicated by the code in evaluation.py (lines 119 to 122).
The pooler_output includes the MLP transformation because 'mlp' has been renamed to 'pooler' in simcse_to_huggingface.py.
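A minimal sketch of what such a renaming step could look like is given below. It is not the content of simcse_to_huggingface.py: the helper `rename_mlp_to_pooler` and the key names in the toy state dict are assumptions for illustration only.

```python
def rename_mlp_to_pooler(state_dict):
    """Return a copy of state_dict with 'mlp' replaced by 'pooler' in every key."""
    return {key.replace("mlp", "pooler"): value for key, value in state_dict.items()}


# Hypothetical checkpoint keys; the values stand in for weight tensors.
simcse_state = {
    "mlp.dense.weight": "W",
    "mlp.dense.bias": "b",
    "encoder.layer.0.output.dense.weight": "E",
}
converted_state = rename_mlp_to_pooler(simcse_state)
```

After a rename like this, a model that loads the checkpoint would treat the trained MLP weights as its pooler, which is why pooler_output carries the MLP transformation.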
Why are different embeddings used for testing during unsupervised training and for formal evaluation?