HUSTHY opened this issue 5 years ago
Hi @HUSTHY, thanks for pointing this out. The LabelAccuracyEvaluator needs access to the SoftmaxLoss model in order to compute the labels (for example, for the NLI task).
See this commit for how the LabelAccuracyEvaluator must be changed (you get the file if you check out the v0.2.4 branch): https://github.com/UKPLab/sentence-transformers/commit/638d3703bfe7353b2a9e04bacbef6b81d4a7618c
If you want to use the LabelAccuracyEvaluator, your code must look like this (for example, on the NLI dataset):
logging.info("Read AllNLI train dataset")
train_data = SentencesDataset(nli_reader.get_examples('train.gz', 1000), model=model)
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=batch_size)
train_loss = losses.SoftmaxLoss(model=model, sentence_embedding_dimension=model.get_sentence_embedding_dimension(), num_labels=train_num_labels)
logging.info("Read STSbenchmark dev dataset")
dev_data = SentencesDataset(examples=nli_reader.get_examples('dev.gz'), model=model)
dev_dataloader = DataLoader(dev_data, shuffle=False, batch_size=batch_size)
evaluator = LabelAccuracyEvaluator(dev_dataloader, softmax_model = train_loss)
Best
Nils Reimers
Hi @HUSTHY, note that this framework is not optimal for pairwise sentence classification. It uses a bi-encoder, i.e., sentences are mapped independently to sentence embeddings. For classification, the classifier takes these two embeddings and derives a label.
BERT, on the other hand, uses a cross-encoder: both sentences are present at input time, so BERT can compare the two inputs to derive the labels. This gives much better classification results. The disadvantage of the BERT cross-encoder is that you do not get sentence embeddings, which you need, for example, for clustering, semantic search, etc.
If you do pairwise classification, like NLI, BERT would be the better choice.
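For readers who want to try the cross-encoder setup described above, here is a minimal sketch using the Hugging Face transformers library (this is not part of sentence-transformers; the model name and the three NLI labels are placeholders, and the classification head only produces sensible labels after fine-tuning):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)

# Both sentences enter BERT together as one input pair, so self-attention
# can compare them token by token (the cross-encoder advantage).
inputs = tokenizer('A man is eating food.', 'A man is eating a meal.', return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits   # shape: (1, 3)
label_id = int(logits.argmax(dim=-1))
```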
I got it. Thanks for your suggestion.
I found a bug in LabelAccuracyEvaluator.py in the v0.2.4 branch:

`_, prediction = self.softmax_model(features, labels=None)`

It should be fixed like this:

`_, prediction = self.softmax_model.to(self.device)(features, labels=None)`
And I have a question. When I use model.fit() and then use the LabelAccuracyEvaluator, the accuracy is 0.915. However, when I load the saved fine-tuned model and only use the LabelAccuracyEvaluator to evaluate the same data, the accuracy is 0.51. Is there something wrong? My code looks like this:
```python
batch_size = 16
nli_reader = LCQMCDataReader('datasets/patentData')
model_save_path = 'output/training_patent_sbert-Chinese-BERT-wwm2019-09-23_13-11-58_with_15K_Trains'
word_embedding_model = models.BERT('output/training_patent_sbert-Chinese-BERT-wwm2019-09-23_14-50-35_with_15K_Trains/0_BERT')

# Apply mean pooling to get one fixed sized sentence vector
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=False)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
train_loss = losses.SoftmaxLoss(model=model, sentence_embedding_dimension=model.get_sentence_embedding_dimension(), num_labels=2)

test_data = SentencesDataset(examples=nli_reader.get_examples("test.csv"), model=model)
test_dataloader = DataLoader(test_data, shuffle=False, batch_size=batch_size)

evaluator = LabelAccuracyEvaluator(test_dataloader, model_save_path)
model.evaluate(evaluator)
```
Hi @HUSTHY, the model saves only the layers that are responsible for producing sentence embeddings (which is the main purpose of this framework).
The SoftmaxLoss module is a softmax classifier with trainable weights. These weights are not stored by default, i.e., when you call train_loss = losses.SoftmaxLoss(...),
a new softmax classifier is initialized with random weights. This new softmax classifier produces no sensible labels; therefore you get a low accuracy when you load the model.
Solution: you would need to save the train_loss to disk as well and then load it from disk. You would need to use the standard PyTorch save / load functions to store and load the SoftmaxLoss.
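A minimal sketch of that save/load step with the standard PyTorch functions (the file name softmax_loss.pt is my own choice, and this uses the state_dict variant rather than pickling the whole module):

```python
import torch
from sentence_transformers import SentenceTransformer, losses

# After model.fit(...): save the classifier weights alongside the sentence encoder.
model.save(model_save_path)
torch.save(train_loss.state_dict(), model_save_path + '/softmax_loss.pt')

# Later: reload the encoder, rebuild the loss around it, and restore the weights.
model = SentenceTransformer(model_save_path)
train_loss = losses.SoftmaxLoss(model=model,
                                sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
                                num_labels=train_num_labels)
train_loss.load_state_dict(torch.load(model_save_path + '/softmax_loss.pt'))
```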
Best regards
Nils Reimers
Thanks for your explanation and solution! It suddenly all makes sense.
Maybe you should also correct the code in LabelAccuracyEvaluator.py's __init__().
黄洋
Zarmeen wrote:
Hi @HUSTHY and @nreimers, I installed the package from source and made the suggested changes in the LabelAccuracyEvaluator code, but I am still getting this error.
Shuhuai Ren asked:
@HUSTHY Hi~ I'm curious how you trained this model. Did you use model.fit() with model = SentenceTransformer(modules=[word_embedding_model, pooling_model])? If so, how can you update the parameters in train_loss.classifier? It is an nn.Linear model with its own parameters, and I think these parameters cannot be updated by model.fit... How can you get an accuracy of 0.915? Looking forward to your reply, thanks very much~

I reviewed this code. Using model.fit() with model = SentenceTransformer(modules=[word_embedding_model, pooling_model]) does work; the parameters in train_loss.classifier are updated. In SoftmaxLoss.py, the class SoftmaxLoss(nn.Module) defines classifier = nn.Linear(...). When you call model.fit(train_objectives=[(train_dataloader, train_loss)], ...), you can see in fit() how those parameters are trained. If you still do not understand, you had better get an explanation from the authors.
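To make this concrete: SoftmaxLoss is itself an nn.Module that contains both the SentenceTransformer and the nn.Linear classifier, so an optimizer built from its parameters updates both. A simplified sketch of what happens inside fit() (this is not the actual fit() code; smart batching, device placement, and the learning-rate schedule are omitted):

```python
import torch

loss_model = train_loss  # SoftmaxLoss: its submodules are the encoder and the nn.Linear classifier
optimizer = torch.optim.AdamW(loss_model.parameters(), lr=2e-5)  # includes the classifier weights

for features, labels in train_dataloader:
    loss_value = loss_model(features, labels)  # SoftmaxLoss returns the cross-entropy loss during training
    loss_value.backward()                      # gradients reach both the encoder and the classifier
    optimizer.step()
    optimizer.zero_grad()
```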
Hi @nreimers,
I am having the following issue when trying to use the LabelAccuracyEvaluator. I am not sure if it has something to do with the previously reported issues.

```
Traceback (most recent call last):
  File "D:/PycharmProjects/BERT NLP Tests/AllNLI/SentenceTransformers Fine-tuning - PT_BR V0.py", line 208, in
```
Here's the part of the code that may be related to the command with the issue:

```python
...
train_dataset = SentencesDataset(train_samples, model=model)
train_dataloader = DataLoader(train_dataset, shuffle=True, batch_size=train_batch_size)
train_loss = losses.SoftmaxLoss(model=model, sentence_embedding_dimension=model.get_sentence_embedding_dimension(), num_labels=len(label2int))
...
for index, row in reader.iterrows():
    label_id = label2int[row['entailment']]
    test_samples.append(InputExample(texts=[row['sentence1'], row['sentence2']], label=label_id))

model = SentenceTransformer(model_save_path)

test_evaluator = LabelAccuracyEvaluator.from_input_examples(test_samples, batch_size=train_batch_size, name='assin-nli-test')

dev_dataloader = DataLoader(dev_dataset, shuffle=False, batch_size=train_batch_size)
dev_evaluator = LabelAccuracyEvaluator(dev_dataloader, softmax_model=train_loss)
...
```
Any ideas about its cause?
Thank you in advance.
Try to use the most recent version of this evaluator: https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/evaluation/LabelAccuracyEvaluator.py
It is not yet part of a release.
Perfect @nreimers, it solved the problem. Thank you so much!
Sorry @nreimers, I've managed to get accuracy from the train and dev datasets, but I am having an issue getting it from the test dataset. I am trying to follow the pattern used for dev, but I may be missing something... Here's the error and the code:
```
Traceback (most recent call last):
  File "D:/PycharmProjects/BERT NLP Tests/AllNLI/SentenceTransformers Fine-tuning - PT_BR V0.py", line 246, in
```

```python
...
for index, row in reader.iterrows():
    label_id = label2int[row['entailment']]
    test_samples.append(InputExample(texts=[row['sentence1'], row['sentence2']], label=label_id))

test_dataset = SentencesDataset(examples=test_samples, model=model)
test_dataloader = DataLoader(test_dataset, shuffle=False, batch_size=train_batch_size)

test_evaluator = LabelAccuracyEvaluator(test_dataloader, model_save_path)
model.evaluate(test_evaluator)
...
```
Thank you in advance.
The softmax model is not part of what is stored when the model is trained.
If you want to use it, you must store it yourself, load it, and pass it to the LabelAccuracyEvaluator when you run it on the test set.
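For illustration, assuming the SoftmaxLoss weights were saved with torch.save(train_loss.state_dict(), ...) as sketched earlier in this thread (the file name is a placeholder), the test-time evaluation could look like this:

```python
import torch

# Restore the stored softmax head and hand it to the evaluator for the test set.
train_loss.load_state_dict(torch.load(model_save_path + '/softmax_loss.pt'))
test_evaluator = LabelAccuracyEvaluator(test_dataloader, name='test', softmax_model=train_loss)
model.evaluate(test_evaluator)
```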
Thank you!
I was reading other posts related to the subject, and I may not need the LabelAccuracyEvaluator accuracy at test time.
In fact, I need to fine-tune bert-base-multilingual-cased (and a Brazilian Portuguese version of it) to improve its sentence embeddings for a sentence textual similarity task (finding similar short texts in a text vector). I have already tested the vanilla embeddings, but they aren't good enough for my purpose.
Therefore I am labeling 1,000 sentence pairs with either similar or dissimilar labels (0 or 1) and intend to fine-tune the vanilla model (adjusted for sentence embeddings) with this dataset in order to check its accuracy and improve the BERT embeddings further.
Maybe the CosineSimilarityLoss function and the BinaryClassificationEvaluator class can solve my problem if I split the dataset into train, dev, and test and run it pretty much as the Training Overview tutorial states, as in the sketch below. Does this strategy make sense?
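Here is roughly what I have in mind, following the Training Overview tutorial (a sketch only; the example pairs, the dev split, and the hyperparameters are placeholders, and newer library versions build the Transformer + mean-pooling stack automatically from the Hugging Face model name):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, SentencesDataset, losses, InputExample
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer('bert-base-multilingual-cased')  # vanilla model with mean pooling

# Labeled pairs: 1 = similar, 0 = dissimilar (placeholders for the real 1,000-pair dataset).
train_examples = [
    InputExample(texts=['Como redefinir minha senha?', 'Esqueci minha senha.'], label=1.0),
    InputExample(texts=['Como redefinir minha senha?', 'Qual o horário do voo?'], label=0.0),
]
dev_examples = train_examples  # placeholder: use the held-out dev split here

train_dataloader = DataLoader(SentencesDataset(train_examples, model), shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model=model)  # pushes cosine similarity toward the 0/1 label

dev_evaluator = BinaryClassificationEvaluator.from_input_examples(dev_examples, name='dev')

model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=dev_evaluator,
          epochs=4,
          warmup_steps=100)
```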
Thank you in advance for your assistance.
The code at line 53 of LabelAccuracyEvaluator.py, `_, prediction = model(features[0])`, does not work. When I run this code, an error occurs.