UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Print loss when training on TSDAE #1389

Open polarcrow opened 2 years ago

polarcrow commented 2 years ago

Hi,

Is there a way to print the training loss as well as the evaluation loss when training on the TSDAE task?

Have a nice day and thanks a lot for the amazing repository.

nreimers commented 2 years ago

Hi, maybe @kwang2049 could help here

kwang2049 commented 2 years ago

Hi @polarcrow,

Thanks for your attention!

To obtain the loss value, one can build the features for the sentences and call loss_objective.forward to compute the loss. For example, one can do this:

from sentence_transformers import SentenceTransformer, LoggingHandler
from sentence_transformers import models, util, datasets, evaluation, losses
import torch

# Define your sentence transformer model using CLS pooling
model_name = 'bert-base-uncased'
word_embedding_model = models.Transformer(model_name)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), 'cls')
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Define a list with sentences (1k - 100k sentences)
train_sentences = ["Your set of sentences",
                   "Model will automatically add the noise", 
                   "And re-construct it",
                   "You should provide at least 1k sentences"]

# Build damaged inputs (i.e. the labels for TSDAE)
train_sentences_damaged = list(map(datasets.DenoisingAutoEncoderDataset.delete, train_sentences))

# Build features
features = [model(model.tokenize(train_sentences)), model(model.tokenize(train_sentences_damaged))]

train_loss = losses.DenoisingAutoEncoderLoss(model, decoder_name_or_path=model_name, tie_encoder_decoder=True)
print(train_loss(features, None))

Note that the decoder (especially the cross_attention parameters used for seq2seq modeling) is not saved in the SBERT checkpoint folder, so one cannot load a trained TSDAE checkpoint and compute the loss from it.

To work around this and get the evaluation loss, one can either (1) run the same snippet directly on the evaluation data right after training, while everything is still in memory, or (2) save and load the train_loss objective with torch's serialization support and then run the same snippet again.
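
A minimal sketch of option (2), under a few assumptions: the loss objective is an ordinary torch.nn.Module, so its state_dict() covers the decoder weights as well as the encoder, and the helper names save_loss_objective, load_loss_objective and evaluation_loss are made up for this illustration. The loss is put in eval mode and run without gradients, because dropout left in training mode would make repeated evaluations differ.

```python
import torch


def save_loss_objective(loss_model: torch.nn.Module, path: str) -> None:
    """Store every parameter held by the loss objective (hypothetical helper)."""
    torch.save(loss_model.state_dict(), path)


def load_loss_objective(loss_model: torch.nn.Module, path: str) -> torch.nn.Module:
    """Copy saved weights into a freshly constructed loss objective."""
    state = torch.load(path, map_location="cpu")
    loss_model.load_state_dict(state)
    return loss_model.eval()  # eval() turns dropout off and returns the module


@torch.no_grad()
def evaluation_loss(loss_model: torch.nn.Module, features) -> float:
    """Compute the loss once, without gradients and with dropout disabled."""
    loss_model.eval()
    return float(loss_model(features, None))
```

With the objects from the snippet in this comment, one would call save_loss_objective(train_loss, "tsdae_loss.pt") after training (the file name is arbitrary), then later construct a new DenoisingAutoEncoderLoss with the same arguments and pass it to load_loss_objective.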

Permafacture commented 1 year ago

It doesn't seem like this really answers the question. It would be helpful to see the training and/or validation loss throughout the fitting process. An easy solution might be an evaluator that wraps any loss. Then all the evaluators and losses you want to see could be added to a single evaluation.SequentialEvaluator.
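
The evaluator idea might look roughly like the following. This is only a sketch: LossEvaluator is a hypothetical class, not part of the library. It assumes evaluators are called as evaluator(model, output_path, epoch, steps) and that a higher return value means a better model, so the mean loss is negated. The batches are assumed to be pre-built (features, labels) pairs in whatever form the wrapped loss expects.

```python
import logging

import torch

logger = logging.getLogger(__name__)


class LossEvaluator:
    """Hypothetical evaluator that reports the mean of a loss over fixed batches."""

    def __init__(self, loss_model: torch.nn.Module, batches, name: str = "eval"):
        self.loss_model = loss_model
        self.batches = list(batches)  # pre-built (features, labels) pairs
        self.name = name

    def __call__(self, model, output_path=None, epoch=-1, steps=-1) -> float:
        was_training = self.loss_model.training
        self.loss_model.eval()  # disable dropout while measuring
        total = 0.0
        with torch.no_grad():
            for features, labels in self.batches:
                total += float(self.loss_model(features, labels))
        if was_training:
            self.loss_model.train()  # restore the mode the training loop expects
        mean_loss = total / max(len(self.batches), 1)
        logger.info("%s loss after epoch %s, step %s: %.4f",
                    self.name, epoch, steps, mean_loss)
        return -mean_loss  # negated so that "higher is better" still holds
```

An instance could then be passed on its own as the evaluator argument of model.fit, or placed in the list given to evaluation.SequentialEvaluator next to other evaluators.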

yordanovn commented 1 year ago

Hey @kwang2049, in your example you're creating the features list as:

features = [model(model.tokenize(train_sentences)), model(model.tokenize(train_sentences_damaged))]

However, in DenoisingAutoEncoderLoss I see that the features are unpacked as:

source_features, target_features = tuple(sentence_features)

Doesn't that mean that in your example you should build the list in reverse order?

features = [model(model.tokenize(train_sentences_damaged)), model(model.tokenize(train_sentences))]
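
If the unpacking quoted here is right, the damaged sentences belong in the first slot and the originals in the second. A small sketch of a helper that makes the order explicit follows; build_tsdae_features is a hypothetical name. It passes tokenized batches straight through, on the assumption that the loss runs the encoder on the source features itself. If that assumption does not hold, each element would need to be wrapped in model(...) as in the original snippet.

```python
def build_tsdae_features(model, sentences, noise_fn):
    """Return [source_features, target_features] in the order the loss unpacks them.

    model    : any object with a tokenize(list_of_str) method
    noise_fn : callable that damages one sentence, e.g. a word-deletion function
    """
    damaged = [noise_fn(sentence) for sentence in sentences]
    source_features = model.tokenize(damaged)    # noisy input for the encoder
    target_features = model.tokenize(sentences)  # original text to reconstruct
    return [source_features, target_features]
```

With the names from the thread, the call would be build_tsdae_features(model, train_sentences, datasets.DenoisingAutoEncoderDataset.delete).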

netapy commented 1 year ago

Any other insights on how to print the loss during the fitting process? Thanks

kwang2049 commented 1 year ago

> Any other insights on how to print the loss during the fitting process? Thanks

I think one can subclass DenoisingAutoEncoderLoss and override its forward function, adding a line that logs the loss before it is returned.

netapy commented 1 year ago

> > Any other insights on how to print the loss during the fitting process? Thanks
>
> I think one can subclass DenoisingAutoEncoderLoss and override its forward function, adding a line that logs the loss before it is returned.

Oh right, absolutely, thanks!

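import logging

from sentence_transformers import losses

logging.basicConfig(level=logging.INFO)  # make INFO-level messages visible


class LoggingDenoisingAutoEncoderLoss(losses.DenoisingAutoEncoderLoss):
    def forward(self, sentence_features, labels):
        loss_value = super().forward(sentence_features, labels)
        # Log the scalar loss on every forward pass
        logging.info(f'Loss: {loss_value.item()}')
        return loss_value


# model and model_name are defined as in the earlier snippet in this thread
train_loss = LoggingDenoisingAutoEncoderLoss(model, decoder_name_or_path=model_name, tie_encoder_decoder=True)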

This seems to do it just fine.
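
Logging on every forward pass can get noisy on long runs. One possible refinement, sketched here with a hypothetical LossTracker helper that is not part of any library, is to collect the values and report their mean every few steps.

```python
import logging

logger = logging.getLogger(__name__)


class LossTracker:
    """Hypothetical helper: collect loss values and log their mean every `every` steps."""

    def __init__(self, every: int = 100):
        if every < 1:
            raise ValueError("every must be at least 1")
        self.every = every
        self.step = 0
        self._window = []

    def update(self, value: float):
        """Record one loss value; return the window mean when a report is due, else None."""
        self.step += 1
        self._window.append(float(value))
        if self.step % self.every != 0:
            return None
        mean = sum(self._window) / len(self._window)
        self._window.clear()
        logger.info("step %d: mean loss over last %d steps = %.4f",
                    self.step, self.every, mean)
        return mean
```

Inside the overridden forward, the logging line would then become something like self.tracker.update(loss_value.item()), with self.tracker created in the subclass constructor.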

MikolajJedrzejewski commented 8 months ago

> To solve this and get the evaluation loss, one can either (1) do the same thing above in the code snippet directly on the evaluation data after training (with everything still in the memory); (2) save & load the train_loss objective with torch's support and do the same thing again.

Hello, do you have details or specifics on how to save and load the train_loss objective, or could you explain what you mean by that? I tried saving the entire object with pickle.dump() and with torch.save(), but without success: after loading, the object still produces different loss results on the same validation set. I have no idea what the reason could be...

import pickle

# First, I save the training loss object after training
model.fit()  # training arguments omitted
with open("model_train_loss.pkl", "wb") as f:
    pickle.dump(train_loss, f)

And then I try to use it like this:

with open("model_train_loss.pkl", "rb") as f:
    train_loss = pickle.load(f)
print(train_loss(features, None))

But, as I said before, this gives different results each time.
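
One possible explanation, not confirmed anywhere in this thread, is randomness at evaluation time rather than a problem with saving: the deletion noise is random, so rebuilding the damaged sentences before each evaluation changes the input, and dropout stays active unless the loss module is switched to eval mode. The sketch here covers only the first point. delete_words is a simplified stand-in for the library's noise function, build_fixed_noisy_set is a hypothetical helper, and the 0.6 ratio is just an illustrative default.

```python
import random


def delete_words(sentence: str, ratio: float, rng: random.Random) -> str:
    """Stand-in noise function: drop each whitespace-separated word with probability `ratio`."""
    words = sentence.split()
    kept = [w for w in words if rng.random() >= ratio]
    if not kept and words:
        kept = [rng.choice(words)]  # never return an empty sentence
    return " ".join(kept)


def build_fixed_noisy_set(sentences, ratio: float = 0.6, seed: int = 0):
    """Damage every sentence once with a seeded generator so the pairs can be reused."""
    rng = random.Random(seed)
    return [(delete_words(s, ratio, rng), s) for s in sentences]
```

Building the (damaged, original) pairs once and reusing them for every evaluation removes this source of variation, so any remaining difference would point to the model state instead.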