huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
135.1k stars 27.04k forks

Log multiple losses used along with the combined losses when a model returns a dictionary of losses. #31081

Open NikhilMank opened 5 months ago

NikhilMank commented 5 months ago

Feature request

Ability to log the individual losses when a model returns them as a dict.

Motivation

I have multiple losses that are added together to form a combined loss, and I want to log each of them individually to observe their trends. At the moment, SemanticSegmenterOutput accepts only a single loss, and only that combined loss gets logged.

Your contribution

I have modified the Trainer class and SemanticSegmenterOutput as below, but it is not working as expected. I added a few print statements to check whether the on_log part is being reached, but that code is never executed.

```python
from dataclasses import dataclass
from typing import Dict, Optional

import torch
import wandb
from transformers import Trainer
from transformers.modeling_outputs import SemanticSegmenterOutput


class CustomTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.additional_losses = {}

    def training_step(self, model, inputs):
        outputs = model(**inputs)

        # Extract the additional losses from the model output
        self.additional_losses = outputs.additional_losses

        # Continue with the usual training step process
        return super().training_step(model, inputs)

    def on_log(self, global_step, state, control, logs, **kwargs):
        loss_dict = logs.get("loss_dict", {})

        if "wandb" in self.args.report_to:
            wandb.log({k: v for k, v in loss_dict.items()})

        if "tensorboard" in self.args.report_to:
            tensorboard_writer = self.state.log_history.get("tensorboard")
            if tensorboard_writer is not None:
                tensorboard_writer.add_scalars("additional_losses", loss_dict, global_step)

        # Call the parent's on_log method
        super().on_log(global_step, state, control, logs, **kwargs)

    def on_train_end(self):
        tensorboard_writer = self.state.log_history.get("tensorboard")
        if tensorboard_writer is not None:
            tensorboard_writer.close()


@dataclass
class CustomSemanticSegmenterOutput(SemanticSegmenterOutput):
    additional_losses: Optional[Dict[str, torch.FloatTensor]] = None
```
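[Editor's note] In transformers, `on_log` is a hook on `TrainerCallback`, not a method the training loop ever calls on `Trainer` itself, which would explain why the override above is never reached. The pattern can be sketched without any transformers dependency; the class name, log keys (`loss_ce`, `loss_dice`), and the `loss_` prefix below are illustrative assumptions, not names from the issue:

```python
# Minimal sketch of the callback pattern: the training loop hands each
# registered callback the `logs` dict at every logging step, so per-term
# losses placed into that dict reach every logger at once.
class LossDictCallback:
    def __init__(self):
        self.history = []

    def on_log(self, global_step, logs):
        # Keep only the individual loss entries, e.g. "loss_ce", "loss_dice"
        extra = {k: v for k, v in logs.items() if k.startswith("loss_")}
        if extra:
            self.history.append((global_step, extra))

# Usage: simulate two logging steps as the loop would emit them
cb = LossDictCallback()
cb.on_log(10, {"loss": 1.5, "loss_ce": 0.9, "loss_dice": 0.6, "lr": 3e-5})
cb.on_log(20, {"loss": 1.2, "lr": 3e-5})  # no per-term entries this step
```

With the real library, the equivalent would subclass `transformers.TrainerCallback` and be passed to the `Trainer` via its `callbacks` argument, rather than being defined on the `Trainer` subclass.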

dkrystki commented 4 months ago

I'm also interested in logging multiple loss values

amyeroberts commented 4 months ago

cc @muellerzr @SunMarc

NikhilMank commented 4 months ago

@dkrystki @amyeroberts I have solved this issue. I have created a custom trainer class inheriting the 'class Trainer' and then defined a custom_loss where I have logged the values using self.log()
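[Editor's note] The fix described above can be sketched as a `compute_loss` override that sums the per-term losses and passes the individual values to `self.log()`. This is a paraphrase of the approach, not the actual code from the issue; the stub classes and loss names below stand in for `transformers.Trainer` and the model so the sketch is self-contained:

```python
# Sketch: sum a dict of per-term losses into one training loss and log
# each term via the trainer's log() method (which forwards to wandb /
# TensorBoard when those integrations are enabled in the real Trainer).
class TrainerStub:
    """Stands in for transformers.Trainer; log() takes a flat dict."""
    def __init__(self):
        self.logged = []

    def log(self, logs):
        self.logged.append(logs)

class CustomTrainer(TrainerStub):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(**inputs)
        loss_dict = outputs["additional_losses"]
        loss = sum(loss_dict.values())  # combined training loss
        # Log each term alongside the total so the trend of every
        # component is visible, not just the sum.
        self.log({name: float(v) for name, v in loss_dict.items()})
        return (loss, outputs) if return_outputs else loss

# Usage with a toy model returning plain-float per-term losses
def toy_model(**inputs):
    return {"additional_losses": {"loss_ce": 0.9, "loss_dice": 0.6}}

trainer = CustomTrainer()
total = trainer.compute_loss(toy_model, {})
```

With the real library, the subclass would inherit from `transformers.Trainer`, `loss_dict` values would be scalar tensors, and `self.log()` would route the dict to the configured `report_to` backends.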

NikhilMank commented 4 months ago

Now I also want to log the model's weights, but I am unable to write them to the same TensorBoard events file as the multiple losses; I can only do it by creating a separate events file. @muellerzr @SunMarc @amyeroberts

hoangdangthien commented 2 months ago

I'm training RT-DETR and followed your instructions, but when I open wandb only the training loss appears.

NikhilMank commented 2 months ago

@hoangdangthien try `wandb.log`; I used self.log for TensorBoard.