center-for-humans-and-machines / transformer-heads

Toolkit for attaching, training, saving and loading of new heads for transformer models
https://transformer-heads.readthedocs.io/en/latest/
MIT License

Unable to Visualise the Validation Loss #7

Open cccsurrey opened 3 weeks ago

cccsurrey commented 3 weeks ago

Hi,

I am trying to visualise the validation loss alongside the training loss, configured through TrainingArguments. However, the results panel shows "No log" for the validation loss at every checkpoint. Do I have to run the provided script manually to compute the validation loss, instead of relying on the Transformers library? I have checked my setup and believe it is correct, but the validation loss still does not show. My setup:

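from transformers import Trainer, TrainingArguments

# model, datasets, collator, compute_metrics and train_epochs are defined earlier (not shown)
args = TrainingArguments(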
    output_dir="./gemma2-2b-mean-full_finetune",
    learning_rate=2e-5,
    num_train_epochs=train_epochs,  # To speed things up set to 0.1, set to 1 for better performance
    logging_steps=20,
    do_eval=True,
    evaluation_strategy='steps',
    remove_unused_columns=False,
    optim="paged_adamw_32bit",
    save_strategy="steps",
    gradient_checkpointing=False,
    lr_scheduler_type="constant",
    ddp_find_unused_parameters=False,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    save_steps=20,
    eval_steps=20,
    logging_dir="./logs",
)
trainer = Trainer(
    model,
    args=args,
    train_dataset=datasets[0],
    eval_dataset=datasets[2],
    data_collator=collator,
    compute_metrics=compute_metrics,
)
trainer.train()

yannikkellerde commented 2 weeks ago

I cannot reproduce this on my side. Your TrainingArguments look correct, and I do not think this is a transformer_heads issue. I suspect it is related to the compute_metrics function you are passing, or it is a wandb issue.
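
One way to tell the two suspects apart is to look at what the Trainer itself recorded, before anything reaches wandb or the notebook panel. The sketch below is only an illustration: `split_losses` is a hypothetical helper, and the sample entries are made-up values shaped like `trainer.state.log_history` (the list of dicts the Trainer appends to on each logging and evaluation step).

```python
def split_losses(log_history):
    """Separate training-loss and eval-loss entries, keyed by global step."""
    train = [(e["step"], e["loss"]) for e in log_history if "loss" in e]
    evals = [(e["step"], e["eval_loss"]) for e in log_history if "eval_loss" in e]
    return train, evals


# Hypothetical sample, shaped like trainer.state.log_history (values are invented).
sample_history = [
    {"loss": 1.92, "learning_rate": 2e-5, "epoch": 0.01, "step": 20},
    {"eval_runtime": 3.1, "epoch": 0.01, "step": 20},  # evaluation ran, but no eval_loss
    {"loss": 1.75, "learning_rate": 2e-5, "epoch": 0.02, "step": 40},
    {"eval_loss": 1.80, "eval_runtime": 3.0, "epoch": 0.02, "step": 40},
]

train_points, eval_points = split_losses(sample_history)
print("train:", train_points)
print("eval: ", eval_points)
```

On a real run, calling `split_losses(trainer.state.log_history)` after `trainer.train()` would show whether any `eval_loss` entries exist. If the eval list is empty, the loss is missing at the Trainer level, so the reporting backend is unlikely to be the cause.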

cccsurrey commented 2 weeks ago

Thanks for your response. Are you able to visualise the validation loss on your side? I have tried disabling compute_metrics, and unfortunately it still does not work.
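
If compute_metrics is ruled out, another thing worth checking is whether the Trainer recognises the label columns during evaluation. To my understanding, the Trainer only computes an evaluation loss when every key in its `label_names` list is present in the batch, and `label_names` can be set explicitly through TrainingArguments. The sketch below mirrors that check in plain Python. The helper name `batch_has_labels` and the batch keys are hypothetical, and whether this applies to the setup in this issue is an assumption.

```python
def batch_has_labels(batch, label_names):
    """Mimic the Trainer's check: all label keys must be present and not None."""
    if len(label_names) == 0:
        return False
    return all(batch.get(name) is not None for name in label_names)


# Hypothetical batch as a collator might produce it; the keys are invented.
batch = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]], "my_head": [0.5]}

print(batch_has_labels(batch, ["labels"]))   # key "labels" is absent from the batch
print(batch_has_labels(batch, ["my_head"]))  # key "my_head" is present
print(batch_has_labels(batch, []))           # no label names known at all
```

If the first case matches your run, passing `label_names=[...]` with the actual label keys to TrainingArguments may be enough for the evaluation loss to appear.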

yannikkellerde commented 2 weeks ago

Yes, I am actively using transformer_heads and evaluation loss is showing.

cccsurrey commented 2 weeks ago

Would you mind sharing your sample code here so that I can reproduce it? Thanks a lot.