huggingface / autotrain-advanced

🤗 AutoTrain Advanced
https://huggingface.co/autotrain
Apache License 2.0

Add metric_for_best_model="loss" as default in interface and add note on default metric in model card #497

Open MoritzLaurer opened 5 months ago

MoritzLaurer commented 5 months ago

Feature Request

  1. I'd suggest showing metric_for_best_model="loss" as an explicit default value in the hyperparameter interface, to make clear that the default selection metric is the loss and to let users change it easily (see the sketch after this list).

  2. I'd suggest adding an explicit note to the automatically generated model card stating which metric was used to select the final model that is uploaded to the Hub. This ensures that users with a less technical background, or who did not check the logs, understand that the uploaded model is simply the one with the lowest loss and may not be the most accurate or best-performing one on other metrics.
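
For context, checkpoint selection in AutoTrain is handled by the underlying transformers Trainer. The following is a minimal sketch of how the default behaves there, assuming AutoTrain passes these options through unchanged; it is not AutoTrain's actual interface code, and the model card wording at the end is only a suggestion:

```python
from transformers import TrainingArguments

# Minimal sketch: when load_best_model_at_end=True and metric_for_best_model
# is left unset, the Trainer defaults to the evaluation loss, so the two
# argument sets below select the same checkpoint.
args_implicit = TrainingArguments(
    output_dir="out",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

args_explicit = TrainingArguments(
    output_dir="out",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="loss",  # the default this request asks to surface in the UI
    greater_is_better=False,       # lower loss is better
)
```

A model card note along these lines (wording illustrative) would cover point 2: "The uploaded checkpoint was selected by lowest evaluation loss (metric_for_best_model="loss"). Checkpoints from other epochs may score higher on task metrics such as accuracy or F1."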

Motivation

I understand that loss is a sensible default given the many possible tasks and models. At the same time, for many tasks (such as classification) the loss is not the right metric for choosing the final model. I'm afraid that many users will not take the time to look into the logs and see that AutoTrain may actually have trained a checkpoint that scores better on the metrics they care about. The same goes for the model cards: explicitly stating which metric was used to select the uploaded model makes users aware that the run may have produced other models with better scores on metrics other than loss.

Additional Context

No response

MoritzLaurer commented 5 months ago

FYI: I just did another training run and manually specified metric_for_best_model="f1_macro", but for some reason it still selected the model with the lowest loss. I'm not sure why. Here are the training parameters I entered in the UI:

{ "lr": 2e-5, "epochs": 10, "max_seq_length": 256, "metric_for_best_model": "f1_macro", "batch_size": 16, "warmup_ratio": 0.1, "gradient_accumulation": 1, "optimizer": "adamw_torch", "scheduler": "linear", "weight_decay": 0, "max_grad_norm": 1, "seed": 42, "logging_steps": -1, "auto_find_batch_size": false, "mixed_precision": "fp16", "save_total_limit": 2, "save_strategy": "epoch", "evaluation_strategy": "epoch" }

abhishekkrthakur commented 5 months ago

Params not available in the backend cannot be used. I can work on adding this next week :)

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 15 days with no activity.

geegee4iee commented 4 months ago

Hi @abhishekkrthakur, is there any update on this?

abhishekkrthakur commented 4 months ago

Hopefully in the next release.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] commented 3 months ago

This issue was closed because it has been inactive for 2 days since being marked as stale.

MoritzLaurer commented 3 months ago

Reopening this, but there is no time pressure / immediate need from my side @abhishekkrthakur

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 15 days with no activity.

abhishekkrthakur commented 2 months ago

open

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 3 weeks ago

This issue was closed because it has been inactive for 20 days since being marked as stale.