huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
135.85k stars 27.19k forks source link

cannot pass string values to the report_to variable after init constructor #34445

Open alpertunga-bile opened 1 month ago

alpertunga-bile commented 1 month ago

System Info

Who can help?

@muellerzr @SunMarc

Information

Tasks

Reproduction

  1. Install wandb package
  2. Run the below script with any model (model initialization is not added here for brevity)
    
    default_train_args = TrainingArguments(output_dir="tmp")

default_train_args.report_to = "none"

trainer = Trainer(model=model, args=default_train_args)


### Expected behavior

The script will run without any integration when ```report_to``` value is assigned with ```none``` value.

## Problem
The check for the ```report_to```  variable is done in the ```post_init``` function of the ```TrainingArguments```. If you don't specify it in the constructor then the code set to ```all``` and fetch all the integrations. If you assing a value to ```report_to``` variable after initialization constructor, the below issue is happening.

In the ```Trainer``` constructor, with ```get_reporting_integration_callbacks``` function the class is fetching the integration callbacks but the function is not checking if string or array is the type of the ```report_to``` value. Because of that the functions raises ValueError if you assign string to the value because it is trying to check with char rather than string. This behaviour is same for the ```none``` and ```all``` string values.

- The responsible code parts for the issue can be found below:
  - [[code](https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py#L1821)] TrainingArguments
  - [[code](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L627)] Trainer
  - [[code](https://github.com/huggingface/transformers/blob/main/src/transformers/integrations/integration_utils.py#L2149)] Integration Utils

## Solution
Adding below code to the beginning of the [get_reporting_integration_callbacks](https://github.com/huggingface/transformers/blob/main/src/transformers/integrations/integration_utils.py#L2149) function will fix the issue:

```python
if report_to is None:
     return []

if isinstance(report_to, str):
    if "none" == report_to:
        return []
    elif "all" == report_to:
        report_to = get_available_reporting_integrations()
    else:
        report_to = [report_to]
LysandreJik commented 1 month ago

cc @muellerzr WDYT?

github-actions[bot] commented 6 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

SunMarc commented 6 days ago

Hey @alpertunga-bile, is there any reason why you want to set report_to after initializing default_train_args ?

alpertunga-bile commented 6 days ago

Hello @SunMarc, I am training 5 models with a single script in Google Colab, so I need to change the values ​​according to the selected model. Since there are multiple condition checks and the checks affect multiple values; I am assigning them after the init ctor.

Before the updates I had no problem but later it tries to use the wandb package since it is pre-installed in Google Colab. I followed the documentation and since I am following this assignment pattern I tried assigning the value none to the return_to variable after the init ctor then this issue occurred.

I think the solution will provide flexibility and protection from the side effect of the constructor to the user who follows the documentation and uses this assignment pattern. However it seems that this is a rare issue and assignment pattern so the solution will not have any impact on the user experience.