model.train_model(train_df, eval_data=eval_df, **eval_metrics) is giving error "All arrays must be of the same length"

Describe the bug I am trying to fine-tune an mT5 model while monitoring some custom evaluation metrics. When train_model starts, I get the error "All arrays must be of the same length". When I run train_model without the custom eval_metrics, training runs successfully without this error.

To Reproduce

model_args = T5Args()
model_args.max_seq_length = 96  
model_args.train_batch_size = 40  
model_args.eval_batch_size = 20                                   
model_args.num_train_epochs = 2                                  
model_args.evaluate_during_training = True
model_args.evaluate_during_training_steps = 1
model_args.use_multiprocessing = False #False
model_args.fp16 = False
model_args.save_steps = -1
model_args.save_eval_checkpoints = True #False
model_args.no_cache = True
model_args.reprocess_input_data = True
model_args.overwrite_output_dir = True
model_args.preprocess_inputs = False #False
model_args.num_return_sequences = 1
model_args.wandb_project = "Exp 15 - MT5 Arabic IslamWeb_Seq_96_train_batch_40f"
model_args.use_cuda = True
model_args.output_dir = pathDrive2 + "/model"
model_args.verbose = True

wandb_kwargs = {}
model_args.wandb_kwargs = wandb_kwargs

eval_metrics = {
    "roc_auc": sklearn.metrics.roc_auc_score,
    "avg_prc": sklearn.metrics.average_precision_score,   
}

model.train_model(train_df, eval_data=eval_df, **eval_metrics)
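For reference, here is a minimal sketch of how the `**eval_metrics` dictionary reaches `train_model` as keyword arguments, using a metric that works on strings. `train_model_stub` and `exact_match` are hypothetical names used only for illustration. The `(labels, preds)` calling convention is an assumption about how the library invokes custom metrics, not something confirmed in this report.

```python
from typing import Callable, Dict, List


def exact_match(labels: List[str], preds: List[str]) -> float:
    """Fraction of predictions equal to their label after stripping whitespace."""
    if not labels:
        return 0.0
    hits = sum(label.strip() == pred.strip() for label, pred in zip(labels, preds))
    return hits / len(labels)


def train_model_stub(train_data, eval_data=None, **kwargs) -> Dict[str, Callable]:
    """Hypothetical stand-in for train_model: returns the metric callables it received."""
    return dict(kwargs)


# Each dictionary key becomes a keyword argument name, each value a callable.
metrics = {"exact_match": exact_match}
received = train_model_stub(None, eval_data=None, **metrics)

# Assumed calling convention: metric(labels, preds) on lists of strings.
score = received["exact_match"](["yes", "no", "maybe"], ["yes", "no ", "never"])
```

A text-to-text model produces strings, so a string-based metric like this may be easier to reason about than score-based metrics such as roc_auc_score.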

Error stack received:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-34-97b14fabdaa4>](https://localhost:8080/#) in <cell line: 98>()
     96 # model.train_model(train_df, acc=metric_score)
     97 # model.train_model(train_df, eval_data=eval_df, **eval_metrics)
---> 98 model.train_model(train_df, eval_data=eval_df, **eval_metrics)
     99 # model.train_model(train_df, eval_data=eval_df)
    100 # model.train_model(train_df, acc=sklearn.metrics.accuracy_score)

5 frames
[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py](https://localhost:8080/#) in train_model(self, train_data, output_dir, show_running_loss, args, eval_data, verbose, **kwargs)
    227         os.makedirs(output_dir, exist_ok=True)
    228 
--> 229         global_step, training_details = self.train(
    230             train_dataset,
    231             output_dir,

[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py](https://localhost:8080/#) in train(self, train_dataset, output_dir, show_running_loss, eval_data, verbose, **kwargs)
    634                         for key in results:
    635                             training_progress_scores[key].append(results[key])
--> 636                         report = pd.DataFrame(training_progress_scores)
    637                         report.to_csv(
    638                             os.path.join(

[/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py](https://localhost:8080/#) in __init__(self, data, index, columns, dtype, copy)
    662         elif isinstance(data, dict):
    663             # GH#38939 de facto copy defaults to False only in non-dict cases
--> 664             mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    665         elif isinstance(data, ma.MaskedArray):
    666             import numpy.ma.mrecords as mrecords

[/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py](https://localhost:8080/#) in dict_to_mgr(data, index, columns, dtype, typ, copy)
    491             arrays = [x.copy() if hasattr(x, "dtype") else x for x in arrays]
    492 
--> 493     return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
    494 
    495 

[/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py](https://localhost:8080/#) in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
    116         # figure out the index, if necessary
    117         if index is None:
--> 118             index = _extract_index(arrays)
    119         else:
    120             index = ensure_index(index)

[/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py](https://localhost:8080/#) in _extract_index(data)
    664             lengths = list(set(raw_lengths))
    665             if len(lengths) > 1:
--> 666                 raise ValueError("All arrays must be of the same length")
    667 
    668             if have_dicts:

ValueError: All arrays must be of the same length
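
The traceback shows values being appended only for keys present in `results`, just before the DataFrame is built. The sketch below shows how that pattern can leave columns with different lengths. It assumes (not confirmed here) that the custom metric columns are created up front while the evaluation result carries no values for them. The helper names and the loss value are illustrative only.

```python
from typing import Dict


def append_results(scores: Dict[str, list], results: Dict[str, float]) -> None:
    """Mirror the loop in the traceback: only keys present in results grow."""
    for key in results:
        scores[key].append(results[key])


def mismatched_lengths(scores: Dict[str, list]) -> Dict[str, int]:
    """Return {column: length} when the lengths differ, else an empty dict."""
    lengths = {key: len(values) for key, values in scores.items()}
    return lengths if len(set(lengths.values())) > 1 else {}


# Hypothetical progress dict: the custom metric columns exist from the start...
scores = {"global_step": [], "eval_loss": [], "roc_auc": [], "avg_prc": []}
# ...but this (made-up) evaluation result has no entries for them.
append_results(scores, {"global_step": 1, "eval_loss": 0.42})

# A non-empty result means pd.DataFrame(scores) would be given unequal columns.
problem = mismatched_lengths(scores)
```

Running a check like `mismatched_lengths` on the progress dictionary before building the DataFrame would show which columns are falling behind.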

Expected behavior Fine-tuning should run without the error.

Screenshots Datasets used during training (no null values inside dataset):

Desktop The code is running in a Google Colab Pro+ environment.

Additional context Not Applicable

Thanks Thilina for your fast response. I tried your suggestion; however, I am now getting a new error, "TypeError: list indices must be integers or slices, not str", even though all fields in my training and evaluation datasets are strings.

The GitHub issue tracker has a similar issue, #1524, but it is still unanswered. It refers to the following commit: https://github.com/ThilinaRajapakse/simpletransformers/commit/3f75da042a8ca1200126eb16bf8ca00d7c67854b

Appreciating your support,

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-96-e42a27b68e46>](https://localhost:8080/#) in <cell line: 105>()
--> 105 model.train_model(train_df, eval_data=eval_df, f1=f1_multiclass)

3 frames
[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py](https://localhost:8080/#) in train_model(self, train_data, output_dir, show_running_loss, args, eval_data, verbose, **kwargs)
    227         os.makedirs(output_dir, exist_ok=True)
    228 
--> 229         global_step, training_details = self.train(
    230             train_dataset,
    231             output_dir,

[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py](https://localhost:8080/#) in train(self, train_dataset, output_dir, show_running_loss, eval_data, verbose, **kwargs)
    603                     ):
    604                         # Only evaluate when single GPU otherwise metrics may not average well
--> 605                         results = self.eval_model(
    606                             eval_data,
    607                             verbose=verbose and args.evaluate_during_training_verbose,

[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_model.py](https://localhost:8080/#) in eval_model(self, eval_data, output_dir, verbose, silent, **kwargs)
    934                     prefix + input_text
    935                     for prefix, input_text in zip(
--> 936                         eval_dataset["prefix"], eval_dataset["input_text"]
    937                     )
    938                 ]

[/usr/local/lib/python3.10/dist-packages/simpletransformers/t5/t5_utils.py](https://localhost:8080/#) in __getitem__(self, index)
    197 
    198     def __getitem__(self, index):
--> 199         return self.examples[index]

TypeError: list indices must be integers or slices, not str
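
The last frame indexes `self.examples` with whatever key it receives, and the frame above it passes the string "prefix". The sketch below reproduces that combination with a hypothetical list-backed class. `ListBackedDataset` is an illustrative stand-in, not the library's dataset class, and the example rows are made up.

```python
class ListBackedDataset:
    """Hypothetical dataset that stores its examples in a plain list."""

    def __init__(self, examples):
        self.examples = list(examples)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, index):
        # Same shape as the failing frame: delegate straight to the list.
        return self.examples[index]


dataset = ListBackedDataset([("ask: ", "first input"), ("ask: ", "second input")])

# Integer indexing works because the backing store is a list.
first = dataset[0]

# Column-style lookup by name fails, regardless of the field values' types.
try:
    dataset["prefix"]
except TypeError as exc:
    message = str(exc)
```

This suggests the error depends on the kind of object being indexed (a list-backed dataset rather than a DataFrame), not on whether the dataset fields are strings.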

ThilinaRajapakse / simpletransformers, issue #1531