boun-tabi-LMG / turkish-lm-tuner

Turkish LM Tuner
https://boun-tabi-lmg.github.io/turkish-lm-tuner/
MIT License

Bugfix: TypeError during evaluation in Conditional Generation mode #31

Closed by zeynepyirmibes 6 months ago

zeynepyirmibes commented 6 months ago

The following error occurred during evaluation in Conditional Generation mode on the Product Reviews and Tweet Sentiment datasets:

Bug:

Traceback (most recent call last):
  File "/users/zeynep.yirmibesoglu/turkish-lm-tuner/experiments/finetune.py", line 92, in main
    trainer, model = model_trainer.train_and_evaluate(train_dataset, eval_dataset, test_dataset)
  File "/users/zeynep.yirmibesoglu/turkish-lm-tuner/turkish_lm_tuner/trainer.py", line 101, in train_and_evaluate
    trainer.train()
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
    return inner_training_loop(
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer.py", line 1929, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer.py", line 2268, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer_seq2seq.py", line 166, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer.py", line 3019, in evaluate
    output = eval_loop(
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/trainer.py", line 3310, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/users/zeynep.yirmibesoglu/turkish-lm-tuner/turkish_lm_tuner/evaluator.py", line 142, in compute_metrics
    decoded_preds = self.tokenizer.batch_decode(preds, skip_special_tokens=True)
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3710, in batch_decode
    return [
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3711, in <listcomp>
    self.decode(
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3750, in decode
    return self._decode(
  File "/opt/python3/venv/base/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 625, in _decode
    text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
TypeError: argument 'ids': 'list' object cannot be interpreted as an integer

Reason:

The training argument `predict_with_generate` was not being read as `True` from the config file, because the default conditional generation configuration files for training and testing had been swapped. Without `predict_with_generate=True`, the trainer passes raw logits instead of generated token IDs to `compute_metrics`, and `tokenizer.batch_decode` then fails on the nested float arrays.
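The failure mode can be sketched without `transformers` at all. Below, `fake_decode` is a hypothetical stand-in for the tokenizer's low-level decode, which requires a flat sequence of integer token IDs; when it receives a row of logits (one probability vector per position) instead, the same kind of `TypeError` surfaces. The shapes and values here are illustrative, not taken from the actual run.

```python
import numpy as np

# With predict_with_generate=True, compute_metrics receives token IDs:
generated_ids = [[101, 2023, 102]]        # (batch, seq_len) of ints: decodable

# Without it, compute_metrics receives raw logits:
logits = np.random.rand(1, 3, 32000)      # (batch, seq_len, vocab): not decodable

def fake_decode(ids):
    # Hypothetical stand-in for tokenizer.decode: each element must be an int.
    return " ".join(str(int(i)) for i in ids)

print(fake_decode(generated_ids[0]))      # works: "101 2023 102"

try:
    # Each "token" here is a length-32000 vector, so int() raises TypeError,
    # analogous to: "'list' object cannot be interpreted as an integer".
    fake_decode(logits[0])
except TypeError as e:
    print("TypeError:", e)
```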

Fix:

I swapped the config files back, so the training configuration now sets `predict_with_generate: true` as intended and evaluation decodes generated token IDs rather than logits.
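For reference, the relevant fragment of a conditional generation training config would look roughly like the sketch below; the surrounding keys are illustrative, since the exact layout of the repository's config files is not shown in this issue. The essential line is `predict_with_generate: true`, which is what the swapped files failed to provide.

```yaml
# Hypothetical sketch of a conditional generation *training* config.
# Only predict_with_generate is known from this issue to be the decisive key.
training_params:
  predict_with_generate: true   # evaluate with model.generate(), yielding token IDs
  evaluation_strategy: epoch
```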