huggingface / setfit

Efficient few-shot learning with Sentence Transformers
https://hf.co/docs/setfit
Apache License 2.0

Error when uploading model checkpoints to Weights & Biases #464

Open simonschoe opened 9 months ago

simonschoe commented 9 months ago

When setting os.environ["WANDB_LOG_MODEL"] = "end" prior to the training loop and specifying report_to='wandb' in TrainingArguments, I receive the following error:

Loading best SentenceTransformer model from step 14551.
Traceback (most recent call last):
  File "<ipython-input-10-8c24ca6359e0>", line 31, in main
    trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/setfit/trainer.py", line 410, in train
    self.train_embeddings(*full_parameters, args=args)
  File "/usr/local/lib/python3.10/dist-packages/setfit/trainer.py", line 463, in train_embeddings
    self._train_sentence_transformer(
  File "/usr/local/lib/python3.10/dist-packages/setfit/trainer.py", line 687, in _train_sentence_transformer
    self.control = self.callback_handler.on_train_end(args, self.state, self.control)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_callback.py", line 366, in on_train_end
    return self.call_event("on_train_end", args, state, control)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_callback.py", line 407, in call_event
    result = getattr(callback, event)(
  File "/usr/local/lib/python3.10/dist-packages/transformers/integrations/integration_utils.py", line 771, in on_train_end
    fake_trainer = Trainer(args=args, model=model, tokenizer=tokenizer)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 337, in __init__
    enable_full_determinism(self.args.seed) if self.args.full_determinism else set_seed(self.args.seed)
AttributeError: 'TrainingArguments' object has no attribute 'full_determinism'

This occurs only when I set the WANDB_LOG_MODEL environment variable; otherwise training runs smoothly.
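
For reference, a minimal sketch of the configuration that triggers this (the base model is arbitrary here, and train_dataset / eval_dataset stand in for my actual datasets):

import os

# Setting this environment variable before training is what triggers the failure:
os.environ["WANDB_LOG_MODEL"] = "end"

from setfit import SetFitModel, Trainer, TrainingArguments

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
args = TrainingArguments(report_to="wandb")
trainer = Trainer(model=model, args=args, train_dataset=train_dataset, eval_dataset=eval_dataset)
trainer.train()  # AttributeError in WandbCallback.on_train_end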

tomaarsen commented 9 months ago

Hello!

Hmmm, this is a tricky one. To give a bit of context: setfit borrows the callbacks from transformers, even though setfit uses its own Trainer and TrainingArguments. So far I've been able to avoid issues by also implementing certain attributes on the setfit TrainingArguments (e.g. report_to, run_name, etc.), but the WandbCallback's on_train_end instantiates the transformers Trainer here, and there's not too much I can do about that.

I don't believe I can reasonably fix this, I'm afraid.

Ulipenitz commented 6 months ago

You could try overriding the on_train_end class method. I had a similar issue when using NeptuneCallback, because the TrainingArguments class in SetFit is missing overwrite_output_dir as an attribute:

 File "C:\***\lib\site-packages\transformers\integrations\integration_utils.py", line 1343, in on_init_end
    if self._log_checkpoints and (args.overwrite_output_dir or args.save_total_limit is not None):
AttributeError: 'TrainingArguments' object has no attribute 'overwrite_output_dir'

To overcome this, I simply removed the args.overwrite_output_dir check, because I did not need it in my case:

import tempfile

from transformers.integrations import NeptuneCallback


class NeptuneCallbackSetFit(NeptuneCallback):
    def on_init_end(self, args, state, control, **kwargs):
        # Identical to the original on_init_end, minus the args.overwrite_output_dir
        # check, which setfit's TrainingArguments does not define.
        self._volatile_checkpoints_dir = None
        if self._log_checkpoints and (args.save_total_limit is not None):
            self._volatile_checkpoints_dir = tempfile.TemporaryDirectory().name

        if self._log_checkpoints == "best" and not args.load_best_model_at_end:
            raise ValueError("To save the best model checkpoint, the load_best_model_at_end argument must be enabled.")
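
The subclass can then be passed to the SetFit Trainer via its callbacks argument instead of relying on report_to. A sketch, assuming the 1.x neptune client; the project name, model, args, and dataset are placeholders:

import neptune
from setfit import Trainer

run = neptune.init_run(project="my-workspace/my-project")
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    callbacks=[NeptuneCallbackSetFit(run=run)],
)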

I don't know your use case, but maybe removing or exchanging the Trainer in WandbCallback helps!
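
For the Weights & Biases case, an analogous override could skip the fake Trainer entirely and log the final model as an artifact itself. A rough sketch, assuming the private attributes (_wandb, _initialized, _log_model) still match your installed transformers version and that the model handed to the callback exposes a save method:

import tempfile

from transformers.integrations import WandbCallback


class WandbCallbackSetFit(WandbCallback):
    def on_train_end(self, args, state, control, model=None, tokenizer=None, **kwargs):
        # Log the model directly instead of building a transformers Trainer,
        # which is what raises the AttributeError with setfit's TrainingArguments.
        if self._wandb is None or not self._initialized:
            return
        if self._log_model in ("end", "checkpoint") and state.is_world_process_zero:
            with tempfile.TemporaryDirectory() as temp_dir:
                model.save(temp_dir)  # SentenceTransformer.save; adjust to your model object
                artifact = self._wandb.Artifact(f"model-{self._wandb.run.id}", type="model")
                artifact.add_dir(temp_dir)
                self._wandb.run.log_artifact(artifact)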

nurav97 commented 6 months ago

@simonschoe did you find any solution for this issue?