Running the training script scripts/train_jat_tokenized.py as given (with --per_device_train_batch_size 1 on a single GPU) raises the following error when the trainer tries to save the first checkpoint.
The traceback starts from the trainer.train(..) call at the end of that script:
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
return inner_training_loop(
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2291, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2732, in _maybe_log_save_evaluate
self._save_checkpoint(model, trial, metrics=metrics)
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2811, in _save_checkpoint
self.save_model(output_dir, _internal_call=True)
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3355, in save_model
self._save(output_dir)
File "/home/km/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3432, in _save
self.model.save_pretrained(
File "/home/km/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2574, in save_pretrained
raise RuntimeError(
RuntimeError: The weights trying to be saved contained shared tensors [{'transformer.wte.weight', 'single_discrete_encoder.weight', 'multi_discrete_encoder.0.weight'}] that are mismatching the transformers base configuration. Try saving using safe_serialization=False or remove this tensor sharing.
The error occurs both with accelerate launch and without it (just using python).
Environment: transformers 4.41.0, Ubuntu 22.0
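A possible workaround, following the hint in the error message, might be to disable safetensors checkpointing so that the Trainer passes safe_serialization=False to save_pretrained. The sketch below is untested against this model and only shows the idea: `save_safetensors` is an existing TrainingArguments field, while the helper names and the example values are hypothetical and not part of the script.

```python
# Workaround sketch (assumption: the shared tensors are intentional weight tying,
# so falling back to the non-safetensors checkpoint format is acceptable).


def with_legacy_serialization(training_kwargs: dict) -> dict:
    """Return a copy of the kwargs with safetensors checkpointing disabled."""
    patched = dict(training_kwargs)
    # `save_safetensors` is a TrainingArguments field; when it is False the
    # Trainer calls save_pretrained(..., safe_serialization=False).
    patched["save_safetensors"] = False
    return patched


def build_training_args(training_kwargs: dict):
    """Hypothetical helper: build TrainingArguments with the override applied."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**with_legacy_serialization(training_kwargs))


# Hypothetical values; the real script reads its arguments from the command line.
example_kwargs = {
    "output_dir": "checkpoints/jat_debug",
    "per_device_train_batch_size": 1,
}
patched_kwargs = with_legacy_serialization(example_kwargs)
```

If the script parses TrainingArguments from the command line, the same override may be reachable as a flag, but that is an assumption about how the script is written. Note that this only sidesteps the check; it does not explain why the shared tensors differ from what the base configuration expects.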