RUCAIBox / TextBox

TextBox 2.0 is a text generation library with pre-trained language models
https://github.com/RUCAIBox/TextBox
MIT License

accelerate error #362

Closed Foehnc closed 11 months ago

Foehnc commented 11 months ago

Describe the bug

I set up the environment following install.sh. With accelerate 0.23.0, I get the following error:

  File "/home/workspace/TextBox/textbox/utils/dashboard.py", line 311, in new_experiment
    yield True
  File "/home/workspace/TextBox/textbox/quick_start/experiment.py", line 140, in run
    self._do_train_and_valid()
  File "/home/workspace/TextBox/textbox/quick_start/experiment.py", line 115, in _do_train_and_valid
    self.valid_result = self.trainer.fit(train_data, valid_data)
  File "/home/workspace/TextBox/textbox/trainer/trainer.py", line 452, in fit
    loss = self._train_epoch(train_data, epoch_idx, valid_data)['loss']
  File "/home/workspace/TextBox/textbox/trainer/trainer.py", line 236, in _train_epoch
    self.accelerator.gradient_state._set_end_of_dataloader(False)
AttributeError: 'GradientState' object has no attribute '_set_end_of_dataloader'

Installing the older accelerate 0.20.3 (the lowest version compatible with the required environment) still produces the same error.

How to reproduce

python run_textbox.py \
    --use_gpu=True \
    --gpu_id=1 \
    --model=Chinese-BART \
    --model_path=pretrained_models/bart-base-chinese \
    --dataset=csl \
    --do_train=True \
    --do_valid=True \
    --do_test=True \
    --epochs=5 \
    --train_batch_size=32 \
    --eval_batch_size=32 \
    --max_save=0 \
    --valid_strategy=epoch \
    --valid_steps=1 \
    --filename=DEBUG \
    --wandb=disabled

1190303125 commented 11 months ago

Are you running this Python file on its own, or are you launching it through the accelerate library, as described in https://github.com/RUCAIBox/TextBox/blob/2.0.0/asset/efficient_training.md?

1190303125 commented 11 months ago

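Hi, this looks like a version issue: accelerate has removed this function from GradientState. You can roll back to accelerate 0.15.0. If you are not using the accelerate library, you can instead delete the line self.accelerator.gradient_state._set_end_of_dataloader(False).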
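As an alternative to deleting the line outright, the call could be guarded so that it only runs when the installed accelerate still provides the method. The following is a minimal sketch, not TextBox's actual fix: the helper name `reset_end_of_dataloader` and the two stand-in classes are hypothetical and exist only so the example runs without accelerate installed. Whether skipping the call is safe on newer accelerate versions is an assumption that this thread does not confirm.

```python
def reset_end_of_dataloader(gradient_state) -> bool:
    """Call the private setter only if this object still has it.

    Returns True if the setter was called, False if it was skipped.
    (Hypothetical helper; not part of TextBox or accelerate.)
    """
    setter = getattr(gradient_state, "_set_end_of_dataloader", None)
    if callable(setter):
        setter(False)
        return True
    return False


# Stand-in classes for illustration only; they are NOT accelerate's GradientState.
class OldStyleState:
    """Mimics a GradientState that still has the private setter."""

    def __init__(self):
        self.end_of_dataloader = True

    def _set_end_of_dataloader(self, value):
        self.end_of_dataloader = value


class NewStyleState:
    """Mimics a GradientState where the private setter has been removed."""


if __name__ == "__main__":
    old_state = OldStyleState()
    print(reset_end_of_dataloader(old_state), old_state.end_of_dataloader)
    print(reset_end_of_dataloader(NewStyleState()))
```

In trainer.py, the failing line would then become something like `reset_end_of_dataloader(self.accelerator.gradient_state)`, assuming the helper is imported there.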