yangheng95 / PyABSA

Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.;
https://pyabsa.readthedocs.io
MIT License

Model saving does not output state_dict #334

Open · Kensvin28 opened this issue 1 year ago

Kensvin28 commented 1 year ago

PyABSA Version (Required)

2.3.1

Code To Reproduce (Required)

from pyabsa import ModelSaveOption, DeviceTypeOption
import warnings

warnings.filterwarnings("ignore")

# (config, dataset, and the ASTE import are defined earlier and omitted from this snippet)
config.batch_size = 8
config.patience = 20
config.log_step = -1
config.max_seq_len = 256
config.seed = 1
config.verbose = False  # if verbose == True, PyABSA will output the model structure and several processed data examples
config.notice = (
    "This is a training example for aspect term extraction"  # for memo usage
)

trainer = ASTE.ASTETrainer(
    config=config,
    dataset=dataset,
    from_checkpoint="english",  # pass a checkpoint name here to resume training from a pretrained checkpoint
    auto_device='cuda',  # use cuda if available
    checkpoint_save_mode=ModelSaveOption.SAVE_FULL_MODEL,  # save the whole model instead of only the state dict
    load_aug=False,  # the integrated datasets ship augmentation data; set load_aug=True to use it and improve performance
)

Full Console Output (Required)

[2023-07-08 02:27:10] (2.3.1) Set Model Device: cuda
[2023-07-08 02:27:10] (2.3.1) Device Name: Tesla T4
2023-07-08 02:27:10,136 INFO: PyABSA version: 2.3.1
2023-07-08 02:27:10,137 INFO: Transformers version: 4.30.2
2023-07-08 02:27:10,138 INFO: Torch version: 2.0.1+cu117+cuda11.7
2023-07-08 02:27:10,138 INFO: Device: Tesla T4
2023-07-08 02:27:10,140 INFO: 407.Shopee in the trainer is not a exact path, will search dataset in current working directory
FindFile Warning --> multiple targets ['integrated_datasets/aste_datasets/407.Shopee', 'integrated_datasets/aste_datasets/407.Shopee/.ipynb_checkpoints'] found, only return the shortest path: <integrated_datasets/aste_datasets/407.Shopee>
2023-07-08 02:27:10,146 INFO: You can set load_aug=True in a trainer to augment your dataset (English only yet) and improve performance.
2023-07-08 02:27:11,753 INFO: Load dataset from integrated_datasets/aste_datasets/407.Shopee/train.txt
preparing dataloader:   2%|▏         | 10/523 [00:00<00:05, 96.70it/s]
EOL while scanning string literal (, line 1)
preparing dataloader: 100%|██████████| 523/523 [00:05<00:00, 98.53it/s]
2023-07-08 02:27:18,110 INFO: Load dataset from integrated_datasets/aste_datasets/407.Shopee/test.txt
preparing dataloader:  51%|█████▏    | 54/105 [00:00<00:00, 97.42it/s]
EOL while scanning string literal (, line 1)
preparing dataloader: 100%|██████████| 105/105 [00:01<00:00, 92.39it/s]
2023-07-08 02:27:19,812 INFO: Load dataset from integrated_datasets/aste_datasets/407.Shopee/dev.txt
preparing dataloader: 100%|██████████| 71/71 [00:00<00:00, 100.00it/s]
building vocab...
converting data to features: 100%|██████████| 522/522 [00:31<00:00, 16.66it/s]
converting data to features: 100%|██████████| 104/104 [00:07<00:00, 14.29it/s]
converting data to features: 100%|██████████| 71/71 [00:03<00:00, 20.75it/s]
2023-07-08 02:28:02,765 INFO: Save cache dataset to emcgcn.407.Shopee.dataset.b58ef8d99282bf35c7523e9d4fe3c00be3acbf79e1c910c9b38732fded1e3432.cache

Some weights of the model checkpoint at yangheng/deberta-v3-base-absa-v1.1 were not used when initializing DebertaV2Model: ['pooler.dense.bias', 'classifier.bias', 'classifier.weight', 'pooler.dense.weight']

AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_469/2016644588.py in <cell line: 16>()
     14 )
     15 
---> 16 trainer = ASTE.ASTETrainer(
     17     config=config,
     18     dataset=dataset,

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/tasks/AspectSentimentTripletExtraction/trainer/trainer.py in __init__(self, config, dataset, from_checkpoint, checkpoint_save_mode, auto_device, path_to_save, load_aug)
     65         self.config.task_name = TaskNameOption().get(self.config.task_code)
     66 
---> 67         self._run()

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/framework/trainer_class/trainer_template.py in _run(self)
    239             self.config.seed = s
    240             if self.config.checkpoint_save_mode:
--> 241                 model_path.append(self.training_instructor(self.config).run())
    242             else:
    243                 # always return the last trained model if you don't save trained model

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/tasks/AspectSentimentTripletExtraction/instructor/instructor.py in run(self)
    869         # Loss and Optimizer
    870         criterion = nn.CrossEntropyLoss(ignore_index=-1)
--> 871         return self._train(criterion)
    872 
    873     def _train(self, criterion):

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/tasks/AspectSentimentTripletExtraction/instructor/instructor.py in _train(self, criterion)
    884             return self._k_fold_train_and_evaluate(criterion)
    885         else:
--> 886             return self._train_and_evaluate(criterion)

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/tasks/AspectSentimentTripletExtraction/instructor/instructor.py in _train_and_evaluate(self, criterion)
    441                 "Loading best model: {} and evaluating on test set ".format(save_path)
    442             )
--> 443             self._reload_model_state_dict(save_path)
    444             joint_precision, joint_recall, joint_f1 = self._evaluate_f1(
    445                 self.test_dataloader

~/.conda/envs/default/lib/python3.9/site-packages/pyabsa/framework/instructor_class/instructor_template.py in _reload_model_state_dict(self, ckpt)
    119         else:
    120             self.model.load_state_dict(
--> 121                 torch.load(find_file(ckpt, or_key=[".bin", "state_dict"]))
    122             )
    123 

~/.conda/envs/default/lib/python3.9/site-packages/torch/serialization.py in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
    789         pickle_load_args['encoding'] = 'utf-8'
    790 
--> 791     with _open_file_like(f, 'rb') as opened_file:
    792         if _is_zipfile(opened_file):
    793             # The zipfile reader is going to advance the current file position.

~/.conda/envs/default/lib/python3.9/site-packages/torch/serialization.py in _open_file_like(name_or_buffer, mode)
    274         return _open_buffer_writer(name_or_buffer)
    275     elif 'r' in mode:
--> 276         return _open_buffer_reader(name_or_buffer)
    277     else:
    278         raise RuntimeError(f"Expected 'r' or 'w' in mode but got {mode}")

~/.conda/envs/default/lib/python3.9/site-packages/torch/serialization.py in __init__(self, buffer)
    259     def __init__(self, buffer):
    260         super().__init__(buffer)
--> 261         _check_seekable(buffer)
    262 
    263 

~/.conda/envs/default/lib/python3.9/site-packages/torch/serialization.py in _check_seekable(f)
    355         return True
    356     except (io.UnsupportedOperation, AttributeError) as e:
--> 357         raise_err_msg(["seek", "tell"], e)
    358     return False
    359 

~/.conda/envs/default/lib/python3.9/site-packages/torch/serialization.py in raise_err_msg(patterns, e)
    348                 + " Please pre-load the data into a buffer like io.BytesIO and"
    349                 + " try to load from it instead.")
--> 350             raise type(e)(msg)
    351         raise e
    352 

AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

Describe the bug

I don't know why, but the trainer no longer saves the state_dict file, so every time it runs the test evaluation after training, it raises this error.
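For context: in the traceback above, find_file(ckpt, or_key=[".bin", "state_dict"]) apparently returns None because no state_dict/.bin file was written, and torch.load(None) then fails. A minimal sketch that reproduces the same error message with plain PyTorch, independent of PyABSA:

import torch

# Passing None to torch.load (what happens when find_file finds no matching
# checkpoint file) triggers exactly this AttributeError.
try:
    torch.load(None)
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. ...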

Expected behavior

The state_dict should be saved along with the trained model, and the test evaluation should run without error.

yangheng95 commented 1 year ago

Could you please check your torch and transformers versions? Also, can you check whether a .state_dict file exists in your file system?
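A quick way to check both (a sketch; the glob patterns assume the checkpoint was written somewhere under the current working directory, adjust the path if the trainer saved elsewhere):

import glob
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)

# Search recursively for any saved state dict or .bin weight files.
print(glob.glob("**/*.state_dict", recursive=True))
print(glob.glob("**/*.bin", recursive=True))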

Kensvin28 commented 1 year ago

torch 2.0.1
transformers 4.30.2
There is no state_dict file in the file system.

yangheng95 commented 1 year ago

Can you try transformers=4.30.0?

Kensvin28 commented 1 year ago

I tried transformers 4.30.0, but it still shows the same error.

[2023-07-10 15:21:36] (2.3.1) PyABSAVersion:2.3.1 --> Calling Count:1
[2023-07-10 15:21:36] (2.3.1) SRD:3 --> Calling Count:0
[2023-07-10 15:21:36] (2.3.1) TorchVersion:2.0.1+cu117+cuda11.7 --> Calling Count:1
[2023-07-10 15:21:36] (2.3.1) TransformersVersion:4.30.0 --> Calling Count:1
...
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

Kensvin28 commented 1 year ago

If I use the SAVE_MODEL_STATE_DICT mode, can I still save the model and run inference later? What is the difference between SAVE_MODEL_STATE_DICT and SAVE_FULL_MODEL?
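For reference, the underlying PyTorch distinction (a generic sketch of plain PyTorch behaviour, not PyABSA internals): SAVE_FULL_MODEL corresponds to pickling the whole module object, while SAVE_MODEL_STATE_DICT corresponds to saving only the parameter tensors and loading them back into a freshly built model of the same architecture.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Full-model save: pickles the module itself; the class definition must be
# importable and compatible at load time.
torch.save(model, "full_model.pt")
restored_full = torch.load("full_model.pt")

# State-dict save: stores only the weights; rebuild the model, then load them.
torch.save(model.state_dict(), "model.state_dict")
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load("model.state_dict"))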

yangheng95 commented 1 year ago

Please try saving the state dict; it avoids many compatibility errors across different transformers versions.
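A minimal sketch of that change, reusing the trainer call from the snippet above with only the save mode swapped:

trainer = ASTE.ASTETrainer(
    config=config,
    dataset=dataset,
    from_checkpoint="english",
    auto_device='cuda',
    # Save only the state dict instead of pickling the full model.
    checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT,
    load_aug=False,
)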