Fail to load model - Githubissues

pepi99 commented 2 years ago

I trained a model and transferred the checkpoint to another project, where I installed pyabsa with pip (so pyabsa is a library in my venv).

When I try to load the model, I get an error. Here is my code:

from pyabsa import APCCheckpointManager

checkpoint_name = '/Users/petar.ulev/Documents/absa/checkpoints/pyabsa_checkpoints/fast_lsa_t_Crypto_acc_79.74_f1_80.21'
sent_classifier = APCCheckpointManager.get_sentiment_classifier(checkpoint=checkpoint_name)

Error:

Traceback (most recent call last):
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/pyabsa/core/apc/prediction/sentiment_classifier.py", line 71, in __init__
    self.model = torch.load(model_path, map_location='cpu')
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/torch/serialization.py", line 712, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/torch/serialization.py", line 1046, in _load
    result = unpickler.load()
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/transformers/models/deberta_v2/tokenization_deberta_v2.py", line 324, in __setstate__
    self.spm.Load(self.vocab_file)
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/sentencepiece/__init__.py", line 367, in Load
    return self.LoadFromFile(model_file)
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/sentencepiece/__init__.py", line 171, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
OSError: Not found: "/home/ubuntu/.cache/huggingface/transformers/6386fc34376768db39488179803c16268ff12ee177a43a993690f66b7d7a0b7c.0abaeacf7287ee8ba758fec15ddfb4bb6c697bb1a8db272725f8aa633501787a": No such file or directory Error #2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/petar.ulev/Documents/absa/scripts/pyabsa_scripts/pyasba_main.py", line 5, in <module>
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/pyabsa/utils/pyabsa_utils.py", line 173, in decorated
    return f(*args, **kwargs)
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/pyabsa/functional/checkpoint/checkpoint_manager.py", line 71, in get_sentiment_classifier
    sent_classifier = SentimentClassifier(checkpoint, sentiment_map=sentiment_map, eval_batch_size=eval_batch_size)
  File "/Users/petar.ulev/Documents/absa/venv/lib/python3.8/site-packages/pyabsa/core/apc/prediction/sentiment_classifier.py", line 101, in __init__
    raise RuntimeError('Fail to load the model from {}! \nException: {} '.format(e, model_arg))
RuntimeError: Fail to load the model from Not found: "/home/ubuntu/.cache/huggingface/transformers/6386fc34376768db39488179803c16268ff12ee177a43a993690f66b7d7a0b7c.0abaeacf7287ee8ba758fec15ddfb4bb6c697bb1a8db272725f8aa633501787a": No such file or directory Error #2!
Exception: /Users/petar.ulev/Documents/absa/checkpoints/pyabsa_checkpoints/fast_lsa_t_Crypto_acc_79.74_f1_80.21

yangheng95 commented 2 years ago

For this problem, you can try saving the state_dict instead of the whole model, i.e., save_mode=1.

pepi99 commented 2 years ago

So I should train my model again, but with save_mode=1? What is the difference?

But I don't understand, if I load the model in the pyabsa demos folder, it works. If I extract it to my specific project, it does not work.

yangheng95 commented 2 years ago

Yes, retrain it. I suspect this problem is triggered by the cache function of the transformers package: the saved model records cache info that depends on the training environment. Saving the state_dict does not store this cache file info.
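The mechanism described above can be illustrated without PyABSA or transformers. The following is a minimal sketch using only the standard library, not the actual tokenizer code: `PathBoundTokenizer` is a hypothetical stand-in for an object that, like the tokenizer in the traceback, reopens a file in `__setstate__` using a path recorded at save time. Pickling the whole object fails once that path no longer exists, while pickling only the plain data (roughly what saving a state_dict does) still loads.

```python
import os
import pickle
import tempfile


class PathBoundTokenizer:
    """Hypothetical stand-in for a tokenizer that reloads a vocab file when unpickled."""

    def __init__(self, vocab_file):
        self.vocab_file = vocab_file
        self.vocab = self._read(vocab_file)

    @staticmethod
    def _read(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read().split()

    def __getstate__(self):
        # Only the absolute path is pickled, not the vocabulary itself.
        return {"vocab_file": self.vocab_file}

    def __setstate__(self, state):
        # Unpickling reopens the file at the path recorded at save time.
        self.vocab_file = state["vocab_file"]
        self.vocab = self._read(self.vocab_file)


def demo():
    workdir = tempfile.mkdtemp()
    vocab_path = os.path.join(workdir, "vocab.txt")
    with open(vocab_path, "w", encoding="utf-8") as handle:
        handle.write("good bad neutral")

    tokenizer = PathBoundTokenizer(vocab_path)
    whole_object = pickle.dumps(tokenizer)       # analogous to saving the whole model
    plain_state = pickle.dumps(tokenizer.vocab)  # analogous to saving only a state_dict

    # Simulate moving the checkpoint to a machine without the original cache file.
    os.remove(vocab_path)
    os.rmdir(workdir)

    try:
        pickle.loads(whole_object)
        whole_object_loaded = True
    except OSError:
        # FileNotFoundError is a subclass of OSError.
        whole_object_loaded = False

    restored_vocab = pickle.loads(plain_state)
    return whole_object_loaded, restored_vocab


if __name__ == "__main__":
    print(demo())
```

This would also explain why the same checkpoint loads on the training machine: there, the recorded cache path still exists.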

pepi99 commented 2 years ago

Will there be a performance difference between save_mode=1 and save_mode=2 when I load it?

yangheng95 commented 2 years ago

There is no evidence of a performance difference.

pepi99 commented 2 years ago

So basically save_mode=2 is unusable if it gives this error?

pepi99 commented 2 years ago

Can I just transfer the training-environment-dependent cache info? How can I do it?

yangheng95 commented 2 years ago

If you need to transfer a checkpoint, please use save_mode=1. The cache function involves a filename hash, and I have no idea how to transfer this info.
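For context on the filename hash: the cache file in the traceback is named as two 64-character hex strings joined by a dot. As far as I recall, older transformers releases derived such names from the SHA-256 of the download URL and the SHA-256 of the server's ETag. The sketch below reproduces that scheme under that assumption; `cache_filename` is a hypothetical helper, not the library's function, and the URL and ETag are made up.

```python
from hashlib import sha256


def cache_filename(url, etag=None):
    """Sketch of the assumed naming scheme: sha256(url), then '.' + sha256(etag) if given."""
    name = sha256(url.encode("utf-8")).hexdigest()
    if etag:
        name += "." + sha256(etag.encode("utf-8")).hexdigest()
    return name


# Hypothetical URL and ETag, for illustration only.
example = cache_filename("https://example.invalid/spm.model", etag='"abc123"')
print(example)
```

If the scheme works this way, the name depends on the exact URL and ETag seen at download time, and the saved path also includes the training user's home directory, so it is not something that can simply be copied into a different environment.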

pepi99 commented 2 years ago

Ok, I understand. Thank you. I will try now; the model will train overnight, and I will let you know tomorrow if everything works.

pepi99 commented 2 years ago

Ok, it works now. There is a slight difference in test performance (but I suppose this comes from the retraining). Thanks.

yangheng95 / PyABSA

Fail to load model #145