Closed mohammedayub44 closed 4 years ago
You need to pass save_model a Path to the directory you wish to save it to.
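For reference, a minimal sketch of building such a Path. The temporary directory and the `model_out` name are only illustrations, and `learner` is assumed to be an already-trained fast-bert learner, so the actual call is left as a comment. Judging by the `save_model` source quoted later in this thread, the method calls `path.mkdir(...)`, which a plain string does not provide.

```python
from pathlib import Path
import tempfile

# Hypothetical location; replace with your own output directory.
base_dir = Path(tempfile.mkdtemp())
save_path = base_dir / "model_out"

# A Path object has .mkdir(); a plain str does not.
save_path.mkdir(parents=True, exist_ok=True)
print(save_path.is_dir())

# With a trained learner (assumed, not defined here):
# learner.save_model(save_path)
```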
@aaronbriel I did try passing the path to the folder, and I get the same error.
Is it compulsory to pass the path? If I don't pass any path to this function (as shown in the sample notebook), I see that it picks up the "OUTPUTDIR" location (maybe from the learner object) and automatically creates a "model_output" folder. It writes the following 4 files:
pytorch_model.bin
special_tokens_map.json
config.json
tokenizer_config.json
But it fails to create vocab.txt, as mentioned in the documentation. Do you see the same issue on your end?
Thanks !
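A quick way to confirm which files actually made it to disk, as a rough sketch. The helper name `missing_files` is hypothetical, and the expected list is simply the four files named above plus vocab.txt; other tokenizers may write different vocabulary files.

```python
from pathlib import Path

# File names taken from the list above, plus the vocab.txt that was not written.
EXPECTED_FILES = [
    "pytorch_model.bin",
    "special_tokens_map.json",
    "config.json",
    "tokenizer_config.json",
    "vocab.txt",
]


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]
```

Calling `missing_files(...)` on the output folder returns a list of the names that are absent, so an empty list means every expected file was saved.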
I am not seeing this issue. What is the actual error?
It just shows KeyError, which is strange.
I verified my tokenizer name, model name, etc. Everything works fine except saving the model.
@aaronbriel Here is the Dropbox link, which contains the Python notebook file along with the data and labels folders (training for only one epoch). Could you let me know whether this runs and saves the model successfully on your machine, or whether you get any errors?
Appreciate the help!
I also faced the same issue when trying out the example at https://github.com/kaushaltrivedi/fast-bert/blob/master/test/multi_class.ipynb. A TypeError is raised while saving the tokenizer.
def save_model(self, path=None):
    if not path:
        path = self.output_dir/'model_out'
    path.mkdir(exist_ok=True)
    torch.cuda.empty_cache()
    # Save a trained model
    model_to_save = self.model.module if hasattr(self.model, 'module') else self.model  # Only save the model it-self
    model_to_save.save_pretrained(path)
    # save the tokenizer
    self.data.tokenizer.save_pretrained(path)
Note: The issue occurs with the bert and roberta models. It works fine for xlnet. Please let us know. Thanks a ton.
Same here, I also have this issue.
Hi Aaron, Please check below trace.
Sorry about the delay. I was indeed able to replicate the error with my current implementation using the latest fast-bert:
  File "/home/bert/.venv/lib/python3.6/site-packages/fast_bert/learner_util.py", line 128, in save_model
    self.data.tokenizer.save_pretrained(path)
  File "/home/bert/.venv/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 605, in save_pretrained
    vocab_files = self.save_vocabulary(save_directory)
  File "/home/bert/.venv/lib/python3.6/site-packages/transformers/tokenization_utils.py", line 1994, in save_vocabulary
    files = self._tokenizer.save(save_directory)
  File "/home/bert/.venv/lib/python3.6/site-packages/tokenizers/implementations/base_tokenizer.py", line 222, in save
    return self._tokenizer.model.save(directory, name=name)
TypeError
So is there any way to work around this issue? Or could we use an older version?
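One possible workaround, sketched under an assumption this thread does not confirm: that the TypeError is raised because the tokenizer's save call receives a `pathlib.Path` where it expects a `str`. The helper name `save_tokenizer_as_str` and the `StubTokenizer` class are hypothetical and exist only to make the example runnable; with fast-bert you would pass `learner.data.tokenizer` instead of the stub.

```python
from pathlib import Path
import tempfile


def save_tokenizer_as_str(tokenizer, path):
    """Create the directory, then hand save_pretrained a plain str path."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return tokenizer.save_pretrained(str(path))


class StubTokenizer:
    """Stand-in that mimics the assumed failure mode: it rejects non-str paths."""

    def save_pretrained(self, save_directory):
        if not isinstance(save_directory, str):
            raise TypeError("save_directory must be a str")
        vocab_file = Path(save_directory) / "vocab.txt"
        vocab_file.write_text("[PAD]\n[UNK]\n")
        return (str(vocab_file),)


# Hypothetical output location inside a temporary directory.
out_dir = Path(tempfile.mkdtemp()) / "model_out"
saved = save_tokenizer_as_str(StubTokenizer(), out_dir)
print(saved)
```

If the assumption holds, the same conversion could be applied before calling `save_pretrained` on the real tokenizer. If it does not, this changes nothing, and pinning an earlier release is the safer option.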
I wasn't seeing this before the last build, so you could try with that. I was going to look into it more tomorrow.
Cool, thank you very much!
@aaronbriel Sorry for the late reply. Thanks for looking into it. Looking forward to your solution.
FYI, fixed in https://github.com/kaushaltrivedi/fast-bert/pull/205
Thanks @aaronbriel Will check it out. Closing this for now.
I'm still facing this same issue. Even with the latest fix.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-80-3ed6ed15c468> in <module>
----> 1 learner.save_model()
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fast_bert/learner_util.py in save_model(self, path)
126 self.model.module if hasattr(self.model, "module") else self.model
127 ) # Only save the model it-self
--> 128 model_to_save.save_pretrained(path)
129
130 # save the tokenizer
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/tokenization_utils.py in save_pretrained(self, save_directory)
603 f.write(out_str)
604
--> 605 vocab_files = self.save_vocabulary(save_directory)
606
607 return vocab_files + (special_tokens_map_file, added_tokens_file)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/transformers/tokenization_utils.py in save_vocabulary(self, save_directory)
1992 def save_vocabulary(self, save_directory):
1993 if os.path.isdir(save_directory):
-> 1994 files = self._tokenizer.save(save_directory)
1995 else:
1996 folder, file = os.path.split(os.path.abspath(save_directory))
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/tokenizers/implementations/base_tokenizer.py in save(self, directory, name)
220 The name of the tokenizer, to be used in the saved files
221 """
--> 222 return self._tokenizer.model.save(directory, name=name)
TypeError:
@kaushaltrivedi did you publish? The code is in place, but fast-bert has not yet been updated on PyPI.
@sidPN you can use my fork until the fast-bert library is updated (git://github.com/aaronbriel/fast-bert@master#egg=fast_bert) or just download it locally.
I'm trying to run the multilabel classification model, and while saving the model it gives me an error on the vocab file.
learner.save_model()
gives the error below. Is this because I have not specified a path, or because I'm not using a local pretrained model path as in the sample notebook?
My learner config is as below:
DataBunchConfig as below:
Any help appreciated. Thanks!