Hi! It should work (examples for Linux systems)
from transformers import AutoModel

# Save model to directory:
model.save_pretrained("./my_model_directory/")
# Load model from directory:
model = AutoModel.from_pretrained("./my_model_directory/")
####################### OR ###########################
# Cache model to directory:
model = AutoModel.from_pretrained("bert-base-chinese", cache_dir="./my_model_directory/")
Thanks! That is useful for me. One more question: I found that the files saved in "./my_model_directory/" include config.json, pytorch_model.bin, and other .json files, but not vocab.txt. Is that expected?
I think vocab.txt is only needed for the tokenizer. The tokenizer can also be saved in the same directory.
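For example, a minimal sketch that saves and reloads both pieces side by side (the directory path is just an illustration):

from transformers import AutoModel, AutoTokenizer

# Download once, then save the model weights/config and the tokenizer
# files (including vocab.txt) into the same directory:
model = AutoModel.from_pretrained("bert-base-chinese")
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model.save_pretrained("./my_model_directory/")
tokenizer.save_pretrained("./my_model_directory/")

# Later, load both back from that directory:
model = AutoModel.from_pretrained("./my_model_directory/")
tokenizer = AutoTokenizer.from_pretrained("./my_model_directory/")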
When you do
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model_Bert = AutoModel.from_pretrained("bert-base-chinese")
this should cache the files in your local cache; it shouldn't redownload the files every time. You shouldn't need to specify a local folder in which to save them.
How are you identifying that the files are redownloaded every time?
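One way to check is to force loading from the local cache only and see whether it succeeds without network access. A minimal sketch, assuming a transformers version recent enough to support the local_files_only flag:

from transformers import AutoModel, AutoTokenizer

# First run: downloads the files and stores them in the default cache.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model_Bert = AutoModel.from_pretrained("bert-base-chinese")

# Subsequent runs: with local_files_only=True, from_pretrained raises an
# error instead of downloading, so a successful load confirms the files
# are served from the cache and not redownloaded.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese", local_files_only=True)
model_Bert = AutoModel.from_pretrained("bert-base-chinese", local_files_only=True)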
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
this should cache the files in your local cache; it shouldn't redownload the files every time. You shouldn't need to specify a local folder in which to save them.
Model cards are updated frequently, so people re-download them a lot!
📚 Migration
Great project! I can run the code successfully:
But every time I run my code, the download process is repeated, which is time-consuming. So I wonder: how can I load the pre-trained models from my local directory instead of downloading them every time?