Closed alphaGem closed 2 years ago
However, it seems that if I change the vocab size in my local cached config to 50258, the model still doesn't work correctly: outputs[:,:,-1] is all zeros, which is significantly larger than all the other values. Using the slice outputs[:,:,:-1] as the real output seems to solve the problem.
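The symptom above can be reproduced with plain NumPy (a minimal sketch; the shapes and value range are illustrative, not taken from the actual model — real GPT-2 logits are typically negative, so an all-zero extra column dominates argmax):

```python
import numpy as np

# Illustrative logits with one spurious extra vocab entry (batch=1, seq=2, vocab=50258).
logits = np.random.uniform(-10.0, -1.0, size=(1, 2, 50258))
logits[:, :, -1] = 0.0  # the extra dimension from the old checkpoint is all zeros

# Zero beats every negative logit, so every position picks the spurious index 50257.
print(np.argmax(logits, axis=-1))

# Dropping the extra dimension restores predictions within the real vocab.
trimmed = logits[:, :, :-1]
print(np.argmax(trimmed, axis=-1).max())  # < 50257
```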
This is caused by the new release; you can pull the latest code and it will work as you expect. Also make sure you delete the .cache/model_center/gpt2-base/ directory, because we updated the config JSON on the cloud as well.
I have tried the following actions, respectively:
- pip uninstall model-center, and then pip install model-center
- pip uninstall model-center, then clone the latest code and run python3 setup.py install in the code folder

Before each attempt to run my code, I delete the ~/.cache/model_center folder. However, none of the above actions solves the problem.
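For reference, the two reinstall attempts plus the cache cleanup amount to the following shell commands (the repository URL is not given in the report, so a placeholder is used):

```shell
# Attempt 1: reinstall from PyPI
pip uninstall -y model-center
pip install model-center

# Attempt 2: reinstall from source
pip uninstall -y model-center
git clone <repository-url> ModelCenter   # URL omitted in the report
cd ModelCenter
python3 setup.py install

# Before each run: clear the cached checkpoint and config
rm -rf ~/.cache/model_center
```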
Are you sure that the pre-trained GPT-2 base model on the cloud (the download path in utils/net_utils.py is https://openbmb.oss-cn-hongkong.aliyuncs.com/model_center/{path}, as far as I can see) has the correct vocab size of 50257 instead of 50258?
Sorry, we didn't update the checkpoint on the cloud before, which was not compatible with the config JSON. The vocab size should be 50257, and the previous checkpoint had an extra dimension with all zeros. The issue is now fixed; you can clean the .cache checkpoint directory and re-download the correct checkpoint by using the from_pretrained method.
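A minimal sketch of the suggested recovery, assuming the import path and from_pretrained interface follow ModelCenter's usual loading pattern (both are assumptions, not confirmed by this thread):

```python
import os
import shutil

# Remove the stale cached checkpoint and config so the fixed ones are re-downloaded.
shutil.rmtree(os.path.expanduser("~/.cache/model_center/gpt2-base"), ignore_errors=True)

# Re-download via from_pretrained (import path assumed from ModelCenter's examples);
# this triggers a fresh fetch of the corrected checkpoint and config JSON.
from model_center.model import GPT2
model = GPT2.from_pretrained("gpt2-base")
```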
This issue has been fixed
Describe the bug
Minimal steps to reproduce
Expected behavior
Successfully loads the model.
Environment:
model-center 0.1.3, torch 0.11.0, cuda 10.2
Additional information: If I change the vocab size in my local cached config to 50258, it seems to load correctly.