Cerebras / modelzoo

Apache License 2.0
970 stars · 133 forks

[modelzoo/transformers/t5/] Mismatch between vocabulary size set in config/params_<modelsize>.yaml and vocab size reported in $model_dir/params.yaml #6

Open ugermann opened 1 year ago

ugermann commented 1 year ago

Ideally, I should be able to run, evaluate, or continue training a trained model using the parameters stored in ${model_dir}/params.yaml. However, the vocabulary size is increased by "extra_ids" (and the increased size is then stored in params.yaml!) to accommodate the sentinel tokens used for unsupervised T5 training. This makes it impossible to use params.yaml directly for downstream use of trained models.

Fix: make sure the "extra_ids" stored in params.yaml is taken into account when sanity-checking the vocabulary size.
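A minimal sketch of what such a check could look like. This is purely illustrative: the function and key names (`effective_vocab_size`, `check_vocab`, `src_vocab_size`, `extra_ids`) are hypothetical and not the actual modelzoo API; the assumption is that the stored params record a vocab size inflated by the number of sentinel tokens.

```python
# Hypothetical sketch of the proposed fix: when validating a stored
# params.yaml, subtract the sentinel tokens added via "extra_ids"
# before comparing vocabulary sizes. Key/function names are illustrative.

def effective_vocab_size(params: dict) -> int:
    """Return the base vocab size, excluding T5 sentinel tokens."""
    vocab_size = params["src_vocab_size"]
    # e.g. 100 sentinel ids appended for T5 span-corruption pretraining
    extra_ids = params.get("extra_ids", 0)
    return vocab_size - extra_ids

def check_vocab(config_params: dict, stored_params: dict) -> None:
    """Sanity-check vocab sizes on the same basis."""
    expected = config_params["src_vocab_size"]
    # Strip extra_ids from the stored copy before comparing.
    actual = effective_vocab_size(stored_params)
    if expected != actual:
        raise ValueError(
            f"vocab size mismatch: config says {expected}, "
            f"stored params imply {actual} after removing extra_ids"
        )

# Example: config specifies 32000; the stored params were written as
# 32100 because 100 sentinel ids were appended during training.
check_vocab({"src_vocab_size": 32000},
            {"src_vocab_size": 32100, "extra_ids": 100})  # passes
```

With a check along these lines, a params.yaml written out after training would still validate against the original config even though its stored vocab size includes the sentinels.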

bhargav-cerebras commented 1 year ago

Hi @ugermann, apologies, this issue was missed. Could you provide an example yaml that you tried, so we can check whether anything is missing on our side? With the latest release, I am unable to reproduce this. For example, the src_vocab_size: 32001 in my input params.yaml is the same as the src_vocab_size: 32001 in model_dir/train/params_train.yaml, so we are not seeing the src_vocab_size increased.