Open ugermann opened 1 year ago
Hi @ugermann, apologies, this issue was missed. Could you provide an example YAML that you tried, so that we can see if anything is missing on our side? With the latest release, I am unable to reproduce this. For example, src_vocab_size: 32001 in my input params.yaml is the same as src_vocab_size: 32001 in model_dir/train/params_train.yaml, so we're seeing that src_vocab_size is not increased.
Ideally, I should be able to run, evaluate, or continue training a trained model with the parameters stored as ${model_dir}/params.yaml. However, the vocabulary size is increased by "extra_ids" (and the increased value is then stored in params.yaml!) to accommodate sentinels for unsupervised T5 training. This makes it impossible to use params.yaml for downstream use of trained models.
Fix: make sure the "extra_ids" in params.yaml is taken into account when sanity-checking the vocabulary size.
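To illustrate the proposed fix, here is a minimal sketch of a vocabulary-size check that tolerates a size that has already been expanded by "extra_ids". This is not the project's actual code: the function name resolve_vocab_size, its parameters, and the numbers in the usage lines are all hypothetical and chosen only for illustration.

```python
def resolve_vocab_size(configured_size: int, base_vocab_size: int, extra_ids: int = 0) -> int:
    """Return the effective vocabulary size, or raise if the config is inconsistent.

    Hypothetical helper, not part of any real library.
    configured_size: the vocab size read from params.yaml
    base_vocab_size: the number of entries in the vocabulary file
    extra_ids:       the number of sentinel tokens added for T5-style training
    """
    expanded_size = base_vocab_size + extra_ids

    # Case 1: params.yaml was written by a previous run and already
    # includes the sentinels. Accept it without adding extra_ids again.
    if configured_size == expanded_size:
        return expanded_size

    # Case 2: params.yaml is a fresh user config that holds only the
    # base size. Expand it once.
    if configured_size == base_vocab_size:
        return expanded_size

    raise ValueError(
        f"vocab size {configured_size} matches neither the base size "
        f"{base_vocab_size} nor the expanded size {expanded_size}"
    )


# Illustrative values only: a fresh config and a stored config
# should resolve to the same effective size.
fresh = resolve_vocab_size(32000, base_vocab_size=32000, extra_ids=100)
stored = resolve_vocab_size(32100, base_vocab_size=32000, extra_ids=100)
```

With a check of this shape, the params.yaml stored in the model directory could be fed back in unchanged, since the already-expanded size is accepted rather than rejected or expanded a second time.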