Closed danyaljj closed 4 years ago
Hi, you ought to pass in a
--model_dir="${MODEL_DIR}" \
and make sure that any checkpoint already in that directory is compatible with the pretrained model (small).
(It looks like it is defaulting to
/tmp/transformer_standalone/model.ckpt-0
which I am guessing was created by a previous run that did not use the "small" model.)
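If it helps, here is a rough sketch of how one might check a local model dir for leftover checkpoints before launching a run. The helper names (`find_stale_checkpoints`, `is_usable_model_dir`) are made up for illustration and are not part of T5. The sketch also assumes checkpoint files are named `model.ckpt-*`, as in the path above, and that the directory is on the local filesystem (a `gs://` path would need something like `tf.io.gfile` instead).

```python
import glob
import os


def find_stale_checkpoints(model_dir):
    """Return sorted paths of checkpoint files already present in a local model_dir.

    Assumes checkpoint files follow the model.ckpt-<step> naming seen above.
    """
    pattern = os.path.join(model_dir, "model.ckpt-*")
    return sorted(glob.glob(pattern))


def is_usable_model_dir(model_dir):
    """True if model_dir exists, is writable, and holds no leftover checkpoints."""
    return (
        os.path.isdir(model_dir)
        and os.access(model_dir, os.W_OK)
        and not find_stale_checkpoints(model_dir)
    )


if __name__ == "__main__":
    # The default directory mentioned in this thread.
    default_dir = "/tmp/transformer_standalone"
    for path in find_stale_checkpoints(default_dir):
        print("leftover checkpoint file:", path)
    print("usable as a fresh model dir:", is_usable_model_dir(default_dir))
```

If this lists files from an earlier run, pointing `--model_dir` at a new, empty directory is probably the simplest way to rule out a mismatch.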
I see. Just to make sure, is the following a valid syntax for referencing the model-dir?
--model_dir="gs://t5-data/pretrained_models/small/"
Also, what's the intent behind the following line?
--gin_file="gs://t5-data/pretrained_models/small/operative_config.gin"
> I see. Just to make sure, is the following a valid syntax for referencing the model-dir?

Not quite; you want to choose a model dir that you have write access to.

> Also, what's the intent behind the following line?

That loads all of the configuration for the small model (including the pre-trained checkpoint location).
When running the following command for fine-tuning:
I am getting the following error:
Here is the full error log:
For some reason when I drop the last line
--gin_file="gs://t5-data/pretrained_models/small/operative_config.gin"
it works fine, which is surprising, since I was under the impression that this line determines the pre-trained model to use (small, base, large, etc.).
Additional info: I am running it on a GPU machine, but that shouldn't be a problem, since the error happens while loading the model (before any computation).