Open saggu opened 2 years ago
Hi,
I am trying to fine-tune the v2 version of the base model, downloaded from here.
This is the command I am using:
t5_mesh_transformer \
  --model_dir="/tmp/model_out" \
  --gin_param="utils.run.mesh_devices = ['gpu:0','gpu:1']" \
  --gin_param="utils.run.train_dataset_fn = @t5.models.mesh_transformer.tsv_dataset_fn" \
  --gin_param="utils.run.mesh_shape = 'model:1,batch:2'" \
  --gin_param="tsv_dataset_fn.filename = 'train.tsv'" \
  --gin_file="operative_config.gin" \
  --gin_param="run.train_steps = 1260900"
I set the initial checkpoint in the operative_config.gin file as follows:
init_checkpoint = '/path/to/downloaded/v2/base/model.ckpt-1250900'
I want to train for 10000 additional steps beyond that checkpoint (1250900 + 10000), hence
--gin_param="run.train_steps = 1260900"
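The step count above can be double-checked with a small sketch. It assumes that the number at the end of the checkpoint name is the step at which the checkpoint was saved, and that run.train_steps is the absolute final step rather than a count of extra steps. The helper name finetune_train_steps is hypothetical and not part of the t5 package.

```python
import re


def finetune_train_steps(checkpoint_path: str, extra_steps: int) -> int:
    """Return the absolute step to train to, given a checkpoint and extra steps.

    Assumption: the checkpoint path ends in 'ckpt-<step>'.
    """
    match = re.search(r"ckpt-(\d+)$", checkpoint_path)
    if match is None:
        raise ValueError(f"No step number found in {checkpoint_path!r}")
    return int(match.group(1)) + extra_steps


# Path copied from the post; 10000 is the intended number of fine-tuning steps.
total = finetune_train_steps("/path/to/downloaded/v2/base/model.ckpt-1250900", 10000)
print(total)  # 1260900
```

Under these assumptions, 1260900 means 10000 optimizer steps on top of the pre-trained checkpoint, which is not the same as 10000 passes over train.tsv.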
Given the task, is this the right setup?
I am seeing poor performance: at most 8% accuracy after training.
I am attaching the train.tsv and operative_config.gin files (Archive.zip).
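One quick sanity check on the data is to confirm that every line of train.tsv is well-formed. The sketch below assumes each line should contain exactly one input and one target separated by a single tab; that two-column layout is an assumption here, not something confirmed in the post. The function name find_bad_tsv_lines is hypothetical.

```python
from typing import Iterable, List, Tuple


def find_bad_tsv_lines(lines: Iterable[str], expected_columns: int = 2) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines with the wrong column count or an empty field."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        fields = text.split("\t")
        if len(fields) != expected_columns or any(not field.strip() for field in fields):
            bad.append((number, text))
    return bad


# Made-up sample rows for illustration only; they are not taken from the attached file.
sample = [
    "question one\tanswer one\n",
    "missing target\n",
    "a\tb\tc\n",
    "\tempty input\n",
]
for number, text in find_bad_tsv_lines(sample):
    print(number, repr(text))
```

To check the real file, pass an open file handle instead of the sample list, for example `find_bad_tsv_lines(open("train.tsv", encoding="utf-8"))`.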
Any help is appreciated. Thanks