Complete gin files for fine-tuning UL2 (Unifying Language Learning Paradigms) #1101

Open yangdong02 opened 2 years ago

yangdong02 commented 2 years ago

Hi! I am wondering which gin files are needed to fine-tune UL2. I tried the following gin file, adapted from https://github.com/google-research/t5x/blob/main/t5x/examples/t5/t5_1_1/examples/small_wmt_finetune.gin, but I ran into NameError: 't5_architecture' was not provided by an import statement. I guess we also need some gin files from Flaxformer, right? Could you provide the complete list of gin files for fine-tuning UL2? Thanks!

from __gin__ import dynamic_registration

import __main__ as train_script
from t5.data import mixtures
from t5x import models
from t5x import partitioning
from t5x import utils

include "ul2.gin"
include "t5x/configs/runs/finetune.gin"

USE_CACHED_TASKS = False
MIXTURE_OR_TASK_NAME = "wmt_t2t_ende_v003"
TASK_FEATURE_LENGTHS = {"inputs": 256, "targets": 256}
TRAIN_STEPS = 2651000
DROPOUT_RATE = 0.0
INITIAL_CHECKPOINT_PATH = "gs://scenic-bucket/ul2/ul220b/checkpoint_2650000/"
# `LOSS_NORMALIZING_FACTOR`: When fine-tuning a model that was pre-trained
# using Mesh Tensorflow (e.g. the public T5 / mT5 / ByT5 models), this should be
# set to `pretraining batch_size` * `target_token_length`. For T5 and T5.1.1:
# `2048 * 114`. For mT5: `1024 * 229`. For ByT5: `1024 * 189`.
LOSS_NORMALIZING_FACTOR = 233472
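
For context, the NameError above seems to mean that the gin file refers to t5_architecture without a corresponding import. As a rough sketch only (module paths taken from the Flaxformer T5 configs; I have not confirmed the exact set UL2 needs), the missing imports would presumably look something like this:

from __gin__ import dynamic_registration

# Flaxformer modules that define the architecture objects a UL2/T5X gin
# typically references (assumed, not confirmed for ul2.gin).
from flaxformer.architectures.t5 import t5_architecture
from flaxformer.components import dense
from flaxformer.components import embedding
from flaxformer.components import layer_norm
from flaxformer.components import relative_position_biases
from flaxformer.components.attention import dense_attention
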
kamalkraj commented 2 years ago

It would be great if we could have the complete gin config.

Is the base model config t5 or t5_1_1? @vanzytay

yhavinga commented 2 years ago

@kamalkraj There is one config file (from May '22) archived on archive.org for the GCS bucket URL: https://web.archive.org/web/20220509043347/https://storage.googleapis.com/scenic-bucket/ul2/ul220b/config.gin
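
As a rough sketch (hypothetical local file name; not verified against the checkpoint), one could download that archived config.gin and include it in place of the missing ul2.gin:

# Hypothetical fine-tune gin: include the archived UL2 config (saved locally
# as ul2_20b.gin) together with the standard T5X finetune run config.
include "ul2_20b.gin"
include "t5x/configs/runs/finetune.gin"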