ag1988 closed this issue 3 years ago.
Hi there 👋
train_batch_size=64 and 100k steps, on the following training sets (a rough fine-tuning sketch follows the list):

"narrativeqa",
"ai2_science_middle", "ai2_science_elementary",
"arc_hard", "arc_easy",
"mctest_corrected_the_separator",
"squad1_1", "squad2",
"boolq",
"race_string",
"openbookqa",
This was very helpful! Thank you so much for the prompt response!
Regards, Ankit
On Sun, Mar 28, 2021 at 3:53 PM, Daniel Khashabi wrote:
Closed #15 https://github.com/allenai/unifiedqa/issues/15.
Hey Daniel, btw did you also use batch size 64 for the base and small models?
The middle numbers here are the batch sizes:
```python
# model_parallelism, train_batch_size, keep_checkpoint_max for different model sizes.
HYPERPARAMETER_DICT = {
    "small": (1, 256, 40),
    "base": (2, 128, 28),
    "large": (8, 64, 24),
    "3B": (8, 16, 30),
    "11B": (8, 8, 30),
}
```
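So, for example, reading the large-model entry from that dict:

```python
# The tuple order is (model_parallelism, train_batch_size, keep_checkpoint_max).
model_parallelism, train_batch_size, keep_checkpoint_max = HYPERPARAMETER_DICT["large"]
print(train_batch_size)  # 64, matching the batch size mentioned above
# "base" and "small" use larger batches: 128 and 256 respectively.
```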
Thanks again Daniel :-)
Hi, thank you for sharing your wonderful work and models. I am trying to reproduce your training from T5 using the preprocessed datasets that you've provided. In the paper you say you fine-tuned t5-11b with batch size 8 for 100k steps. I don't have the compute for this, so I am only fine-tuning t5-large. I have the following questions:
Thank you, Ankit
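P.S. For reference, this is roughly how I'm reading the preprocessed files on my end. I'm assuming each line of train.tsv is an input and a target separated by a single tab, and the path below is just an example, so please correct me if the format is different:

```python
# Minimal sanity check of the preprocessed data layout
# (assumed: one "input<TAB>target" pair per line).
def read_unifiedqa_tsv(path):
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            source, target = line.rstrip("\n").split("\t", 1)
            examples.append({"input": source, "target": target})
    return examples

examples = read_unifiedqa_tsv("data/squad1_1/train.tsv")
print(len(examples), examples[0])
```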