kimiyoung / transformer-xl

Apache License 2.0

question on TRAIN_BSZ used in tf/scripts/text8_large_tpu.sh #107

Open lelouchmatlab opened 4 years ago

lelouchmatlab commented 4 years ago

TRAIN_BSZ=64 is used in text8_large_tpu.sh.

During training data preparation it is passed as "--per_host_train_bsz=${TRAIN_BSZ}". During training it is passed as "--train_batch_size=${TRAIN_BSZ}", and when data_utils.get_input_fn() is called it is used as "per_host_bsz=FLAGS.train_batch_size // FLAGS.num_hosts".
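
To make the mismatch concrete, here is a minimal sketch of the arithmetic. It only mirrors the two expressions quoted above. The host counts are hypothetical values chosen for illustration (I have not checked which value the script actually sets), and the helper name is made up for this example.

```python
TRAIN_BSZ = 64  # value set in text8_large_tpu.sh


def per_host_bsz_at_train_time(train_batch_size: int, num_hosts: int) -> int:
    """Mirror the expression passed to data_utils.get_input_fn() (hypothetical helper)."""
    return train_batch_size // num_hosts


for num_hosts in (1, 2, 4):  # hypothetical host counts, for illustration only
    # Data preparation: --per_host_train_bsz=${TRAIN_BSZ}
    prep_per_host = TRAIN_BSZ
    # Training: --train_batch_size=${TRAIN_BSZ}, then divided by num_hosts
    train_per_host = per_host_bsz_at_train_time(TRAIN_BSZ, num_hosts)
    print(num_hosts, prep_per_host, train_per_host, prep_per_host == train_per_host)
```

Under these assumptions the two per-host values agree only when num_hosts is 1; with more hosts, the value used at training time is smaller than the one used when the data was prepared.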

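Maybe I missed something, but this confuses me: the same value seems to be treated as the per-host batch size during data preparation and as the total batch size during training. Could anyone explain this?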