This prevents a crash on TPU when the final chunk of the dev set doesn't fill a full batch.
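A minimal sketch of the idea, not the actual change in librispeech_ctc.py (the helper name and shapes here are made up): pad the short final chunk up to the fixed batch size and carry a per-example weight so the padded rows can be ignored downstream.

```python
import numpy as np

def pad_final_chunk(features, batch_size):
  """Pads a short final chunk of shape [n, ...] up to [batch_size, ...].

  Also returns per-example weights: 1.0 for real rows, 0.0 for padding.
  """
  n = features.shape[0]
  pad = batch_size - n
  weights = np.concatenate([np.ones(n), np.zeros(pad)]).astype(np.float32)
  if pad > 0:
    filler = np.zeros((pad,) + features.shape[1:], dtype=features.dtype)
    features = np.concatenate([features, filler], axis=0)
  return features, weights

# E.g. with a per-core batch of 96, the last dev chunk has 2703 % 96 = 15
# real utterances and 81 padded rows.
chunk = np.zeros((15, 80), dtype=np.float32)  # 15 fake 80-dim feature rows
padded, weights = pad_final_chunk(chunk, 96)
assert padded.shape == (96, 80) and weights.sum() == 15
```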
This is very wonky. I am pretty sure (though less than 90% certain) that we now do a full pass over the dev set once every 50 training steps. Essentially, in librispeech_ctc.py, eval_steps_per_loop is 5 and the batch size is 96. The number of TPUs is 8. Multiplying those together gives 3840, which is greater than 2703, the size of the dev set. I'm not 100% sure this is correct, though.
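The arithmetic spelled out, just as a sanity check of the numbers above (not code from the repo):

```python
eval_steps_per_loop = 5   # from librispeech_ctc.py
batch_size = 96           # per-core batch size
num_tpus = 8

examples_per_eval_loop = eval_steps_per_loop * batch_size * num_tpus  # 3840
dev_set_size = 2703

# 3840 >= 2703, so a single eval loop can cover the whole dev set once.
print(examples_per_eval_loop, examples_per_eval_loop >= dev_set_size)
```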
Change the job type to "executor_tpu". This means that the eval and
train jobs will both run on the TPU. We probably want this long-term,
since CPU instances are not free for us, and we rarely evaluate the
dev set anyway.
Change learning_rate to 1e-4 based on Anajali's experiments.
I'm merging this. Currently we have a biased estimator of things like loss because we divide by the padded batch size rather than the true batch size, but it seems minor enough to ignore for now.
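For reference, a toy NumPy sketch of the bias (not the training code): dividing the summed loss by the padded batch size shrinks the mean, while dividing by the number of real examples, i.e. the sum of the weight mask, does not.

```python
import numpy as np

per_example_loss = np.array([2.0, 3.0, 4.0])       # 3 real examples
weights = np.array([1.0, 1.0, 1.0, 0.0, 0.0])      # padded out to a batch of 5
padded_loss = np.concatenate([per_example_loss, np.zeros(2)])

biased = padded_loss.sum() / len(padded_loss)      # 9 / 5 = 1.8
unbiased = padded_loss.sum() / weights.sum()       # 9 / 3 = 3.0
print(biased, unbiased)
```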
Currently running an experiment here:
gs://the-peoples-speech-west-europe/training_logs/galvez/tpu_ctc_2h