The `smaller_final_batch_mode` parameter is ignored during training because the dataset is "infinite" (samples are yielded by a generator), so a smaller final batch never occurs and there is nothing to drop. During inference, users will rarely want to specify whether to use `pad` or `dynamic`.
Should this parameter be removed from the training config entirely and hardcoded to `pad` for inference?
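For context, a minimal sketch of what the three modes are assumed to mean when a dataset does not divide evenly by the batch size; the function name and exact padding strategy here are illustrative, not the library's actual implementation:

```python
def batches(samples, batch_size, smaller_final_batch_mode="pad"):
    """Yield batches, handling a smaller final batch per the given mode.

    Assumed semantics (illustrative only):
      "dynamic": yield the final batch at its natural, smaller size
      "pad":     repeat samples so every batch has exactly batch_size items
      "drop":    discard the smaller final batch -- a no-op for an
                 infinite generator-backed training dataset
    """
    for i in range(0, len(samples), batch_size):
        batch = list(samples[i:i + batch_size])
        if len(batch) < batch_size:
            if smaller_final_batch_mode == "drop":
                return
            if smaller_final_batch_mode == "pad":
                # cycle existing samples to fill the batch to full size
                shortfall = batch_size - len(batch)
                batch.extend(batch[j % len(batch)] for j in range(shortfall))
        yield batch
```

With 5 samples and `batch_size=2`, `"pad"` yields `[[0, 1], [2, 3], [4, 4]]` while `"dynamic"` yields `[[0, 1], [2, 3], [4]]`, which is why the choice only matters for the final, uneven batch.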
@wyli