google / maxtext

A simple, performant and scalable Jax LLM!
Apache License 2.0

[MLPerf][GPT3] Bypass setting eval_interval in using synthetic dataset #823

Closed ZhiyuLi-goog closed 1 month ago

ZhiyuLi-goog commented 1 month ago

With dataset_type=synthetic there is no eval_iterator, and none is needed. https://github.com/google/maxtext/blob/77f079f845084ba853453dc0755deb0daf312e26/MaxText/input_pipeline/input_pipeline_interface.py#L144

This change bypasses setting eval_interval when dataset_type=synthetic, which avoids the error caused by using a None evaluation iterator in the eval step. https://github.com/google/maxtext/blob/bdc4d8d6d4ab767d2c3ee52dbb465278111f2be9/MaxText/train.py#L567
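
For illustration only, the guard described above could look roughly like the sketch below. It is not the actual MaxText code: the helper names `resolve_eval_interval` and `should_run_eval`, and the convention that a non-positive `eval_interval` disables evaluation, are assumptions made for this example.

```python
from typing import Iterator, Optional


def resolve_eval_interval(requested_interval: int, dataset_type: str) -> int:
    """Hypothetical helper: choose the eval interval for a run.

    Assumption: a non-positive interval means "never evaluate".
    """
    if dataset_type == "synthetic":
        # The synthetic pipeline provides no eval iterator, so skip
        # setting an eval interval and leave evaluation disabled.
        return -1
    return requested_interval


def should_run_eval(
    step: int, eval_interval: int, eval_iterator: Optional[Iterator]
) -> bool:
    """Hypothetical check performed before each eval step."""
    if eval_interval <= 0 or eval_iterator is None:
        return False
    return step > 0 and step % eval_interval == 0


# Synthetic data: evaluation is never triggered, so the None iterator is never used.
interval = resolve_eval_interval(100, "synthetic")
print(should_run_eval(100, interval, None))
```

With a real dataset the requested interval passes through unchanged, so the eval step still runs on schedule.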