AI-Hypercomputer / maxtext

A simple, performant and scalable Jax LLM!
Apache License 2.0
1.47k stars 275 forks

Add block_until_ready operation before checkpoint saving operation. #871

Closed: abhinavclemson closed this 2 weeks ago

abhinavclemson commented 2 weeks ago

Add a block_until_ready operation before the checkpoint save operation, so that the checkpoint blocking time measured on the Orbax side is accurate.
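For context, here is a minimal sketch of why the extra call matters. JAX dispatches work asynchronously, so a timer wrapped around a save call can also count the time spent waiting for the train step still in flight. This is not the code from this PR: `fake_save` and `timed_save` are hypothetical names, and `fake_save` is a NumPy stand-in, not the Orbax API. Only `jax.block_until_ready` is a real JAX call.

```python
import os
import tempfile
import time

import jax
import jax.numpy as jnp
import numpy as np


def fake_save(state, path):
    """Hypothetical stand-in for a checkpoint save; NOT the Orbax API."""
    leaves = jax.tree_util.tree_leaves(state)
    np.savez(path, *[np.asarray(leaf) for leaf in leaves])


def timed_save(state, path):
    """Return (wait_seconds, save_seconds) for a single checkpoint save."""
    # Drain pending asynchronous computation first, so that its cost is
    # not attributed to the save call that follows.
    t0 = time.perf_counter()
    state = jax.block_until_ready(state)
    wait_seconds = time.perf_counter() - t0

    # Only the save itself is timed here.
    t1 = time.perf_counter()
    fake_save(state, path)
    save_seconds = time.perf_counter() - t1
    return wait_seconds, save_seconds


# Toy "train state": the matmul is dispatched asynchronously.
ones = jnp.ones((256, 256))
state = {"w": ones @ ones, "step": jnp.array(3)}

with tempfile.TemporaryDirectory() as tmp_dir:
    ckpt_path = os.path.join(tmp_dir, "ckpt.npz")
    wait_s, save_s = timed_save(state, ckpt_path)
    saved_ok = os.path.exists(ckpt_path)

print(f"waited {wait_s:.6f}s for pending compute, save took {save_s:.6f}s")
```

Reporting the two durations separately makes it visible how much of the apparent "save" time was really outstanding device computation.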