argonne-lcf / dlio_benchmark

An I/O benchmark for deep learning applications
https://dlio-benchmark.readthedocs.io
Apache License 2.0
65 stars 30 forks

fix last step is not executed #236

Closed: rayandrew closed 3 weeks ago

rayandrew commented 3 weeks ago

If we have 730 steps, the DLIO benchmark only executes up to step 729, so the last step is skipped.

The bug also occurs when the user specifies total_training_steps.

Fix: #235

rayandrew commented 3 weeks ago

There is one more bug: if total_training_steps is not specified, it defaults to -1. I added a check for that case as well.
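
For readers following the thread, here is a minimal sketch of the kind of off-by-one and sentinel handling discussed above. It is not the actual DLIO benchmark code: the function `count_executed_steps`, its parameters, and the `buggy` switch are hypothetical names made up for illustration, and the real loop in the repository may be structured differently. The only facts taken from this thread are that the last step was skipped (730 steps ran as 729) and that total_training_steps defaults to -1 when unset.

```python
def count_executed_steps(steps_in_epoch, total_training_steps=-1, buggy=False):
    """Count how many steps a simplified training loop would run.

    Hypothetical illustration only; not the DLIO implementation.
    """
    # Assumption: a non-positive value (the default -1) means "no explicit limit",
    # so the sentinel must be checked before it is used as an upper bound.
    limit = steps_in_epoch
    if total_training_steps > 0:
        limit = min(steps_in_epoch, total_training_steps)

    executed = 0
    for step in range(1, steps_in_epoch + 1):  # steps are numbered from 1
        # The buggy comparison (>=) stops before running the final step;
        # the corrected comparison (>) stops only after the limit is passed.
        should_stop = step >= limit if buggy else step > limit
        if should_stop:
            break
        executed += 1
    return executed
```

With these assumptions, calling the function with `steps_in_epoch=730` and `buggy=True` counts 729 steps, while the corrected comparison counts all 730. Passing `total_training_steps=100` shows the same one-step gap (99 versus 100), and leaving it at -1 falls back to the full epoch length.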

rayandrew commented 3 weeks ago

I think the last commit fixed the CI @hariharan-devarajan

hariharan-devarajan commented 3 weeks ago

@zhenghh04 This is ready for merge as well.