mlcommons / algorithmic-efficiency

MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
https://mlcommons.org/en/groups/research-algorithms/
Apache License 2.0

Add flag to completely opt out of checkpointing #705

Closed priyakasimbeg closed 5 months ago

priyakasimbeg commented 5 months ago

Description

Currently, the --save_checkpoints flag only controls whether intermediate checkpoints are saved. Our checkpointing code is not compatible with all possible submissions, so we need a flag that disables checkpointing completely.

We should also clarify in the technical documentation that during scoring we will not use checkpoints.
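A minimal sketch of the distinction described above, not the repository's actual implementation. It uses the standard-library argparse rather than the project's own flag handling. The flag name --disable_checkpointing and the helpers parse_bool, build_parser and should_write_checkpoint are hypothetical, and the sketch assumes that a final checkpoint is still written when --save_checkpoints is off.

```python
import argparse


def parse_bool(value: str) -> bool:
    """Parse a command-line string such as 'true' or 'false' into a bool."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Existing flag from the issue: governs intermediate checkpoints only.
    parser.add_argument("--save_checkpoints", type=parse_bool, default=True)
    # Hypothetical name for the proposed opt-out flag.
    parser.add_argument("--disable_checkpointing", action="store_true")
    return parser


def should_write_checkpoint(args: argparse.Namespace, is_final_step: bool) -> bool:
    """Decide whether a checkpoint should be written at this step."""
    # The proposed opt-out overrides everything else.
    if args.disable_checkpointing:
        return False
    # Assumption: the final checkpoint is written regardless of --save_checkpoints.
    if is_final_step:
        return True
    return args.save_checkpoints
```

With this split, --save_checkpoints=false still leaves the final checkpoint in place, while the opt-out flag skips every checkpoint write, which is what a submission incompatible with the checkpointing code would need.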