mlcommons / cm4mlops

A collection of portable, reusable and cross-platform automation recipes (CM scripts) with a human-friendly interface and minimal dependencies to make it easier to build, run, benchmark and optimize AI, ML and other applications and systems across diverse and continuously changing models, data sets, software and hardware (cloud/edge)
http://docs.mlcommons.org/cm4mlops/
Apache License 2.0

Feature request: Checkpoint labels for CM builds #63

Open WarrenSchultz opened 1 month ago

WarrenSchultz commented 1 month ago

Given the highly dynamic nature of ML testing development, it would be an extremely useful feature to be able to pull labeled builds, either to guarantee repeatability across multiple systems or to go back and replicate older test conditions on different hardware at a later date. (If this is already supported, then the request is for more detailed documentation on how to do it :)

arjunsuresh commented 4 weeks ago

We'll surely add more labels and releases starting next month, once we have covered all of the Nvidia and Intel MLPerf implementations in CM. Even now, we do support git checkout with CM; this is shown in our 4.0 submission README.
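As a rough illustration of what pinning a dependency to a fixed branch or commit can look like through CM's Python API, here is a minimal sketch. It assumes the `cmind` package is installed and the automation recipes are registered; the tags and variations shown (`get,git,repo` with `_repo.`/`_branch.`/`_sha.`) follow the get-git-repo automation and may differ in your CM version, so treat them as assumptions rather than the exact interface.

```python
# Sketch: pinning a git dependency to a fixed branch/commit via the CM Python API.
# Assumptions: `cmind` is installed (pip install cmind) and cm4mlops is registered;
# the tags and variations below mirror the get-git-repo automation and may need
# adjusting for your CM version.
import cmind

r = cmind.access({
    'action': 'run',
    'automation': 'script',
    'tags': 'get,git,repo,'
            '_repo.https://github.com/mlcommons/inference,'
            '_branch.master',   # or a '_sha.<commit>' / '_tag.<tag>' variation to pin exactly
    'out': 'con'                # stream console output
})

if r['return'] > 0:
    raise RuntimeError(r.get('error', 'CM script failed'))

# The returned dictionary carries the script's output environment; the exact
# key names are version-dependent, so just inspect what came back.
print(r.get('new_env', {}))
```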

Further, CM for MLPerf inference also generates a version-information file that records the versions of the dependencies in use. Since the submissions, we have also added the git commit hash for repositories where non-release branches are used: https://github.com/mlcommons/inference_results_v4.0/blob/main/closed/CTuning/measurements/GATE_Overflow_Intel_Sapphire_Rapids-nvidia_original-gpu-tensorrt-vdefault-default_config/resnet50/offline/cm-version-info.json
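For checking repeatability across systems, a small script like the sketch below could diff two such cm-version-info.json files. It makes no assumption about the file's schema beyond being JSON: the nested structure is flattened into dot-path/value pairs and compared, and the file paths are illustrative.

```python
# Sketch: comparing two cm-version-info.json files to spot dependency drift
# between systems or between runs. No particular schema is assumed beyond JSON;
# nested dicts/lists are flattened to "dot-path -> value" pairs and diffed.
import json
import sys

def flatten(obj, prefix=''):
    """Flatten nested dicts/lists into {dot-path: scalar} pairs."""
    items = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            items.update(flatten(v, f'{prefix}{k}.'))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            items.update(flatten(v, f'{prefix}{i}.'))
    else:
        items[prefix.rstrip('.')] = obj
    return items

def diff_version_info(path_a, path_b):
    with open(path_a) as fa, open(path_b) as fb:
        a, b = flatten(json.load(fa)), flatten(json.load(fb))
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            print(f'{key}:\n  A: {a.get(key)}\n  B: {b.get(key)}')

if __name__ == '__main__':
    # e.g. python diff_versions.py run1/cm-version-info.json run2/cm-version-info.json
    diff_version_info(sys.argv[1], sys.argv[2])
```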

Please let us know if you have any feedback or further requirements.

WarrenSchultz commented 4 weeks ago

Thanks, let me take a look at this.