Description
This PR moves all GPU memory computation from the end of each experiment to the end of the benchmarking script. Experiments no longer need to be rerun: the raw values are saved, and the aggregated values are computed once at the end, across all experiments, in `gather_report`.
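The save-raw-then-aggregate pattern described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `record_raw_memory`, `gather_report_sketch`, the `raw` dictionary layout, and the MiB readings are all hypothetical names and values chosen for the example, and the real `gather_report` may have a different signature and compute different statistics.

```python
from statistics import mean


def record_raw_memory(results, experiment, readings_mib):
    """Store unprocessed GPU memory readings (MiB) for one experiment.

    Hypothetical helper: no aggregation happens here, only storage.
    """
    results.setdefault(experiment, []).extend(readings_mib)


def gather_report_sketch(results):
    """Aggregate the raw readings of every experiment in one final pass.

    Hypothetical stand-in for the real `gather_report`.
    """
    report = {}
    for experiment, readings in results.items():
        if not readings:
            continue  # skip experiments that produced no readings
        report[experiment] = {
            "peak_mib": max(readings),
            "mean_mib": mean(readings),
        }
    return report


# During benchmarking: only raw values are saved (made-up numbers).
raw = {}
record_raw_memory(raw, "exp_a", [1000, 1200])
record_raw_memory(raw, "exp_b", [800, 900, 1000])

# At the end of the script: aggregate once across all experiments.
report = gather_report_sketch(raw)
```

Because the raw readings are kept, changing how the aggregates are computed only requires rerunning the final step, not the experiments themselves.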