For the 3D-UNet SingleStream scenario, LoadGen currently lacks equal issue mode support, which is problematic for the following reasons:
The 3D-UNet KiTS19 input set has 42 samples with 15 different shapes; the total voxel count of each sample ranges from 7.8 million to 64 million.
The first 1050 samples drawn with the RNG seed for the v2.0 submission form a sequence whose total voxel count is 33,341,833,216.
With equal issue mode, the first 1050 samples give a total voxel count of 34,000,076,800.
If one produces logs satisfying min_query_count=1024 for SingleStream, the work done is about 2% less than it should be, and the performance metric (90th percentile latency) would be correspondingly optimistic.
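The "about 2%" figure follows directly from the two totals quoted above. As a quick check, here is a minimal sketch of the arithmetic; the function and variable names are only illustrative:

```python
# Totals quoted above, in voxels, for the first 1050 samples.
seeded_total = 33_341_833_216       # v2.0 RNG seed, without equal issue mode
equal_issue_total = 34_000_076_800  # with equal issue mode

def work_deficit(actual, expected):
    """Fraction of the expected work that the actual run is missing."""
    return (expected - actual) / expected

deficit = work_deficit(seeded_total, equal_issue_total)
print(f"{deficit:.2%}")  # 1.94%

# 1050 queries are 25 full passes over the 42 samples, so the equal issue
# total divides evenly into 25 identical passes.
voxels_per_pass = equal_issue_total // 25
print(voxels_per_pass)   # 1360003072
```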
In the next round, with a different RNG seed, the measurement may flip to the pessimistic side.
Overall, unless equal issue mode is introduced, the 'official results' will swing in opposite directions from round to round, even with the same system (HW & SW) running exactly the same 3D-UNet SingleStream benchmark.
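To illustrate how the amount of work can land on either side of the equal issue total depending on the seed, here is a rough simulation. It rests on two assumptions: the per-sample voxel counts are hypothetical placeholders drawn within the quoted 7.8M to 64M range (the real KiTS19 values are not listed here), and queries are modeled as uniform draws with replacement, which may not match LoadGen's actual sample selection exactly.

```python
import random

NUM_SAMPLES = 42     # KiTS19 input set size, from the text above
NUM_QUERIES = 1050   # 25 full passes over the 42 samples

# HYPOTHETICAL per-sample voxel counts; placeholders, not the real KiTS19 data.
_size_rng = random.Random(0)
voxels = [_size_rng.randint(7_800_000, 64_000_000) for _ in range(NUM_SAMPLES)]

def random_issue_total(seed):
    """Total voxels when each query picks a sample uniformly at random."""
    rng = random.Random(seed)
    return sum(voxels[rng.randrange(NUM_SAMPLES)] for _ in range(NUM_QUERIES))

def equal_issue_total():
    """Total voxels when every sample is issued the same number of times."""
    return (NUM_QUERIES // NUM_SAMPLES) * sum(voxels)

reference = equal_issue_total()
for seed in range(5):
    total = random_issue_total(seed)
    # A negative value means less work than equal issue (optimistic latency);
    # a positive value means more work (pessimistic latency).
    print(f"seed {seed}: {(total - reference) / reference:+.2%} vs equal issue")
```

The equal issue total is independent of the seed, while the randomly issued total changes with it; that is the round-to-round variation described above.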
Without equal issue mode, the statistical model underlying early stopping is broken, which limits the usefulness of early stopping.
It is important for LoadGen to support equal issue mode for scenarios other than Offline.
For v2.0, we probably want to add support for 3D-UNet SingleStream specifically; after the submission, we can make the implementation more generic and unified.
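For discussion, one possible way to build an equal issue sequence is sketched below. This is not the PR1032 code and not LoadGen's API: the function name, its parameters, and the choice to reshuffle each pass are all assumptions made for illustration. The idea is to round the query count up to a multiple of the sample count and issue every sample once per pass.

```python
import math
import random
from collections import Counter

def equal_issue_sequence(num_samples, min_query_count, seed):
    """Return sample indices in which every sample appears equally often.

    The query count is rounded up to the next multiple of num_samples, and
    each pass is an independently shuffled permutation of all indices.
    """
    passes = math.ceil(min_query_count / num_samples)
    rng = random.Random(seed)
    sequence = []
    for _ in range(passes):
        order = list(range(num_samples))
        rng.shuffle(order)
        sequence.extend(order)
    return sequence

seq = equal_issue_sequence(num_samples=42, min_query_count=1024, seed=0)
print(len(seq))                    # 1050
print(set(Counter(seq).values()))  # {25}
```

With 42 samples and min_query_count=1024, the sequence grows to 1050 queries (25 passes), which matches the 1050-sample totals quoted above. The order still depends on the seed, but the total work does not.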
We have an equal issue mode implementation for the Offline scenario through PR1032 (https://github.com/mlcommons/inference/pull/1032), and it works flawlessly for 3D-UNet Offline runs.