@NicolasHug's comment about the sampler benchmark variability piqued my interest. I dug into it a bit and could reproduce the variability. The root cause is (as it always is) `torch.manual_seed`. If we make sure that each run uses the same seed, the variability of the random samplers drops to a reasonable level (see the sketch after the example runs below). Further evidence: changing the seed changes the actual run time. Some examples:
With seed=0:
```
python benchmarks/samplers/benchmark_samplers.py --num_experiments=100 --torch_seed=0
----------
num_clips = 1
clips_at_random_indices med = 31.63ms +- 1.82 med fps = 316.1
clips_at_regular_indices med = 7.33ms +- 0.24 med fps = 1364.9
clips_at_random_timestamps med = 54.58ms +- 1.48 med fps = 183.2
clips_at_regular_timestamps med = 6.85ms +- 0.61 med fps = 1460.1
----------
num_clips = 50
clips_at_random_indices med = 278.65ms +- 18.79 med fps = 1794.4
clips_at_regular_indices med = 317.70ms +- 17.65 med fps = 1573.8
clips_at_random_timestamps med = 288.29ms +- 18.48 med fps = 1734.3
clips_at_regular_timestamps med = 321.50ms +- 20.50 med fps = 1524.1
```
With seed=1:
```
python benchmarks/samplers/benchmark_samplers.py --num_experiments=100 --torch_seed=1
----------
num_clips = 1
clips_at_random_indices med = 53.15ms +- 2.19 med fps = 188.2
clips_at_regular_indices med = 7.36ms +- 0.19 med fps = 1358.2
clips_at_random_timestamps med = 29.01ms +- 0.73 med fps = 344.8
clips_at_regular_timestamps med = 6.89ms +- 1.36 med fps = 1451.6
----------
num_clips = 50
clips_at_random_indices med = 272.55ms +- 15.75 med fps = 1834.5
clips_at_regular_indices med = 316.91ms +- 16.92 med fps = 1577.8
clips_at_random_timestamps med = 274.44ms +- 16.27 med fps = 1821.9
clips_at_regular_timestamps med = 318.11ms +- 23.31 med fps = 1540.3
```
With seed=1234567:
```
python benchmarks/samplers/benchmark_samplers.py --num_experiments=100 --torch_seed=1234567
----------
num_clips = 1
clips_at_random_indices med = 40.25ms +- 2.09 med fps = 248.5
clips_at_regular_indices med = 7.49ms +- 0.67 med fps = 1335.2
clips_at_random_timestamps med = 18.19ms +- 0.49 med fps = 549.7
clips_at_regular_timestamps med = 7.43ms +- 0.86 med fps = 1345.1
----------
num_clips = 50
clips_at_random_indices med = 269.29ms +- 16.49 med fps = 1856.7
clips_at_regular_indices med = 316.23ms +- 17.47 med fps = 1581.1
clips_at_random_timestamps med = 273.58ms +- 15.58 med fps = 1827.6
clips_at_regular_timestamps med = 321.32ms +- 18.65 med fps = 1524.9
```
And, notably, when we don't specify a seed, we get the old behavior back. Note the much larger +- on the random samplers in the `num_clips = 1` case (~15ms vs. ~2ms in the seeded runs):
```
python benchmarks/samplers/benchmark_samplers.py --num_experiments=100
----------
num_clips = 1
clips_at_random_indices med = 24.48ms +- 15.78 med fps = 408.5
clips_at_regular_indices med = 7.34ms +- 0.32 med fps = 1362.0
clips_at_random_timestamps med = 26.85ms +- 15.46 med fps = 372.5
clips_at_regular_timestamps med = 7.11ms +- 0.87 med fps = 1407.3
----------
num_clips = 50
clips_at_random_indices med = 277.01ms +- 17.38 med fps = 1805.0
clips_at_regular_indices med = 322.99ms +- 23.39 med fps = 1548.1
clips_at_random_timestamps med = 280.22ms +- 16.47 med fps = 1784.3
clips_at_regular_timestamps med = 318.40ms +- 16.21 med fps = 1539.0
```
This means that by default, our benchmarks behave more like a training job, which samples different random points on each iteration.
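To make the seeding pattern concrete, here is a minimal sketch. The `median_time_ms` helper and the `run_once` workload are hypothetical stand-ins, not the actual benchmark code:

```python
import time

import torch


def median_time_ms(run_once, num_experiments=100, torch_seed=None):
    # Hypothetical helper, not the actual benchmark script.
    times_ms = []
    for _ in range(num_experiments):
        if torch_seed is not None:
            # Re-seeding before every experiment means each one draws the
            # same random clip locations, so sampler randomness no longer
            # contributes to run-to-run variance -- only decoding noise does.
            torch.manual_seed(torch_seed)
        start = time.perf_counter()
        run_once()
        times_ms.append((time.perf_counter() - start) * 1000)
    return torch.tensor(times_ms).median().item()


# Stand-in workload whose cost depends on the random draw, loosely
# mimicking a random-index sampler whose work varies per sample.
def run_once():
    indices = torch.randint(0, 10_000, (50,))
    _ = torch.randn(int(indices.max()))


print(f"{median_time_ms(run_once, torch_seed=0):.2f}ms")
```

The key point is the re-seed inside the loop: with it, every iteration samples identical points and the timing spread reflects only decoding/measurement noise; without it, each iteration samples fresh points, which is where the extra variance comes from.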
I also added the ability to specify the number of iterations (`--num_experiments`) as a command-line argument, since it's just convenient; a sketch of that wiring is below.
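For completeness, the CLI wiring could look roughly like this. The flag names match the commands above, but the defaults and surrounding structure are assumptions, not the script's actual code:

```python
import argparse


def parse_args():
    # Hypothetical sketch of the argument parsing, not the real script.
    parser = argparse.ArgumentParser(description="Sampler benchmarks")
    parser.add_argument(
        "--num_experiments",
        type=int,
        default=30,  # assumed default; the real script's may differ
        help="Number of timed iterations per sampler.",
    )
    parser.add_argument(
        "--torch_seed",
        type=int,
        default=None,  # None keeps the old, unseeded (variable) behavior
        help="If set, torch.manual_seed() is called before each run.",
    )
    return parser.parse_args()
```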