I tested chunk sizes 8 and 4 and found both to be much slower than chunk size 16. Running these tests requires a minor tweak to how the number of workgroups is calculated for the SMVP shader. I manually changed the `chunk_size` variable and ran the benchmark suite to get the following results.
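Chunk size: 8 (all times in ms)

| MSM size | 1st run | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average (incl 1st) | Average (excl 1st) |
|----------|---------|-------|-------|-------|-------|-------|--------------------|--------------------|
| 2^16 | 24303 | 2052 | 2075 | 1952 | 2142 | 2011 | 5756 | 2046 |
| 2^17 | 3201 | 3310 | 3164 | 3193 | 3300 | 3164 | 3222 | 3226 |
| 2^18 | 6669 | 5695 | 5565 | 5689 | 5712 | 5703 | 5839 | 5673 |
| 2^19 | 11392 | 10419 | 10368 | 10404 | 10361 | 10372 | 10553 | 10385 |
| 2^20 | 20680 | 19639 | 19660 | 19672 | 19757 | 19666 | 19846 | 19679 |
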
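
For anyone checking the numbers, the two average columns can be reproduced from the raw timings. This is only a minimal sketch, not the benchmark suite's own code: the names `runs2p16` and `mean` are hypothetical, and rounding to the nearest millisecond is an assumption that happens to match the reported values for the 2^16 row.

```typescript
// Timings (ms) for the 2^16 row with chunk size 8: the 1st run, then runs 1 to 5.
const runs2p16: number[] = [24303, 2052, 2075, 1952, 2142, 2011];

// Arithmetic mean, rounded to the nearest millisecond (rounding is an assumption).
function mean(values: number[]): number {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return Math.round(sum / values.length);
}

const avgInclFirst: number = mean(runs2p16);          // all six runs
const avgExclFirst: number = mean(runs2p16.slice(1)); // runs 1 to 5 only

console.log(avgInclFirst, avgExclFirst); // 5756 2046
```

At 2^16 the first run dominates the "incl 1st" average, so the "excl 1st" column is probably the fairer one to compare across chunk sizes.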
Chunk size: 4
2^16: 23820 ms

2^17: 45787 ms

The results are clear, so I didn't run any more benchmarks.
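
To put a rough number on the gap, the chunk size 4 timings can be divided by the chunk size 8 averages for the same MSM sizes. This is a sketch under assumptions: it compares against the "excl 1st" averages, and since I only have single timings for chunk size 4, the ratios are indicative at best. The `Row` type and `slowdown` helper are hypothetical names used only for illustration.

```typescript
// One row per MSM size; all timings are in ms and taken from the results above.
interface Row {
  size: string;
  chunk8Ms: number; // chunk size 8, average excluding the 1st run
  chunk4Ms: number; // chunk size 4, single reported timing
}

const row16: Row = { size: "2^16", chunk8Ms: 2046, chunk4Ms: 23820 };
const row17: Row = { size: "2^17", chunk8Ms: 3226, chunk4Ms: 45787 };

// How many times slower chunk size 4 is than chunk size 8 for a given row.
function slowdown(row: Row): number {
  return row.chunk4Ms / row.chunk8Ms;
}

for (const row of [row16, row17]) {
  console.log(`${row.size}: chunk size 4 is ~${slowdown(row).toFixed(1)}x slower than chunk size 8`);
}
// 2^16: chunk size 4 is ~11.6x slower than chunk size 8
// 2^17: chunk size 4 is ~14.2x slower than chunk size 8
```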