Closed guruofquality closed 11 years ago
dual quad core opteron - good - http://i.imgur.com/r2ihYwe.png
8 core i7 - bad - http://i.imgur.com/lp0RN1M.png
Turns out we were unfairly handicapping GRAS with small chunk sizes. The GR chunk size is 32 KiB by default, and GRAS was defaulting to 16 KiB. The i7 happened to be so fast that scheduler overhead was dominating the actual work, since half the chunk size means double the scheduler overhead for the same amount of data. Making GRAS use the same chunk size basically addresses this. Close this bug when the new plots are uploaded.
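To put rough numbers on the chunk size argument, here is a back-of-the-envelope sketch. It is only a toy model, not anything taken from the GR or GRAS schedulers: the function names, the 2 µs per-call overhead, and the per-byte work cost are all made-up assumptions for illustration. The only figures from this bug are the 16 KiB and 32 KiB chunk sizes.

```python
KIB = 1024

def scheduler_calls(total_bytes, chunk_bytes):
    """Number of scheduler invocations needed to move total_bytes (ceiling division)."""
    return -(-total_bytes // chunk_bytes)

def overhead_fraction(chunk_bytes, per_call_overhead_s, work_s_per_byte):
    """Fraction of time per chunk spent in the scheduler rather than in block work.

    Assumes a fixed scheduler cost per call and work that scales linearly
    with chunk size (both are simplifying assumptions).
    """
    work_s = chunk_bytes * work_s_per_byte
    return per_call_overhead_s / (per_call_overhead_s + work_s)

if __name__ == "__main__":
    total = 64 * 1024 * KIB  # 64 MiB of samples, arbitrary

    # Halving the chunk size doubles the number of scheduler calls.
    print(scheduler_calls(total, 16 * KIB))  # 4096
    print(scheduler_calls(total, 32 * KIB))  # 2048

    # Hypothetical costs: 2 us of scheduler overhead per call, and two
    # machines that differ only in how fast they do the per-byte work.
    per_call = 2e-6
    for name, per_byte in (("slow cpu", 1e-9), ("fast cpu", 1e-10)):
        for chunk in (16 * KIB, 32 * KIB):
            frac = overhead_fraction(chunk, per_call, per_byte)
            print(f"{name}, {chunk // KIB} KiB chunks: {frac:.0%} scheduler overhead")
```

Under this model the faster machine spends a larger share of each call in the scheduler, so the smaller chunk size hurts it more, which would be consistent with the i7 looking bad while the opteron looked fine.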
Still not 100% happy, but the chunksize was the real issue for this bug. I will make a new one for general benchmark improvements.
this could be due to bad scheduling since we are spinning...
The benchmarks do improve when I run them under numactl --physcpubind="0,1,2,3"