vetter / shoc

The SHOC Benchmark Suite

"nopinned" option in OpenCL mode is unset by default, so pinned is always true and the reported bandwidth is too high #83

Open lf2050 opened 1 year ago

lf2050 commented 1 year ago

src/opencl/level0/BusSpeedDownload.cpp : bool pinned = !op.getOptionBool("nopinned");

I am using the OpenCL test mode:

./configure --with-opencl --without-cuda --prefix=$xxx
make install
./bin/shocdriver -opencl -benchmark BusSpeedDownload

The reported bandwidth is too high. Reviewing the code, I found that "nopinned" is not set by default, so pinned is true and CL_MEM_ALLOC_HOST_PTR is used when creating the CL buffer.

This causes a problem: the reported time is not the host-to-device transfer time, but the time to copy from one region of device memory to another. The result is therefore graphics DDR bandwidth, not PCIe bandwidth.
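One way to check whether a reported number can be PCIe bandwidth at all is to compare it against the theoretical peak of the link. The sketch below is only an illustration: the function names are hypothetical, and the link parameters (for example PCIe 3.0 x16: 8 GT/s per lane, 128b/130b encoding) are assumptions that must be replaced with the values of the actual system.

```cpp
#include <cassert>

// Theoretical one-direction peak of a PCIe link in GB/s (1 GB = 1e9 bytes).
// gtPerSec: per-lane transfer rate in GT/s
// lanes:    link width (x1, x4, x16, ...)
// encoding: payload fraction of the line code (128.0/130.0 for PCIe 3.0)
double pciePeakGBs(double gtPerSec, int lanes, double encoding) {
    return gtPerSec * lanes * encoding / 8.0;  // 8 bits per byte
}

// Bandwidth in GB/s derived from a timed copy of `bytes` bytes.
double measuredGBs(double bytes, double seconds) {
    return bytes / seconds / 1e9;
}

// True when the measurement is higher than the link could physically deliver,
// which suggests the copy never crossed the bus.
bool exceedsLink(double measured, double peak) {
    return measured > peak;
}
```

For example, `pciePeakGBs(8.0, 16, 128.0 / 130.0)` is roughly 15.75 GB/s, so a measurement well above that on such a link would point to a device-side copy rather than a bus transfer.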

As I understand the OpenCL 2.0 spec, creating a buffer with CL_MEM_ALLOC_HOST_PTR returns a buffer that is already on the device and mapped to the host.

So if we want PCIe bandwidth, "nopinned" must be set; changing the line to "bool pinned = false;" also works.
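A minimal sketch of the option logic described in this report. `MockOptions` is a hypothetical stand-in for SHOC's real option parser, and the `HostBuffer` enum is a placeholder for the actual OpenCL buffer creation; only the expression `!op.getOptionBool("nopinned")` is taken from BusSpeedDownload.cpp. It assumes, as the report does, that a boolean option that was never passed reads as false.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the benchmark's option parser:
// a boolean option that was never set reads as false.
struct MockOptions {
    std::map<std::string, bool> flags;

    bool getOptionBool(const std::string& name) const {
        auto it = flags.find(name);
        return it != flags.end() && it->second;
    }
};

// Placeholder labels for the two host-buffer strategies discussed above.
enum class HostBuffer {
    PinnedAllocHostPtr,  // buffer created with CL_MEM_ALLOC_HOST_PTR
    Pageable             // ordinary pageable host memory
};

HostBuffer chooseHostBuffer(const MockOptions& op) {
    // Same expression as in BusSpeedDownload.cpp: pinned is true
    // unless "nopinned" was explicitly set.
    bool pinned = !op.getOptionBool("nopinned");
    return pinned ? HostBuffer::PinnedAllocHostPtr : HostBuffer::Pageable;
}
```

With an empty `MockOptions`, `chooseHostBuffer` returns `HostBuffer::PinnedAllocHostPtr`; after `flags["nopinned"] = true` it returns `HostBuffer::Pageable`, which corresponds to the path this report suggests for measuring PCIe bandwidth.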