Open JablonskiMateusz opened 4 years ago
@jszuppe could you look at this PR?
Hey @JablonskiMateusz! Could you provide some detail on the motivation for this change? Was this causing issues with your OpenCL library?
For context, device
is a very thin wrapper on cl_device_id
, and they have identical memory layouts. There should be no need to first populate vector<cl_device_id>
and then copy into vector<device>
.
Hi @kylelutz,
the problem is scenario with sub devices. The current implementation is that boost queries number of context devices, then creates a vector of compute::device
(default constructor of compute::device
is called, without calling clRetainDevice), then it passes the pointer to data of the vector to clGetContextInfo and it fills the devices (also without calling clRetainDevice) and the vector is passed to the user's app. When the app destroys the vector, then the destructor of compute::device
checks if it is a sub device and it calls clReleaseDevice. This causes that boost calls clReleaseDevice without calling clRetainDevice before. According to OpenCL spec, it caues UB:
After the device reference count becomes zero and all the objects attached to device (such as
command-queues) are released, the device object is deleted. Using this function to release a
reference that was not obtained by creating the object or by calling clRetainDevice causes
undefined behavior.
IMHO the cl_device_id
should be passed to compute::device
to its constructor or to assignment operator, currently it is performed in a quite hacky way like memcpy to the memory of compute::device
.
My change was inspired by compute::program::get_devices
where the vector of compute::device
is populated from vector of cl_device_id
https://github.com/boostorg/compute/blob/master/include/boost/compute/program.hpp#L191
@kylelutz @jszuppe any comments?
I think that in a real application the compute::context::get_devices
should not be the critical path, so having a more correct version does not seem like a problem.
@kylelutz ping
@kylelutz @jszuppe ping
kindly reminder @kylelutz
@kylelutz @jszuppe could you merge this PR?
get a vector of cl_device_id and then use the device ids to populate a vector of compute::device
Signed-off-by: Mateusz Jablonski mateusz.jablonski@intel.com