Open LWisteria opened 6 years ago
This needs overloaded functions for the various types, so #16 has to be resolved before this issue can be.
The approach could look like this:
//In clpy/carray.clh
#ifdef __ULTIMA
static void __clpy_begin_print_out() __attribute__((annotate("clpy_begin_print_out")));
// atomic operation
static void atomicAdd(__global float* x, float y)
{
//This function is already implemented.
}
// TODO(yoriyuki.kitta): Add implementations for other types
// Add atomicAdd for other types here.
// The atomicAdd functions are allowed to share the same name.
// Ultima renames overloaded functions nicely ;).
static void __clpy_end_print_out() __attribute__((annotate("clpy_end_print_out")));
#endif //__ULTIMA
64-bit atomic operations need the cl_khr_int64_base_atomics extension. clinfo says the NVIDIA driver doesn't implement it, and I couldn't find a way to implement 64-bit atomicAdd without it...
I'm working on this issue with the policy of supporting only 32-bit operations.
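To make the 32-bit-only policy concrete, here is a rough host-side sketch in C11 rather than OpenCL C. The helper names `atomic_add_i32` and `atomic_add_u32` are hypothetical and not part of clpy. In a kernel, the corresponding work would presumably be done by OpenCL's built-in `atomic_add`, which covers 32-bit `int` and `unsigned int` without any extension.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically add v to a signed 32-bit value.
   Returns the value stored before the addition. */
static int32_t atomic_add_i32(_Atomic int32_t *p, int32_t v)
{
    return atomic_fetch_add(p, v);
}

/* Hypothetical helper: the same for an unsigned 32-bit value.
   Unsigned addition wraps around on overflow. */
static uint32_t atomic_add_u32(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v);
}
```

Unlike the `atomicAdd` declared in the snippet above, these helpers return the previous value; the caller is free to ignore it.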
Note for developers:
When you implement 64-bit integer atomicAdd, please test it by adding numpy.uint64 to the test targets in test_ndarray_scatter.py.
@nsakabe-fixstars do you mean reverting bbdec438d3d11f0ced6de0e8853f4e1c0d652774?
Yeah. Revert bbdec43 and remove corresponding TODO notes.
Currently atomicAdd exists only for fp32, which is enough to pass Chainer's examples (e.g. ptb).
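For readers wondering how an fp32 atomicAdd can exist when the hardware atomics are integer-only, one common technique is a compare-and-swap loop over the float's 32-bit pattern. The sketch below is a host-side C11 analogue and is not taken from clpy; the thread does not say how the existing function is implemented. The names `atomic_add_f32`, `f32_to_bits` and `bits_to_f32` are hypothetical. In OpenCL C the same loop would likely be written with `atomic_cmpxchg` on a `__global unsigned int` pointer plus `as_uint` / `as_float`.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back (no value conversion). */
static uint32_t f32_to_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float bits_to_f32(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Hypothetical helper: atomically add y to the float stored in *cell,
   where the float is kept as a 32-bit atomic word.
   Returns the float value stored before the addition. */
static float atomic_add_f32(_Atomic uint32_t *cell, float y)
{
    uint32_t old_bits = atomic_load(cell);
    for (;;) {
        float old_val = bits_to_f32(old_bits);
        uint32_t new_bits = f32_to_bits(old_val + y);
        /* Succeeds only if *cell still holds old_bits; on failure
           old_bits is refreshed with the current contents and we retry. */
        if (atomic_compare_exchange_weak(cell, &old_bits, new_bits))
            return old_val;
    }
}
```

The same trick does not extend to 64-bit types without a 64-bit compare-and-swap, which is consistent with the cl_khr_int64_base_atomics limitation mentioned earlier in the thread.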