fixstars / clpy

OpenCL backend for CuPy

atomicAdd operation for 64-bit integer types (Complete test_ndarray_scatter.py cases) #30

Open LWisteria opened 6 years ago

LWisteria commented 6 years ago

Currently, atomicAdd is implemented only for fp32, which was enough to pass Chainer's examples (e.g. ptb).
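For context: core OpenCL 1.x has no built-in floating-point atomic add, so an fp32 atomicAdd is commonly emulated with a compare-and-swap loop over the value's 32-bit pattern. Below is a minimal host-side sketch of that idea in C11. It is an illustration under that assumption, not ClPy's actual implementation, and `atomic_add_f32`, `f32_to_bits`, and `bits_to_f32` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a float as its 32-bit pattern and back. */
static uint32_t f32_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

static float bits_to_f32(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical sketch, not ClPy's code: atomically adds y to the float whose
   bit pattern is stored in *cell and returns the previous value. */
static float atomic_add_f32(_Atomic uint32_t *cell, float y)
{
    uint32_t old_bits = atomic_load(cell);
    for (;;) {
        float old_val = bits_to_f32(old_bits);
        uint32_t new_bits = f32_to_bits(old_val + y);
        /* If another thread changed *cell, the exchange fails, old_bits is
           refreshed with the current contents, and the loop retries. */
        if (atomic_compare_exchange_weak(cell, &old_bits, new_bits))
            return old_val;
    }
}
```

The same loop shape only works for a 64-bit type if a 64-bit compare-and-swap is available on the device.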

LWisteria commented 6 years ago

This requires overloaded functions for the various types, so #16 must be resolved before this issue can be.

vorj commented 5 years ago

The approach could look like this:

```c
//In clpy/carray.clh

#ifdef __ULTIMA
static void __clpy_begin_print_out() __attribute__((annotate("clpy_begin_print_out")));

// atomic operation
static void atomicAdd(__global float* x, float y)
{
  //This function is already implemented.
}
// TODO(yoriyuki.kitta): Add another types implementation

//Add atomicAdd for other types here.
//atomicAdd functions are allowed to have same name.
//Ultima changes overloaded functions' names nicely ;).

static void __clpy_end_print_out() __attribute__((annotate("clpy_end_print_out")));
#endif //__ULTIMA
```

nsakabe-fixstars commented 5 years ago

64-bit atomic operations require the cl_khr_int64_base_atomics extension. According to clinfo, the NVIDIA driver doesn't implement it.

And I couldn't find a way to implement 64-bit atomicAdd without it...
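For anyone who wants to check this programmatically rather than through clinfo: the extension list is also available from `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`, which returns a space-separated string. A minimal sketch of the token check in C follows. `has_extension` is a hypothetical helper, and the OpenCL call itself is left out so the snippet compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole
   space-separated token in `extensions` (the format of the string
   reported for CL_DEVICE_EXTENSIONS). */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or string ends,
           so that a longer extension name with the same prefix is rejected. */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p++;
    }
    return false;
}
```

Calling `has_extension(ext_string, "cl_khr_int64_base_atomics")` on the device's extension string would then tell whether a 64-bit code path can be enabled on that device.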

nsakabe-fixstars commented 5 years ago

I'm working on this issue with the policy of supporting only 32-bit operations.

nsakabe-fixstars commented 5 years ago

Note for developers:

When you implement 64-bit integer atomicAdd, please test it by adding numpy.uint64 to the test targets in test_ndarray_scatter.py.

LWisteria commented 5 years ago

@nsakabe-fixstars do you mean reverting bbdec438d3d11f0ced6de0e8853f4e1c0d652774?

nsakabe-fixstars commented 5 years ago

Yeah. Revert bbdec43 and remove the corresponding TODO notes.