ROCm / ROCm-OpenCL-Runtime

ROCm OpenOpenCL Runtime
170 stars 60 forks source link

Does't support subgroup? #160

Open muzhailong opened 1 year ago

muzhailong commented 1 year ago

Now! amd opencl support subgroup?

gentype sub_group_broadcast( gentype value, uint index )

gentype sub_group_reduce_add( gentype value )
gentype sub_group_reduce_min( gentype value )
gentype sub_group_reduce_max( gentype value )

gentype sub_group_scan_inclusive_add( gentype value )
gentype sub_group_scan_inclusive_min( gentype value )
gentype sub_group_scan_inclusive_max( gentype value )

gentype sub_group_scan_exclusive_add( gentype value )
gentype sub_group_scan_exclusive_min( gentype value )
gentype sub_group_scan_exclusive_max( gentype value )

// These functions are available to devices supporting
// cl_khr_subgroup_non_uniform_vote:

int sub_group_elect()

int sub_group_non_uniform_all( int predicate )
int sub_group_non_uniform_any( int predicate )
int sub_group_non_uniform_all_equal( gentype value )

// These functions are available to devices supporting
// cl_khr_subgroup_ballot:

gentype sub_group_non_uniform_broadcast( gentype value, uint index )
gentype sub_group_broadcast_first( gentype value )

uint4 sub_group_ballot( int predicate )
int   sub_group_inverse_ballot( uint4 value )
int   sub_group_ballot_bit_extract( uint4 value, uint index )
uint  sub_group_ballot_bit_count( uint4 value )
uint  sub_group_ballot_inclusive_scan( uint4 value )
uint  sub_group_ballot_exclusive_scan( uint4 value )
uint  sub_group_ballot_find_lsb( uint4 value )
uint  sub_group_ballot_find_msb( uint4 value )

uint4 get_sub_group_eq_mask()
uint4 get_sub_group_ge_mask()
uint4 get_sub_group_gt_mask()
uint4 get_sub_group_le_mask()
uint4 get_sub_group_lt_mask()