I wonder if we should consider a version of append_to_global_memory() where each thread's data may reside elsewhere (at an address), and perhaps also a version where each thread's data is guaranteed to be in registers (e.g. with a capped common size, so that we could use a kat::array).