Closed wangshuxihe00 closed 7 months ago
1. When I want to use other containers (e.g., list, array), do I need to implement them myself?
Some containers are not implemented because their design is highly non-trivial to correctly implement for parallel GPU execution. For instance, map/set, i.e. the ordered versions, implicitly require sorting and maintaining that sorted state across all threads. array, on the other hand, should be quite straightforward to implement as its size is fixed at compile time.
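To illustrate why a fixed-size array is the easy case, here is a minimal sketch of what such a container could look like. It is not part of the library: the names fixed_array and SKETCH_HOST_DEVICE are hypothetical and only chosen for this example. The macro is assumed to expand to the usual host/device annotations when compiling with nvcc or hipcc and to nothing otherwise.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper macro (not from the library): annotate functions for
// host and device use under nvcc/hipcc, and expand to nothing elsewhere.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical fixed-size array. The size N is a template parameter, so no
// allocation and no cross-thread bookkeeping are needed.
template <typename T, std::size_t N>
struct fixed_array
{
    static_assert(N > 0, "this sketch does not handle empty arrays");

    T data_[N]; // public so that the type stays an aggregate

    SKETCH_HOST_DEVICE constexpr std::size_t size() const { return N; }

    // Element access with a debug-only bounds check.
    SKETCH_HOST_DEVICE T& operator[](std::size_t i)
    {
        assert(i < N);
        return data_[i];
    }
    SKETCH_HOST_DEVICE const T& operator[](std::size_t i) const
    {
        assert(i < N);
        return data_[i];
    }

    // Pointer-based iterators, enough for range-based for loops.
    SKETCH_HOST_DEVICE T* begin() { return data_; }
    SKETCH_HOST_DEVICE T* end() { return data_ + N; }
    SKETCH_HOST_DEVICE const T* begin() const { return data_; }
    SKETCH_HOST_DEVICE const T* end() const { return data_ + N; }
};
```

Since every element has a fixed position, threads that write to different indices do not need to coordinate, which is the main difference from the ordered containers mentioned above.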
2. The HIP versions of the examples do not use the container classes. Is this supported?
HIP support is still considered experimental in general, but should work fine as the respective backend code is very similar to the CUDA one. When I streamlined the HIP backend in the past, I updated the tests but simply forgot to add the respective examples.
Are you saying that the ordered versions of maps/sets cannot maintain their characteristics when executed in parallel on the GPU, and that this makes them difficult to implement?
Not exactly. The challenge is keeping a map/set sorted within a kernel after one thread has inserted or erased a value while other threads are attempting to do the same with different values, especially as these operations should be as fast as possible. A lot has to be considered here in terms of synchronization. I believe that this is definitely possible. But given that the unordered version is often sufficient and also faster for many applications, the ordered version has received less attention and has not been implemented yet.
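As a rough illustration of why the unordered case is simpler, consider the following host-side sketch. It is not the library's implementation, and the name unordered_slots as well as the EMPTY sentinel are assumptions made only for this example. It uses open addressing with linear probing, where an insertion claims exactly one slot with a single compare-and-swap and never moves any other element. An ordered container would additionally have to shift or rebalance neighboring elements while other threads are modifying them.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int EMPTY = -1; // sentinel for a free slot; keys must be non-negative

// Hypothetical insert-only set of ints (no erase, no resizing).
struct unordered_slots
{
    std::vector<std::atomic<int>> slots;

    explicit unordered_slots(std::size_t capacity) : slots(capacity)
    {
        assert(capacity > 0);
        for (auto& s : slots) s.store(EMPTY);
    }

    // Returns true if the key was newly inserted, false if it was already
    // present or the table is full. Safe to call from several threads.
    bool insert(int key)
    {
        assert(key >= 0);
        const std::size_t n = slots.size();
        std::size_t i = static_cast<std::size_t>(key) % n;
        for (std::size_t probe = 0; probe < n; ++probe, i = (i + 1) % n)
        {
            int expected = EMPTY;
            // Try to claim the slot; on failure, expected holds its content.
            if (slots[i].compare_exchange_strong(expected, key)) return true;
            if (expected == key) return false; // another insert already stored it
        }
        return false; // every slot is occupied by a different key
    }

    // Number of occupied slots; only meaningful once all inserts are done.
    std::size_t count() const
    {
        std::size_t c = 0;
        for (const auto& s : slots) c += (s.load() != EMPTY) ? 1 : 0;
        return c;
    }
};
```

Each insertion touches only the slots along its own probe sequence and modifies at most one of them. A sorted structure cannot offer this locality, since inserting a small key may affect the position of many other elements.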
I see. Thank you
Closing this issue as the questions have been answered. If there are further issues, feel free to ask again.