Please help to answer the two questions about this, thank you

stotko / stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

https://stotko.github.io/stdgpu/

Apache License 2.0


Please help to answer the two questions about this, thank you #413

Closed wangshuxihe00 closed 7 months ago

wangshuxihe00 commented 7 months ago

1. If I want to use other containers (e.g., list, array), do I need to implement them myself?
2. The HIP version of the examples does not use any container classes. Are the containers supported with HIP?

stotko commented 7 months ago

1. If I want to use other containers (e.g., list, array), do I need to implement them myself?

Some containers are not implemented because their design is highly non-trivial to implement correctly for parallel GPU execution. For instance, map/set, i.e. the ordered versions, implicitly require sorting and maintaining that sorted state across all threads. array, on the other hand, should be quite straightforward to implement yourself, as its size is fixed at compile time.
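To illustrate why a fixed-size array is the easy case, here is a minimal sketch of such a type. It is not part of stdgpu: `fixed_array`, `sum_elements`, and the `SKETCH_HOST_DEVICE` macro are hypothetical names chosen for this example. The macro is assumed to expand to `__host__ __device__` under nvcc or hipcc and to nothing under a plain C++ compiler, so the same code builds in both settings. Because the storage lives inside the object and the size is a template parameter, no allocation or cross-thread bookkeeping is needed.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: nvcc defines __CUDACC__ and hipcc defines __HIPCC__.
// With a plain C++ compiler the qualifiers simply disappear.
#if defined(__CUDACC__) || defined(__HIPCC__)
    #define SKETCH_HOST_DEVICE __host__ __device__
#else
    #define SKETCH_HOST_DEVICE
#endif

// Hypothetical fixed-size array, usable on host and device. Not a stdgpu type.
template <typename T, std::size_t N>
struct fixed_array
{
    static_assert(N > 0, "this sketch does not handle empty arrays");

    T data[N]; // storage is part of the object, so no allocation is required

    SKETCH_HOST_DEVICE constexpr std::size_t size() const { return N; }

    SKETCH_HOST_DEVICE T& operator[](std::size_t i)
    {
        assert(i < N); // bounds check in debug builds
        return data[i];
    }

    SKETCH_HOST_DEVICE const T& operator[](std::size_t i) const
    {
        assert(i < N);
        return data[i];
    }

    SKETCH_HOST_DEVICE T* begin() { return data; }
    SKETCH_HOST_DEVICE T* end() { return data + N; }
};

// Example of a function that works on the array from host or device code.
template <typename T, std::size_t N>
SKETCH_HOST_DEVICE T sum_elements(const fixed_array<T, N>& a)
{
    T total = T();
    for (std::size_t i = 0; i < N; ++i)
    {
        total += a[i];
    }
    return total;
}
```

Since the type is a plain aggregate, it can be initialized as `fixed_array<int, 4> a = {{1, 2, 3, 4}};` and passed by value to a kernel. Concurrent writes to the same element from several threads would still need atomics or other synchronization, which this sketch does not provide.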

2. The HIP version of the examples does not use any container classes. Are the containers supported with HIP?

HIP support is still considered experimental in general, but the containers should work fine, as the respective backend code is very similar to the CUDA one. When I streamlined the HIP backend in the past, I updated the tests but simply forgot to add the respective examples.

wangshuxihe00 commented 7 months ago

Are you saying that the ordered versions of map/set cannot maintain their ordering when executed in parallel on the GPU, and that this is why they are difficult to implement?

stotko commented 7 months ago

Not exactly. The challenge is to guarantee that a map/set remains sorted within a kernel after a thread has inserted or erased a value while other threads are attempting to do the same with different values, especially as these operations should be as fast as possible. A lot has to be considered here in terms of synchronization. I believe that this is definitely possible. But given that the unordered version is often sufficient and also faster for many applications, the ordered version has received less attention and has not been implemented yet.
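The difference in synchronization effort can be sketched on the host with `std::atomic`. This is only an analogy under stated assumptions and is not how stdgpu implements its containers: `unordered_slots`, `sorted_insert`, `EMPTY`, and `CAPACITY` are hypothetical names, keys are assumed to be non-negative, and a sorted array stands in for an ordered container as the simplest possible case. The unordered insert claims a single slot with one compare-and-swap, whereas the sorted insert has to move every larger element, so a concurrent version would need to protect that whole range.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical constants for this sketch only.
constexpr int EMPTY = -1;            // marks a free slot; keys must be >= 0
constexpr std::size_t CAPACITY = 16; // fixed table size

// Unordered case: open addressing with linear probing (no erase support).
struct unordered_slots
{
    std::atomic<int> slots[CAPACITY];

    unordered_slots()
    {
        for (auto& s : slots) s.store(EMPTY);
    }

    // Returns true if the key was newly inserted,
    // false if it was already present or the table is full.
    bool insert(int key)
    {
        const std::size_t start = static_cast<std::size_t>(key) % CAPACITY;
        for (std::size_t probe = 0; probe < CAPACITY; ++probe)
        {
            const std::size_t i = (start + probe) % CAPACITY;
            int expected = EMPTY;
            // Claim the slot only if it is still empty. This is one atomic
            // step, and no other slot is modified.
            if (slots[i].compare_exchange_strong(expected, key)) return true;
            // On failure, `expected` holds the key already stored in the slot.
            if (expected == key) return false;
        }
        return false; // table full
    }

    bool contains(int key) const
    {
        const std::size_t start = static_cast<std::size_t>(key) % CAPACITY;
        for (std::size_t probe = 0; probe < CAPACITY; ++probe)
        {
            const int stored = slots[(start + probe) % CAPACITY].load();
            if (stored == key) return true;
            if (stored == EMPTY) return false; // end of the probe chain
        }
        return false;
    }
};

// Ordered case, sequential only: inserting into a sorted array shifts every
// larger element. The caller must guarantee room for one more element.
// Returns the new element count.
std::size_t sorted_insert(int* values, std::size_t count, int key)
{
    std::size_t pos = count;
    while (pos > 0 && values[pos - 1] > key)
    {
        values[pos] = values[pos - 1]; // each shift touches shared state
        --pos;
    }
    values[pos] = key;
    return count + 1;
}
```

Real ordered containers are usually tree-based rather than array-based, but the same concern applies: an insert or erase may restructure parts of the container that other threads are reading or modifying at the same time.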

wangshuxihe00 commented 7 months ago

I see. Thank you

stotko commented 7 months ago

Closing this issue as the questions have been answered. If there are further issues, feel free to ask again.