ROCm / ROCR-Runtime

ROCm Platform Runtime: ROCr, an HPC-market-enhanced, HSA-based runtime
https://rocm.docs.amd.com/projects/ROCR-Runtime/en/latest/

[Feature]: add a `hsa_amd_signal_wait_all` API. #241

Open benvanik opened 1 month ago

benvanik commented 1 month ago

Suggestion Description

hsa_amd_signal_wait_any exists, routes to hsaKmtWaitOnMultipleEvents_Ext, and is useful for implementing host-side barrier-or packet behavior. There is currently no hsa_amd_signal_wait_all, though, which is needed to implement barrier-and packet behavior efficiently. hsaKmtWaitOnMultipleEvents_Ext has a WaitOnAll flag, and it'd be useful to have a top-level API that routes to it.
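For context, a rough sketch of the kind of workaround a host-side barrier-and needs without a wait-all call: one blocking wait per dependency, in sequence. This is only an illustration under stated assumptions. `demo_signal_t`, `demo_wait_one_fn`, `demo_wait_all`, and `demo_count_waits` are hypothetical names, and the per-signal wait is passed in as a callback (in real code it might wrap something like `hsa_signal_wait_scacquire`) so the snippet compiles without the HSA headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hsa_signal_t so this compiles without HSA headers. */
typedef struct { uint64_t handle; } demo_signal_t;

/* Hypothetical per-signal blocking wait. Returns true once the signal's
 * wait condition is satisfied, false on error or timeout. */
typedef bool (*demo_wait_one_fn)(demo_signal_t signal, void *user_data);

/* Emulates wait-all as `count` sequential blocking waits. Each iteration is
 * a separate wait call, which is the overhead a single multi-event wait with
 * a wait-on-all flag would avoid. */
static bool demo_wait_all(const demo_signal_t *signals, size_t count,
                          demo_wait_one_fn wait_one, void *user_data) {
  for (size_t i = 0; i < count; ++i) {
    if (!wait_one(signals[i], user_data)) {
      return false; /* stop at the first failed wait */
    }
  }
  return true;
}

/* Demo callback: counts how many waits were issued and always succeeds. */
static bool demo_count_waits(demo_signal_t signal, void *user_data) {
  (void)signal;
  *(size_t *)user_data += 1;
  return true;
}
```

With N dependencies this issues N separate waits, so the cost grows with the dependency count even when every signal is already satisfied.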

Operating System

No response

GPU

No response

ROCm Component

No response

atgutier commented 1 month ago

Will take a look at this.

atgutier commented 1 month ago

Can you test #250 or point me to any tests you may have for this when you get a chance?

benvanik commented 3 weeks ago

(I don't have anything running yet but can try to make a test for this - thanks for implementing it :)

benvanik commented 3 weeks ago

I haven't tested it yet, but one quirk of hsa_amd_signal_wait_any is that it does not allow 0/NULL signal handles. This is inconsistent with the AQL hsa_barrier_or_packet_t, which allows any dependency signal to be 0/NULL and effectively ignores it. When implementing soft queues that support the barrier packets, it'd be nice to pass the dependency signals directly to the APIs without needing to filter them at the application level. The hsa_barrier_and_packet_t treats a 0/NULL signal as having a value of 0 (so it is effectively ignored). This may be worth doing as a separate thing, so I'll file a new issue for it (I think your PR #250 is consistent with hsa_amd_signal_wait_any's behavior in checking the validity of signal handles).
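The application-level filtering mentioned above might look roughly like the following. It is a minimal sketch, not code from ROCR-Runtime: `dep_signal_t` is a hypothetical stand-in that mirrors the single `uint64_t handle` field of `hsa_signal_t`, and `compact_dep_signals` is a made-up helper name. The idea is to copy only the non-zero handles from a barrier packet's dependency slots into a dense array before handing it to a multi-signal wait.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring hsa_signal_t's single 64-bit handle field. */
typedef struct { uint64_t handle; } dep_signal_t;

/* Copies every signal with a non-zero handle from `deps` into `out`,
 * preserving order, and returns how many were kept. `out` must have room
 * for `count` entries. */
static size_t compact_dep_signals(const dep_signal_t *deps, size_t count,
                                  dep_signal_t *out) {
  size_t kept = 0;
  for (size_t i = 0; i < count; ++i) {
    if (deps[i].handle != 0) {
      out[kept++] = deps[i];
    }
  }
  return kept;
}
```

A caller would then pass `out` and the returned count to the wait API, and would need to treat a count of 0 as "nothing to wait on" rather than issuing a wait at all.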

atgutier commented 3 weeks ago

Sounds good. I think that 0/NULL semantics matching the barrier should be doable.

benvanik commented 3 weeks ago

Cool! Filed at #252!