OP-DSL / OP2-Common

OP2: open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs
https://op-dsl.github.io

Fixed syncing gather kernels when using gpudirect #247

Closed: TobyFlynn closed 1 year ago

TobyFlynn commented 1 year ago

My previous fix, which synced the gather kernels before the GPUDirect MPI halo exchange call, also ended up waiting for the OP2 kernel to finish executing over the core set. This pull request changes the sync so that the halo exchange waits only on the gather kernels.
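
For anyone following along, here is a minimal sketch of the general idea, not the code in this pull request. It assumes the gather and the core-set loop are queued on the same CUDA stream and uses an event recorded right after the gather, so the host waits on that event instead of on the whole stream or device. The kernel names, buffer names and sizes are all hypothetical, and the MPI call is only indicated by a comment.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the OP2 loop over the core set.
__global__ void core_kernel(float *core, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) core[i] = core[i] * 2.0f + 1.0f;
}

// Hypothetical gather kernel: packs the exported elements into a contiguous
// send buffer that a CUDA-aware MPI could read directly from device memory.
__global__ void gather_kernel(const float *src, const int *export_idx,
                              float *send_buf, int n_export) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n_export) send_buf[i] = src[export_idx[i]];
}

int main() {
  const int n = 1 << 20, n_export = 1024, threads = 256;

  // Host-side test data (illustrative values only).
  std::vector<float> h_src(n);
  std::vector<int> h_idx(n_export);
  for (int i = 0; i < n; ++i) h_src[i] = static_cast<float>(i);
  for (int i = 0; i < n_export; ++i) h_idx[i] = 2 * i;

  float *d_src, *d_core, *d_send;
  int *d_idx;
  cudaMalloc(&d_src, n * sizeof(float));
  cudaMalloc(&d_core, n * sizeof(float));
  cudaMalloc(&d_send, n_export * sizeof(float));
  cudaMalloc(&d_idx, n_export * sizeof(int));
  cudaMemcpy(d_src, h_src.data(), n * sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpy(d_core, h_src.data(), n * sizeof(float), cudaMemcpyHostToDevice);
  cudaMemcpy(d_idx, h_idx.data(), n_export * sizeof(int), cudaMemcpyHostToDevice);

  cudaStream_t stream;
  cudaEvent_t gather_done;
  cudaStreamCreate(&stream);
  cudaEventCreateWithFlags(&gather_done, cudaEventDisableTiming);

  // Queue the gather, mark its completion with an event, then queue the
  // core-set kernel behind it on the same stream.
  gather_kernel<<<(n_export + threads - 1) / threads, threads, 0, stream>>>(
      d_src, d_idx, d_send, n_export);
  cudaEventRecord(gather_done, stream);
  core_kernel<<<(n + threads - 1) / threads, threads, 0, stream>>>(d_core, n);

  // Returns once the gather has finished. cudaStreamSynchronize(stream) or
  // cudaDeviceSynchronize() at this point would also wait for core_kernel.
  cudaEventSynchronize(gather_done);
  // A GPUDirect send of d_send (for example MPI_Isend) would be posted here.

  // Wait for the core kernel before reading anything back on the host.
  cudaStreamSynchronize(stream);
  std::vector<float> h_send(n_export);
  cudaMemcpy(h_send.data(), d_send, n_export * sizeof(float), cudaMemcpyDeviceToHost);
  printf("send_buf[3] = %.1f\n", h_send[3]);

  cudaEventDestroy(gather_done);
  cudaStreamDestroy(stream);
  cudaFree(d_src); cudaFree(d_core); cudaFree(d_send); cudaFree(d_idx);
  return 0;
}
```

Because the event is recorded before the core-set kernel is enqueued, it completes as soon as the gather does, which lets the send be posted while the core-set kernel is still running.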

TobyFlynn commented 1 year ago

@reguly I've been trying to launch the gather kernels in a separate stream, as the scatter kernels currently are, but I'm running into some issues. I'll revisit it next week; in the meantime, I'll merge this pull request.