OP-DSL / OP2-Common

OP2: open-source framework for the execution of unstructured grid applications on clusters of GPUs or multi-core CPUs
https://op-dsl.github.io

How to deal with boundary condition in CUDA OP2? #225

Open lj-cug opened 2 years ago

lj-cug commented 2 years ago

Dear Sir: Another question: how should I deal with boundary conditions in OP2-CUDA? Branch divergence is a recurring problem in CUDA code when a kernel contains many if-conditions, especially in hydrodynamic simulations. Thanks, Li Jian

reguly commented 2 years ago

Hello,

There is no ideal way of doing this. You can either include if conditions in a kernel that includes the boundary, and check based on the index passed in by op_arg_idx, or you can launch separate ops_par_loops for the boundary (see e.g. update_halo.cpp in apps/c/CloverLeaf). Which one performs better depends very much on your application. But if you want to switch easily between different boundary conditions, I suggest going with separate ops_par_loops (even though they might end up being slightly slower).

Best, Istvan


reguly commented 2 years ago

Let me correct that (I mixed up OPS and OP2 here). For OP2, you can create sets which only include the boundary elements, and then do an op_par_loop only over those. Or you can create a dataset which flags which elements are on the boundary, and do the if conditions inside the kernel for an op_par_loop over the entire domain.
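The two options above can be sketched in plain C++ without the OP2 library, just to show the data layout each one needs. This is only an illustration under assumptions: the kernel `wall_bc`, the function names, and the arrays `bnd_to_elem` and `is_boundary` are hypothetical, and the ordinary loops stand in for what an op_par_loop would do over a boundary set (reached through a map) or over the full set with a flag dataset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical boundary kernel: imposes a fixed value of zero on one element.
inline void wall_bc(double &u) { u = 0.0; }

// Option 1: a separate boundary set.
// bnd_to_elem plays the role of a map from the boundary set to the full set;
// the loop visits boundary elements only, so the kernel body has no branch.
void apply_bc_boundary_set(std::vector<double> &u,
                           const std::vector<int> &bnd_to_elem) {
  for (int e : bnd_to_elem) {
    wall_bc(u[static_cast<std::size_t>(e)]);
  }
}

// Option 2: a flag dataset on the full set.
// is_boundary holds one int per element (1 = boundary, 0 = interior);
// the loop visits every element and branches inside the kernel body.
void apply_bc_flagged(std::vector<double> &u,
                      const std::vector<int> &is_boundary) {
  assert(u.size() == is_boundary.size());
  for (std::size_t i = 0; i < u.size(); ++i) {
    if (is_boundary[i] != 0) {
      wall_bc(u[i]);
    }
  }
}
```

Both functions give the same result for the same mesh. The first iterates over fewer elements and avoids the branch, at the cost of indirect access through the map. The second keeps direct access but evaluates the flag for every element, which is where divergence can appear on a GPU.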