Closed: mbuergle closed this issue 3 years ago.
You can mix and match certain generated kernels while using the GPU backend, and it should work (we often debug things this way). So if you generated sequential and CUDA kernels, you can compile and use them together. You can't mix OpenMP and CUDA, though, due to how op_plan creation works.
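To make that concrete, here is a rough sketch of why the mix works at the source level; the kernel name, argument count, and header are illustrative (borrowed loosely from the airfoil app), not copied from any particular generated file:

```cpp
// Sketch only: illustrative declarations, assuming "op_seq.h" (the usual OP2
// user-level header) provides op_set and op_arg.
#include "op_seq.h"

// Both the sequential generator (seq/adt_calc_seqkernel.cpp) and the CUDA
// generator (conventionally cuda/adt_calc_kernel.cu) emit a wrapper with the
// same name and signature, e.g.:
void op_par_loop_adt_calc(char const *name, op_set set,
                          op_arg arg0, op_arg arg1, op_arg arg2,
                          op_arg arg3, op_arg arg4, op_arg arg5);

// The application source therefore does not change when a kernel is moved
// between backends; which wrapper a call resolves to is decided purely by
// which generated object file is compiled and linked in, together with the
// CUDA backend runtime (op_cuda_rt_support.c). OpenMP kernels are the
// exception, because of the op_plan creation issue noted above.
```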
Unfortunately, we encounter duplicate symbols when combining CUDA and sequential kernels, since OP2-CUDA requires the file op_cuda_rt_support.c and OP2-SEQ requires op_dummy_singlenode.c. Both of these files define int op_mpi_halo_exchanges(op_set set, int nargs, op_arg *args). Is this avoidable?
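For context, the failure itself is ordinary linker behaviour, independent of OP2: two objects on the link line each provide a definition of the same external symbol. A tiny self-contained reproduction (deliberately using a _demo name rather than the real OP2 types and signature) looks like this:

```cpp
// --- stand_in_cuda_rt.cpp (plays the role of op_cuda_rt_support.c) ---
int op_mpi_halo_exchanges_demo(int nargs) { return nargs; }

// --- stand_in_dummy_singlenode.cpp (plays the role of op_dummy_singlenode.c) ---
int op_mpi_halo_exchanges_demo(int nargs) { return 0; }

// --- main.cpp ---
int op_mpi_halo_exchanges_demo(int nargs);
int main() { return op_mpi_halo_exchanges_demo(3) == 3 ? 0 : 1; }

// Compiling the three files separately and linking them together, e.g.
//   g++ -c stand_in_cuda_rt.cpp stand_in_dummy_singlenode.cpp main.cpp
//   g++ stand_in_cuda_rt.o stand_in_dummy_singlenode.o main.o
// fails with a "multiple definition" error, which is the same failure seen
// when both op_cuda_rt_support.c and op_dummy_singlenode.c end up on the
// link line: only one runtime file may provide the symbol.
```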
OP2-SEQ should not require op_dummy_singlenode.c when op_cuda_rt_support.c is present - all the necessary functions should be there in op_cuda_rt_support.c.
@reguly Thank you for your support! Which set of files have you been compiling and linking in your tests?
Unfortunately, the "sequential" kernels appear to require op_mpi_test_all
(see e.g. the airfoil example or the corresponding generator source), which is defined only in op_dummy_singlenode.c
and op_mpi_core.c
.
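To show why the link only fails at this point, here is a heavily simplified paraphrase of what a generated seq/*_seqkernel.cpp wrapper does (see the two links above for the real output; the argument count, the exact placement of the op_mpi_test_all call, and its assumed void return type are illustrative):

```cpp
#include "op_seq.h"   // assumed to declare op_set and op_arg

// Runtime entry points the generated wrapper references, shown here for
// readability (drop these if the OP2 headers already declare them).
// op_mpi_halo_exchanges uses the signature quoted earlier in this thread;
// op_mpi_test_all's signature is an assumption.
int  op_mpi_halo_exchanges(op_set set, int nargs, op_arg *args);
void op_mpi_test_all(int nargs, op_arg *args);

void op_par_loop_adt_calc(char const *name, op_set set,
                          op_arg arg0, op_arg arg1, op_arg arg2,
                          op_arg arg3, op_arg arg4, op_arg arg5) {
  (void)name;  // used for timing/diagnostics in the real generated code
  int nargs = 6;
  op_arg args[6] = {arg0, arg1, arg2, arg3, arg4, arg5};

  int n_upper = op_mpi_halo_exchanges(set, nargs, args);
  for (int n = 0; n < n_upper; n++) {
    op_mpi_test_all(nargs, args);  // the symbol this comment is about; the
                                   // real generator decides when to call it
    // ... gather pointers out of args and call the user kernel adt_calc(...)
  }
  // Because these calls are unconditional references, whichever runtime file
  // is linked (op_dummy_singlenode.c, op_mpi_core.c, or a fixed
  // op_cuda_rt_support.c) must define both symbols.
}
```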
Sorry about that, this is a newly introduced bug - op_cuda_rt_support.c should have included this method. I just pushed a fix to the master branch.
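For anyone who hits this on an older checkout before the fix: the missing piece is simply a definition of op_mpi_test_all in the CUDA runtime. In a single-node CUDA build there are no MPI halo exchanges in flight, so a no-op stub along the following lines should be the shape of it; this is a guess modelled on op_dummy_singlenode.c, not the actual commit on master.

```cpp
// Hypothetical stub for op_cuda_rt_support.c, mirroring what
// op_dummy_singlenode.c provides (signature assumed to match); testing for
// completed MPI halo exchanges is a no-op in a single-node CUDA build.
// Assumes the OP2 core header declaring op_arg is already included, as in
// the real runtime file. Not the actual patch pushed to master.
void op_mpi_test_all(int nargs, op_arg *args) {
  (void)nargs;  // nothing to test without MPI
  (void)args;
}
```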
Thank you, @reguly! Your fix solved the issue with the "sequential" kernels.
Does this resolve the overall issue as well?
Of course it would be great if it were possible to combine OpenMP and CUDA kernels, too, but I understand from your reply above that this is not supported at the moment.
Could this use case be supported in the future or is it out of scope for now?
For lack of a relevant use case, it has not been on the TODO list before, since we only used this “mixing” of kernels for debugging purposes. What would be your use case for this?
The idea was to improve performance by scheduling some multi-threaded kernels on the CPU, but we'll benchmark first to determine whether this indeed looks promising in our use case.
Thank you very much for the support, I think you can close this issue.
Dear OP2 team, we were wondering whether you intend to support mixed executables that combine the CUDA and non-CUDA backends, i.e. executables in which certain kernels run on the CPU and others on the GPU?