ROCm / ROCm-Device-Libs

ROCm Device Libraries

can not build successfully #96

Closed code-fool closed 5 months ago

code-fool commented 1 year ago

image

sjnewbury commented 8 months ago

This happens because the toolchain's target GPU doesn't support the gws feature.

It needs something like the patch below so that the code compiles regardless; alternatively, the functions could be #ifdef'd out when the feature is unavailable.


--- ./ockl/src/cg.cl~   2023-10-17 21:55:28.000000000 +0100
+++ ./ockl/src/cg.cl    2024-01-23 22:27:55.950747059 +0000
@@ -85,17 +85,19 @@
     }
 }

+#pragma clang attribute push (__attribute__((target("gws"))), apply_to=function)
 void
 __ockl_gws_init(uint nwm1, uint rid)
 {
     __builtin_amdgcn_ds_gws_init(nwm1, rid);
 }

 void
 __ockl_gws_barrier(uint nwm1, uint rid)
 {
     __builtin_amdgcn_ds_gws_barrier(nwm1, rid);
 }
+#pragma clang attribute pop

 __attribute__((const)) int
 __ockl_grid_is_valid(void)
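
For comparison, the #ifdef alternative mentioned above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the fix the device library adopted: EXAMPLE_HAVE_GWS is a hypothetical macro that the build system would have to define for targets known to support gws (it is not something clang or the device library defines), and the example_* names are made up so they don't clash with the real __ockl_* entry points. It uses the per-function target("gws") attribute rather than the pragma form in the patch.

```c
/* Sketch of guarding the GWS wrappers behind a feature check.
 * ASSUMPTION: EXAMPLE_HAVE_GWS is a hypothetical macro set by the build
 * system; it is not defined by clang or by the device library. */

#if defined(__AMDGCN__) && defined(EXAMPLE_HAVE_GWS)

/* Target supports gws: emit the wrappers with the feature enabled
 * per function, so the builtins are accepted by the compiler. */
__attribute__((target("gws"))) void
example_gws_init(unsigned nwm1, unsigned rid)
{
    __builtin_amdgcn_ds_gws_init(nwm1, rid);
}

__attribute__((target("gws"))) void
example_gws_barrier(unsigned nwm1, unsigned rid)
{
    __builtin_amdgcn_ds_gws_barrier(nwm1, rid);
}

/* Lets callers check whether the wrappers were compiled in. */
int example_gws_available(void) { return 1; }

#else

/* Feature unavailable (or not an AMDGCN build): the wrappers are left
 * out entirely instead of being replaced by a silent no-op barrier. */
int example_gws_available(void) { return 0; }

#endif
```

The trade-off is that any caller of the wrappers has to be guarded the same way, whereas the attribute approach in the patch keeps the functions present for every target.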

b-sumner commented 8 months ago

The necessary attribute was added to the device library when needed. See https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/ockl/src/cg.cl#L87. I think the issue here is that the device library is not consistent with the compiler. That is one reason why we moved the device library under the llvm project.

sjnewbury commented 8 months ago

@b-sumner On Gentoo we use the normal upstream LLVM, not the AMD fork. No doubt that's what causes this issue for me. Should we/I be packaging the device library from the llvm fork instead of here?

b-sumner commented 8 months ago

@sjnewbury since the AMD fork's amd-staging branch is as close to upstream main as we can get, it's likely that the tip device libraries from amd-staging will be OK with tip LLVM. However, no guarantees. Sometimes there are fixes that don't land upstream first. In such cases you can probably pull back some of the latest device libs changes to match.

lamb-j commented 5 months ago

Now that we've moved device libs to https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs, we hope to have fewer problems with LLVM/device-libs mismatches.

Also, we're doing more testing of device libs and comgr against upstream LLVM recently (although building this way is still not officially supported or guaranteed to work).

Closing this issue for now. Can you open a new issue at https://github.com/ROCm/llvm-project if you're still having problems? Thanks!