Open junliume opened 1 year ago
@atamazov could you also take a look at this issue? Thanks!
BTW~ thanks to @illsilin the current working combination is:
ROCmSoftwarePlatform/composable_kernel@c1370ef3f1cea8066f4f3b88399ccb36b22d95ae -DCMAKE_CXX_FLAGS=" --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 "
and it will generate:
root@ixt-sjc2-60:~/MIOpen/build# roc-obj-ls /opt/rocm/lib/libMIOpen.so
1 host-x86_64-unknown-linux file:///opt/rocm/lib/libMIOpen.so#offset=319401984&size=0
1 hipv4-amdgcn-amd-amdhsa--gfx1030 file:///opt/rocm/lib/libMIOpen.so#offset=319401984&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx1100 file:///opt/rocm/lib/libMIOpen.so#offset=320151552&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx1101 file:///opt/rocm/lib/libMIOpen.so#offset=320901120&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx1102 file:///opt/rocm/lib/libMIOpen.so#offset=321650688&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx900 file:///opt/rocm/lib/libMIOpen.so#offset=322400256&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx906 file:///opt/rocm/lib/libMIOpen.so#offset=323149824&size=749312
1 hipv4-amdgcn-amd-amdhsa--gfx908 file:///opt/rocm/lib/libMIOpen.so#offset=323899392&size=1326848
1 hipv4-amdgcn-amd-amdhsa--gfx90a file:///opt/rocm/lib/libMIOpen.so#offset=325226496&size=1281792
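To compare builds quickly, the listing above can be reduced to just the embedded GPU targets. This is a minimal sketch: `list_gfx_targets` is a hypothetical helper name, not part of ROCm, and it assumes the bundle IDs keep the `amdhsa--gfxNNN` suffix shown in the output above.

```shell
# list_gfx_targets: hypothetical helper (not a ROCm tool).
# Reads roc-obj-ls output on stdin and prints one embedded gfx target per line.
list_gfx_targets() {
  # Device bundle IDs end in "--gfxNNN"; the host entry has no such suffix.
  grep -oE 'amdhsa--gfx[0-9a-f]+' | sed 's/.*--//' | sort -u
}

# Usage (needs roc-obj-ls from ROCm on PATH):
#   roc-obj-ls /opt/rocm/lib/libMIOpen.so | list_gfx_targets
```

An empty result would mean the library carries no device code objects at all.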
I think our ultimate goal should still be to separate libRocComposer.so from libMIOpen.so :)
@junliume
if libMIOpen.so contains any "hipv4-amdgcn-amd-amdhsa--gfx**" entry, e.g. hipv4-amdgcn-amd-amdhsa--gfx1030, then the runtime will check and initialize it in ALL GPU contexts and, in this case, fail on gfx1101.
General question: Can HIP applications be written and run on systems containing one or NONE of the supported GPUs? For example, is it possible to write a portable HIP application that will use gfx906 (if available) or run only on the CPU (if gfx906 is not available)? Is this functionality supported?
If this is generally supported, then running the library on GFX11 systems should work without any modifications (as if there were no GPU), and some fixes in the HIP compiler and/or runtime are required.
Otherwise, let's think about a W/A or solution in MIOpen.
Do you agree?
Is this still a valid issue, or can we close this?
[Problem Description] Current libMIOpen.so:
The hipv4-amdgcn-amd-amdhsa--gfx9** entries come from Composable Kernel, which is a MIOpen dependency. I did an experiment:
By building CK with -DGPU_TARGETS="gfx900;gfx906;gfx908;gfx90a;gfx1030", we have:
By building CK with -DGPU_TARGETS="gfx908;gfx90a", we have:
[Proposal]: The platform-dependent ISA is introduced into libMIOpen.so by Composable Kernel. libMIOpen.so used to be platform-independent, so the runtime did not check it. If any platform-dependent ISA exists in the lib, the runtime will check it and try to initialize that lib in every GPU context. So in this case:
If libMIOpen.so does not contain any "hipv4-amdgcn-amd-amdhsa--gfx" entry, then the runtime will NOT check and things go well.
If libMIOpen.so contains any "hipv4-amdgcn-amd-amdhsa--gfx" entry, e.g. hipv4-amdgcn-amd-amdhsa--gfx1030, then the runtime will check and initialize it in ALL GPU contexts and, in this case, fail on gfx1101.
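The two cases above can be written as a small check. This is only a model of the behaviour as described in this thread, not of the actual HIP runtime logic; `check_device_covered` is a hypothetical helper, and the target list on stdin is assumed to have one gfx name per line.

```shell
# check_device_covered: hypothetical helper modelling the two cases described above.
# $1 = device architecture (e.g. gfx1101); stdin = embedded targets, one per line.
check_device_covered() {
  dev="$1"
  targets=$(cat)
  if [ -z "$targets" ]; then
    # Case 1: no device code in the library, so the runtime is not expected to check it.
    echo "no device code embedded: no check expected"
    return 0
  fi
  if printf '%s\n' "$targets" | grep -qxF -- "$dev"; then
    echo "$dev is covered by the embedded targets"
    return 0
  fi
  # Case 2: device code exists, but not for this device.
  echo "$dev is NOT covered: initialization expected to fail"
  return 1
}

# Usage: printf '%s\n' gfx900 gfx1030 | check_device_covered gfx1101
```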
[Short Term W/A]: (now) When PT is building from MIOpen on gfx110x, please change requirements.txt in MIOpen from
ROCmSoftwarePlatform/composable_kernel@eef009d001b928db1bb377a105c93b75e0dccc7b -DGPU_TARGETS="gfx900;gfx906;gfx908;gfx90a;gfx1030"
to
ROCmSoftwarePlatform/composable_kernel@eef009d001b928db1bb377a105c93b75e0dccc7b -DGPU_TARGETS=""
[Medium Term W/A]: (ROCm 5.6) Add support for "gfx1100;gfx1101,gfx1102;gfx980" in CK GPU_TARGETS.
[Medium Term Solution]: (preferably ROCm 5.6) In addition to the Medium Term W/A, MIOpen should separate libCK.so (or libComposer.so) from libMIOpen.so and keep libMIOpen.so platform-independent, as it should be.
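The short-term workaround is a one-line edit to requirements.txt, which could be scripted roughly as follows. This is a sketch: `clear_ck_gpu_targets` is a hypothetical helper, and it assumes the Composable Kernel entry sits on a single line containing `composable_kernel@` and a quoted `-DGPU_TARGETS="..."` value.

```shell
# clear_ck_gpu_targets: hypothetical helper; $1 = path to MIOpen's requirements.txt.
clear_ck_gpu_targets() {
  # Only rewrite the composable_kernel line; a .bak copy of the original is kept.
  sed -i.bak -E '/composable_kernel@/ s/-DGPU_TARGETS="[^"]*"/-DGPU_TARGETS=""/' "$1"
}

# Usage: clear_ck_gpu_targets requirements.txt
```

Other lines in the file, including ones that happen to set GPU_TARGETS for a different dependency, are left untouched.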
[Other issues or By-Product] According to @illsilin, the following works in requirements.txt:
However,
does not work properly with the latest compiler. This should be tracked by another issue.