Closed paolodalberto closed 3 years ago
more information about the architecture
paolo@fastmmw:~/FastMM/Epyc/rocBLAS$ /opt/rocm/bin/rocminfo
ROCk module is loaded
Able to open /dev/kfd read-write
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen Threadripper 1950X 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen Threadripper 1950X 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3400
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 65775172(0x3eba644) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 65775172(0x3eba644) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*******
Agent 2
*******
Name: gfx803
Uuid: GPU-XX
Marketing Name: Ellesmere [Radeon Pro WX 7100]
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26564(0x67c4)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1243
BDFID: 17152
Internal Node ID: 1
Compute Unit: 36
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16777216(0x1000000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx803
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*******
Agent 3
*******
Name: gfx803
Uuid: GPU-XX
Marketing Name: Ellesmere [Radeon Pro WX 7100]
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 2
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26564(0x67c4)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1243
BDFID: 17408
Internal Node ID: 2
Compute Unit: 36
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16777216(0x1000000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx803
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
find /opt/rocm-3.10.0/ -name "omp.h"
/opt/rocm-3.10.0/llvm/lib/clang/12.0.0/include/omp.h
/opt/rocm-3.10.0/llvm/include/omp.h
However, the same find under /opt/rocm-4.0.0/ finds nothing.
/opt/rocm-4.0.0/llvm/ was installed fresh ...
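For anyone repeating this comparison across several ROCm trees, here is a minimal sketch of the same check as reusable functions. The function names are made up for illustration; only find and the paths come from this thread.

```bash
# list_omp_headers ROOT...: print every omp.h found under the given roots.
list_omp_headers() {
    for root in "$@"; do
        # Skip roots that do not exist, so a missing release is not an error.
        [ -d "$root" ] || continue
        find "$root" -name 'omp.h' -print
    done
}

# has_omp_header ROOT: succeed only if at least one omp.h exists under ROOT.
has_omp_header() {
    [ -n "$(list_omp_headers "$1")" ]
}

# Example with the paths from this thread:
#   has_omp_header /opt/rocm-4.0.0 || echo "omp.h missing under /opt/rocm-4.0.0"
```

On the machine described here, this would be expected to succeed for /opt/rocm-3.10.0 and fail for /opt/rocm-4.0.0.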
Thanks for reporting, @paolodalberto. We will have to dig into the dependency issues, but while you wait you should be able to directly install the llvm-clang openmp-extras package openmp-extras4.0.0_12.10-0_amd64.deb to get omp.h put into the 4.0.0 tree. The deb is at http://repo.radeon.com/rocm/apt/4.0/pool/main/o/openmp-extras4.0.0/openmp-extras4.0.0_12.10-0_amd64.deb
Also @paolodalberto, when you have time: you say "fresh". Did you uninstall 3.10, or is this on a clean machine? Did it require any special installation steps on your side?
One other thing to check is what the /opt/rocm directory looks like. There now exist version-pinned rocm packages (e.g. rocm-dev4.0.0) and rolling version rocm packages (e.g. rocm-dev). However, I don't think you can mix-and-match them at the moment.
IIRC, the version-pinned packages don't make an /opt/rocm symlink. That may be important because I notice that the PATH is to /opt/rocm/llvm/bin rather than /opt/rocm-4.0.0/llvm/bin.
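To make the symlink question concrete, here is a small sketch that reports whether a path such as /opt/rocm is a symlink, a real directory, or absent. The function names are hypothetical; readlink is the only tool used.

```bash
# rocm_link_target LINK: print what LINK points to; fail if LINK is not a symlink.
rocm_link_target() {
    [ -L "$1" ] || return 1
    readlink "$1"
}

# report_rocm_layout LINK: describe whether LINK is a symlink, a directory, or absent.
report_rocm_layout() {
    if target=$(rocm_link_target "$1"); then
        echo "$1 is a symlink to $target"
    elif [ -d "$1" ]; then
        echo "$1 is a real directory (not a symlink)"
    else
        echo "$1 does not exist"
    fi
}

# Example with the path from this thread:
#   report_rocm_layout /opt/rocm
```

If /opt/rocm turns out to be a real directory sitting next to /opt/rocm-4.0.0, a PATH entry of /opt/rocm/llvm/bin may not point at the 4.0.0 compiler at all.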
It may or may not be of any help, but to check what packages I have installed from the rocm repositories, I use: aptitude search '~S ~i ~Orepo.radeon.com' (or aptitude search '~S ~i ~Orepo.radeon.com' -F '%c %M %p %d %v' to include the version numbers).
I had to install the 4.0.0 llvm package from source. This installs clang. The missing link is not about rocBLAS (I think) but about the installation of the dependencies.
I did not uninstall 3.10; I usually keep all previous .... and I am not comfortable removing packages (unless it happens automatically).
I updated from Ubuntu 18 to 20 (cmake issues).
The /opt/rocm directory had kept most of the previous contents.
This afternoon I will provide information about the shape of the /opt/rocm directory.
Also @paolodalberto, when you have time: you say "fresh". Did you uninstall 3.10, or is this on a clean machine? Did it require any special installation steps on your side?
I followed the instructions at https://github.com/ROCm-Developer-Tools/HIP/blob/master/INSTALL.md#hip-clang because my version 4.0 did not have the llvm package.
One other thing to check is what the /opt/rocm directory looks like. There now exist version-pinned rocm packages (e.g. rocm-dev4.0.0) and rolling version rocm packages (e.g. rocm-dev). However, I don't think you can mix-and-match them at the moment.
IIRC, the version-pinned packages don't make an /opt/rocm symlink. That may be important because I notice that the PATH is to /opt/rocm/llvm/bin rather than /opt/rocm-4.0.0/llvm/bin.
out.txt this is my tree for /opt/
paolo@fastmmw:~$ aptitude search '~S ~i ~Orepo.radeon.com'
i comgr - Library to provide support functions
i half - HALF-PRECISION FLOATING POINT LIBRARY
i A hip-base - HIP: Heterogenous-computing Interface for Portability [BASE]
i A hip-doc - HIP: Heterogenous-computing Interface for Portability [DOCUME
i hip-rocclr - HIP: Heterogenous-computing Interface for Portability [ROCClr
i A hip-samples - HIP: Heterogenous-computing Interface for Portability [SAMPLE
i A hsa-amd-aqlprofile - AQLPROFILE library for AMD HSA runtime API extension support
i A hsa-rocr-dev - AMD Heterogeneous System Architecture HSA - Linux HSA Runtime
i A hsakmt-roct - HSAKMT library for AMD KFD support
i A hsakmt-roct-dev - HSAKMT development package.
i llvm-amdgpu - amdgpu backend
i mivisionx - AMD MIVisionX toolkit is a comprehensive computer vision and
i A openmp-extras - OpenMP Extras provides openmp and flang libraries.
i rocblas - rocBLAS is AMD's library for BLAS on ROCm (Radeon Open Comput
i A rock-dkms - amdgpu driver in DKMS format.
i A rock-dkms-firmware - firmware blobs used by amdgpu driver in DKMS format
i A rocm-clang-ocl - OpenCL compilation with clang compiler.
i A rocm-cmake - rocm-cmake built using CMake
i A rocm-dbgapi - Library to provide AMD GPU debugger API
i rocm-dev - Radeon Open Compute (ROCm) Runtime software stack
i A rocm-device-libs - Radeon Open Compute - device libraries
i rocm-dkms - Radeon Open Compute (ROCm) Runtime software stack
i A rocm-gdb - ROCgdb
i rocm-opencl - OpenCL: Open Computing Language on ROCclr
i A rocm-opencl-dev - OpenCL: Open Computing Language on ROCclr
i A rocm-smi - System Management Interface for ROCm
i A rocm-smi-lib64 - AMD System Management libraries
i rocm-utils - Radeon Open Compute (ROCm) Runtime software stack
i A rocminfo - Radeon Open Compute (ROCm) Runtime rocminfo tool
i rocprim - Radeon Open Compute Parallel Primitives Library
i A rocprofiler-dev - ROCPROFILER library for AMD HSA runtime API extension support
i A roctracer-dev - AMD ROCTRACER library
Yes the 4.0 expects the openmp-extras 4.0 installed so you can install the deb I provided the link to. They will have to add the instructions for building openmp-extras from source to the HIP-Clang site you pointed to as we don't do that manually in rocBLAS.
The rocm-dkms 4.0 installation should provide all the llvm and openmp-extras as it did on my Ubuntu20 test.
let me reinstall the package openmp-extras
Reinstalling openmp-extras creates the correct includes. Alas, something else now fails during compilation:
[ 73%] Built target example-c-dgeam
[ 73%] Building CXX object clients/gtest/CMakeFiles/rocblas-test.dir/atomics_mode_gtest.cpp.o
[ 74%] Building CXX object clients/gtest/CMakeFiles/rocblas-test.dir/gemm_gtest.cpp.o
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/multiheaded_gtest.cpp:219:5: error: unknown type name 'quick'
INSTANTIATE_TEST_CATEGORIES(multiheaded);
^
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/../include/rocblas_test.hpp:154:42: note: expanded from macro 'INSTANTIATE_TEST_CATEGORIES'
INSTANTIATE_TEST_CATEGORY(testclass, quick) \
^
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/multiheaded_gtest.cpp:219:5: error: parameter type '(anonymous namespace)::multiheaded' is an abstract class
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/../include/rocblas_test.hpp:154:5: note: expanded from macro 'INSTANTIATE_TEST_CATEGORIES'
INSTANTIATE_TEST_CATEGORY(testclass, quick) \
^
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/../include/rocblas_test.hpp:143:39: note: expanded from macro 'INSTANTIATE_TEST_CATEGORY'
testclass, \
^
/usr/local/include/gtest/gtest.h:484:16: note: unimplemented pure virtual method 'TestBody' in 'multiheaded'
virtual void TestBody() = 0;
^
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/multiheaded_gtest.cpp:219:5: error: no type named 'ValuesIn' in namespace 'testing'
INSTANTIATE_TEST_CATEGORIES(multiheaded);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/../include/rocblas_test.hpp:154:5: note: expanded from macro 'INSTANTIATE_TEST_CATEGORIES'
INSTANTIATE_TEST_CATEGORY(testclass, quick) \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/../include/rocblas_test.hpp:144:39: note: expanded from macro 'INSTANTIATE_TEST_CATEGORY'
testing::ValuesIn(RocBLAS_TestData::begin([](const Arguments& arg) { \
~~~~~~~~~^
/home/paolo/FastMM/Epyc/rocBLAS/clients/gtest/multiheaded_gtest.cpp:219:5: error: C++ requires a type specifier for all declarations
Did you run install.sh -dc to install the dependencies for the clients? googletest must now be using their branch release-1.10.0. Other dependencies may also be required that differ from earlier releases. You may want to delete your build tree and then run install.sh -dc. I am just guessing based on what these errors suggest.
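The log above shows the compiler picking up /usr/local/include/gtest/gtest.h, so one quick way to see whether an older googletest is shadowing the expected one is to look for the INSTANTIATE_TEST_SUITE_P macro, which googletest introduced in release 1.10.0. This is only a sketch: the function name is made up, and whether the rocBLAS client macros actually depend on that macro is my assumption from the error messages, not something confirmed in this thread.

```bash
# gtest_has_suite_api INCLUDE_DIR: succeed if any header under INCLUDE_DIR
# mentions INSTANTIATE_TEST_SUITE_P (present in googletest >= 1.10.0).
gtest_has_suite_api() {
    [ -d "$1" ] || return 1
    grep -rqs 'INSTANTIATE_TEST_SUITE_P' "$1"
}

# Example with the include path seen in the error log:
#   gtest_has_suite_api /usr/local/include/gtest \
#       || echo "googletest under /usr/local looks older than 1.10.0"
```

If the check fails, removing or upgrading the copy under /usr/local before rebuilding would be the next thing to try.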
bash install.sh -c -a gfx803
The installation works ... if I do not specify the architecture it fails for gfx908.
however
clients/staging/./rocblas-bench -f gemm -r f32_r --transposeA N --transposeB N -m 4096 -n 4096 -k 4096 --alpha 1 --lda 4096 --ldb 4096 --beta 0 --ldc 4096 --device 1
Query device success: there are 2 devices
-------------------------------------------------------------------------------
Device ID 0 : Ellesmere [Radeon Pro WX 7100]
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Device ID 1 : Ellesmere [Radeon Pro WX 7100]
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
/src/external/hip-on-vdi/rocclr/hip_fatbin.cpp:39: guarantee(false && "Cannot unmap file")
Aborted (core dumped)
I know I can install rocblas using apt but it is nice to understand the process ... rocBLAS will be used in combination with rocSPARSE. I can now use rocALUTION (rocBLAS installed by apt and rocSPARSE by script and rocRAND by partial script).
but it will be nice to have code optimized for the architecture.
You can see the architecture list used without -a in the root level CMakeLists.txt, and make sure you are building branch rocm-4.0.x if using 4.0 release hip and clang compilers. Sorry things are overly coupled right now as the compiler changes quickly.
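As a sketch of how to avoid typing the architecture by hand, the gfx name can be pulled from rocminfo output like the one pasted at the top of this issue. The function name is hypothetical, and the parsing assumes the "Name: gfxNNN" layout shown in that paste.

```bash
# gfx_targets: read rocminfo-style text on stdin and print each distinct
# agent name of the form gfxNNN once (ISA names such as
# amdgcn-amd-amdhsa--gfx803 are deliberately not matched).
gfx_targets() {
    awk '$1 == "Name:" && $2 ~ /^gfx[0-9a-f]+$/ { print $2 }' | sort -u
}

# Example for a machine whose GPUs all share one architecture:
#   bash install.sh -c -a "$(/opt/rocm/bin/rocminfo | gfx_targets)"
```

On a machine with mixed GPUs this prints one name per line, so the result would need to be adapted before passing it to -a.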
As of today I could build the library
bash install.sh -c -a gfx803
however
./rocblas-bench -f gemm -r f32_r --transposeA N --transposeB N -m 4096 -n 4096 -k 4096 --alpha 1 --lda 4096 --ldb 4096 --beta 0 --ldc 4096 --device 1
Query device success: there are 2 devices
-------------------------------------------------------------------------------
Device ID 0 : Ellesmere [Radeon Pro WX 7100]
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Device ID 1 : Ellesmere [Radeon Pro WX 7100]
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
/src/external/hip-on-vdi/rocclr/hip_fatbin.cpp:39: guarantee(false && "Cannot unmap file")
Aborted (core dumped)
That is the type of error I would expect if the architecture was mismatched, so can you confirm you are using a clean rocBLAS rocm-4.0.x branch code base? Did you run install -d or -dc to get all the dependencies, as I asked previously? Otherwise I would be looking for clues as to what is different: are there any warning messages during compile? If you don't include -a, can you build for all architectures and gfx908 fails at runtime, or did you mean the build fails (if so, with what error)?
I can try again with -d .... git pull and bash install.sh -cd -a gfx803
The build is successful using the above install; however, the benchmark still does not cooperate
paolo@fastmmw:~/FastMM/Epyc/rocBLAS/build/release/clients/staging$ ./rocblas-bench -f gemm -r f32_r --transposeA N --transposeB N -m 4096 -n 4096 -k 4096 --alpha 1 --lda 4096 --ldb 4096 --beta 0 --ldc 4096 --device 1
Query device success: there are 2 devices
-------------------------------------------------------------------------------
Device ID 0 : Ellesmere [Radeon Pro WX 7100] gfx803
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
Device ID 1 : Ellesmere [Radeon Pro WX 7100] gfx803
with 17.2 GB memory, max. SCLK 1243 MHz, max. MCLK 1750 MHz, compute capability 8.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 64
-------------------------------------------------------------------------------
/src/external/hip-on-vdi/rocclr/hip_fatbin.cpp:39: guarantee(false && "Cannot unmap file")
Aborted (core dumped)
I completely removed the build directory before the install ... clearly there is something off ... Maybe next time I upgrade the machine this will go away.
If you do not mind I will keep the issue open.
When debugging those sorts of 'guarantee' failures, I sometimes find that setting the AMD_LOG_LEVEL environment variable to 3 or 4 can help clarify what was wrong. (The various environment variables listed in the system level debug documentation are also sometimes useful, though perhaps not applicable to this case.)
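A minimal sketch of what that looks like in practice: the wrapper below sets AMD_LOG_LEVEL for a single command and captures everything it prints into a file. The wrapper name and the log file name are made up for illustration.

```bash
# run_with_amd_log LEVEL LOGFILE CMD...: run CMD with AMD_LOG_LEVEL=LEVEL,
# writing stdout and stderr to LOGFILE; the exit status of CMD is preserved.
run_with_amd_log() {
    level=$1
    logfile=$2
    shift 2
    AMD_LOG_LEVEL=$level "$@" >"$logfile" 2>&1
}

# Example with the benchmark from this thread (arguments shortened here):
#   run_with_amd_log 3 bench.log ./rocblas-bench -f gemm -r f32_r -m 4096 -n 4096 -k 4096 --device 1
```

The last lines of the log before the abort are usually the interesting part.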
I will uninstall rocm and reinstall and redo
llvm-clang I cannot install using apt install ...
llvm-clang I cannot install using apt install ...
Aren't you installing rocm-dkms which will provide the llvm and clang ?
yep .. .cleaned up the /opt/rocm https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html re-installing everything from scratch
llvm is not present in /opt/rocm otherwise
sudo apt install rocm-dkms should provide /opt/rocm/llvm, as that is the version of clang/llvm you need to build rocBLAS with.
Were there error messages during that install?
nope
and today I cannot make llvm
let me autoremove everything and try again
this is painful
update-initramfs: Generating /boot/initrd.img-5.4.0-64-generic
Setting up rocm-smi (1.0.0-206-rocm-rel-4.0-23-ge39c0e2) ...
Setting up rocm-dbgapi (0.42.0.40000-23) ...
Setting up libelf-dev:amd64 (0.176-1.1build1) ...
Setting up rocm-opencl (3.6Beta-17-g875c1f8-rocm-rel-4.0-23) ...
Setting up hsakmt-roct-dev (20201016.1.0269-mainline-20201016-1-g0269ce3) ...
Setting up hip-doc (4.0.20496.5685.40000-23) ...
Setting up libtinfo5:amd64 (6.2-0ubuntu2) ...
Setting up rocm-opencl-dev (3.6Beta-17-g875c1f8-rocm-rel-4.0-23) ...
Setting up libncurses5:amd64 (6.2-0ubuntu2) ...
Setting up rocm-gdb (10.1-rocm-rel-4.0-23) ...
Setting up rocm-clang-ocl (0.5.0.64-rocm-rel-4.0-23-50fb51a) ...
Setting up rocm-utils (4.0.0.40000-23) ...
Setting up rocm-dev (4.0.0.40000-23) ...
Setting up rocm-dkms (4.0.0.40000-23) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.1) ...
root@fastmmw:/home/paolo# ls /opt/rocm
rocm/ rocm-4.0.0/
root@fastmmw:/home/paolo# ls /opt/rocm/
bin hip hsa-amd-aqlprofile include lib oam opencl rocm_smi rocprofiler roctracer share
root@fastmmw:/home/paolo# ls -lrt /opt/rocm/
total 44
drwxr-xr-x 4 root root 4096 Jan 21 11:28 hip
drwxr-xr-x 3 root root 4096 Jan 21 11:28 hsa-amd-aqlprofile
drwxr-xr-x 5 root root 4096 Jan 21 11:28 opencl
drwxr-xr-x 9 root root 4096 Jan 21 11:28 share
drwxr-xr-x 4 root root 4096 Jan 21 11:28 oam
drwxr-xr-x 6 root root 4096 Jan 21 11:28 rocm_smi
drwxr-xr-x 6 root root 4096 Jan 21 11:28 rocprofiler
drwxr-xr-x 2 root root 4096 Jan 21 11:28 bin
drwxr-xr-x 4 root root 4096 Jan 21 11:28 include
drwxr-xr-x 5 root root 4096 Jan 21 11:28 roctracer
drwxr-xr-x 3 root root 4096 Jan 21 11:30 lib
after installation now rebooting
/opt/rocm/bin/rocminfo
bash: /opt/rocm/bin/rocminfo: No such file or directory
paolo@fastmmw:~$ /opt/rocm/opencl/bin/clinfo
dlerror: libamd_comgr.so.1: cannot open shared object file: No such file or directory
dlerror: libamd_comgr.so.1: cannot open shared object file: No such file or directory
dlerror: libamd_comgr.so.1: cannot open shared object file: No such file or directory
dlerror: libamd_comgr.so.1: cannot open shared object file: No such file or directory
ERROR: clGetPlatformIDs(-1001)
I know that OpenCL was always a problem, but rocminfo is not even installed
bash install.sh -cd -a gfx803
PREFIX=/opt/rocm /home/paolo/FastMM/Epyc/rocBLAS
CMake Error at /usr/share/cmake-3.16/Modules/CMakeDetermineCXXCompiler.cmake:48 (message):
Could not find compiler set in environment variable CXX:
hipcc.
Call Stack (most recent call first):
CMakeLists.txt:22 (project)
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
building llvm and then hip
https://rocmdocs.amd.com/en/latest/Installation_Guide/HIP-Installation.html
the instructions are no good
I agree the instructions are weak, as they don't cover any potential problems. I would just go back to trying a clean install of rocm-dkms, which, as part of the install, provides all the hip and clang that we use to build rocBLAS. You just want to try and build rocBLAS, correct? I just installed on a clean, latest Ubuntu 5.4.0-64 docker and see that /opt/rocm/llvm was installed, so I am guessing your package config or local packages are not all cleared out and are messing up the install. Here are some instructions for force-cleaning any old installation that might be causing trouble (drop sudo if you are root), if you are willing.
sudo apt-get autoremove rocm-opencl
Check the /opt/rocm contents. There shouldn't be any files or folders present under it. If there are, a clean uninstall has not happened; try a clean uninstall using dpkg (sudo dpkg --purge rocm-dkms rock-dkms rock-dkms-firmware rocm-dev).
* If there is no content present, the uninstall was clean.
sudo rm -rf /var/cache/apt/*
sudo apt-get clean all
sudo reboot
sudo dpkg --purge rocm-dev rocm-libs miopen-hip rocblas hipblas rocrand rocfft miopengemm comgr hip-base hip-doc hip-rocclr hip-samples hsa-amd-aqlprofile hsakmt-roct hsakmt-roct-dev hsa-rocr-dev llvm-amdgpu rock-dkms rock-dkms-firmware rocm-clang-ocl rocm-cmake rocm-dbgapi rocm-device-libs rocm-dkms rocm-gdb rocminfo rocm-opencl rocm-opencl-dev rocm-smi rocm-smi-lib64 rocm-utils rocprofiler-dev roctracer-dev
sudo dpkg -l | grep <
It should not list the rock-dkms package; the same applies to all the other packages. There should not be any contents inside /opt/rocm.
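The two checks described above (nothing left under /opt/rocm, nothing left in the dpkg listing) can be sketched as follows. The function names and the package-name pattern in the example are assumptions for illustration, not an official list.

```bash
# dir_is_clean DIR: succeed if DIR is absent or has no entries (hidden ones included).
dir_is_clean() {
    if [ ! -e "$1" ]; then
        return 0
    fi
    [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# leftover_packages PATTERN: read `dpkg -l` style text on stdin and print the
# names of still-installed packages (status "ii") whose name matches PATTERN.
leftover_packages() {
    awk -v pat="$1" '$1 ~ /^ii/ && $2 ~ pat { print $2 }'
}

# Example (the pattern is only a guess at the relevant package prefixes):
#   dir_is_clean /opt/rocm || echo "/opt/rocm still has contents"
#   dpkg -l | leftover_packages '^(rocm|rock-dkms|hip-|hsa)'
```

Both checks should come back clean before the reinstall steps are started.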
Now installation steps:
sudo rm -rf /var/cache/apt/*
sudo apt-get clean all
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/debian/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt-get -y update
sudo apt-get -y install rocm-dkms
/opt/rocm/llvm should now exist after this has been installed. You could then try to build the rocBLAS 4.0.x branch or install rocblas from the package. rocBLAS should then get the correct version of the toolchain.
I want to have a clean installation because I want to make sure the performance I get is "complete" ... and I will use this in combination with rocALUTION ... it will be nice to have a clear list of algorithms
let me follow your instructions ....
cleaning is important indeed ...
ls /opt/rocm-4.0.0/
amdgcn hip hsa-amd-aqlprofile lib oam rocm_smi roctracer
bin hsa include llvm opencl rocprofiler share
The C++ compiler
"/opt/rocm/bin/hipcc"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeTmp
Run Build Command(s):/usr/bin/make cmTC_dcf9a/fast && /usr/bin/make -f CMakeFiles/cmTC_dcf9a.dir/build.make CMakeFiles/cmTC_dcf9a.dir/build
make[1]: Entering directory '/home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_dcf9a.dir/testCXXCompiler.cxx.o
/opt/rocm/bin/hipcc -o CMakeFiles/cmTC_dcf9a.dir/testCXXCompiler.cxx.o -c /home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang++": No such file or directory at /opt/rocm-4.0.0/hip/bin/hipconfig line 141.
Use of uninitialized value $HIP_CLANG_VERSION in pattern match (m//) at /opt/rocm-4.0.0/hip/bin/hipconfig line 142.
Use of uninitialized value $HIP_CLANG_VERSION in concatenation (.) or string at /opt/rocm-4.0.0/hip/bin/hipconfig line 145.
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang++": No such file or directory at /opt/rocm-4.0.0/hip/bin/hipconfig line 141.
Use of uninitialized value $HIP_CLANG_VERSION in pattern match (m//) at /opt/rocm-4.0.0/hip/bin/hipconfig line 142.
Use of uninitialized value $HIP_CLANG_VERSION in concatenation (.) or string at /opt/rocm-4.0.0/hip/bin/hipconfig line 145.
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang++": No such file or directory at /opt/rocm-4.0.0/hip/bin/hipconfig line 141.
Use of uninitialized value $HIP_CLANG_VERSION in pattern match (m//) at /opt/rocm-4.0.0/hip/bin/hipconfig line 142.
Use of uninitialized value $HIP_CLANG_VERSION in concatenation (.) or string at /opt/rocm-4.0.0/hip/bin/hipconfig line 145.
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang++": No such file or directory at /opt/rocm-4.0.0/hip/bin/hipconfig line 141.
Use of uninitialized value $HIP_CLANG_VERSION in pattern match (m//) at /opt/rocm-4.0.0/hip/bin/hipconfig line 142.
Use of uninitialized value $HIP_CLANG_VERSION in concatenation (.) or string at /opt/rocm-4.0.0/hip/bin/hipconfig line 145.
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang": No such file or directory at /opt/rocm/bin/hipcc line 203.
Use of uninitialized value $HIP_CLANG_VERSION in pattern match (m//) at /opt/rocm/bin/hipcc line 204.
Use of uninitialized value $HIP_CLANG_VERSION in concatenation (.) or string at /opt/rocm/bin/hipcc line 208.
Can't exec "/opt/rocm-4.0.0/llvm/bin/clang": No such file or directory at /opt/rocm/bin/hipcc line 895.
failed to execute: No such file or directory
make[1]: *** [CMakeFiles/cmTC_dcf9a.dir/build.make:66: CMakeFiles/cmTC_dcf9a.dir/testCXXCompiler.cxx.o] Error 255
make[1]: Leaving directory '/home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeTmp'
make: *** [Makefile:121: cmTC_dcf9a/fast] Error 2
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:22 (project)
-- Configuring incomplete, errors occurred!
See also "/home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeOutput.log".
See also "/home/paolo/FastMM/Epyc/rocBLAS/build/release/CMakeFiles/CMakeError.log".
+ check_exit_code 1
+ (( 1 != 0 ))
+ exit 1
ls -lrt /opt/rocm-4.0.0/llvm/bin/
total 6104
-rwxr-xr-x 1 root root 2178688 Dec 14 03:01 flang2
-rwxr-xr-x 1 root root 4069440 Dec 14 03:01 flang1
Does this mean that the llvm installation is still incomplete?
BTW: this is not a docker ...
Did it make the symlink /opt/rocm -> /opt/rocm-4.0.0 ?
You are having the worst luck I have seen, yes that llvm should have lots of files (109 clang etc.)
Those two flang files are correct... are you sure you aren't running out of disk space?
Sorry I only have a docker to spare right now but thought I did put 4.0.0 directly on a machine, I can ask around.
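For a quick completeness check of the toolchain directory, here is a sketch that lists which of the two binaries hipcc tried to execute in the log above (clang and clang++) are missing. The function name is hypothetical.

```bash
# missing_llvm_tools BIN_DIR: print each expected compiler binary that is
# not present as an executable file in BIN_DIR.
missing_llvm_tools() {
    for tool in clang clang++; do
        [ -x "$1/$tool" ] || echo "$tool"
    done
}

# Example with the path from this thread, plus the disk-space check:
#   missing_llvm_tools /opt/rocm-4.0.0/llvm/bin
#   df -h /opt
```

With only flang1 and flang2 present, as in the listing above, it would print both names.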
The link is there .... I rebuilt the llvm project from source and now trying to re-install rocBLAS
I should create a docker .. it is safer ... but at the same time it is yet another layer ...
What is the expected behavior
- pass compilation
What actually happens
- In file included from /home/paolo/FastMM/Epyc/rocBLAS/clients/common/blis_interface.cpp:5:
/home/paolo/FastMM/Epyc/rocBLAS/build/deps/blis/include/blis/blis.h:18940:10: fatal error: 'omp.h' file not found
#include <omp.h> // skipped
1 error generated when compiling for gfx803.
make[2]: *** [clients/gtest/CMakeFiles/rocblas-test.dir/build.make:778: clients/gtest/CMakeFiles/rocblas-test.dir/__/common/blis_interface.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/home/paolo/FastMM/Epyc/rocBLAS/clients/common/cblas_interface.cpp:7:10: fatal error: 'omp.h' file not found
#include <omp.h>
1 error generated when compiling for gfx803.
make[2]: *** [clients/gtest/CMakeFiles/rocblas-test.dir/build.make:765: clients/gtest/CMakeFiles/rocblas-test.dir/__/common/cblas_interface.cpp.o] Error 1
How to reproduce
- bash install.sh -c -a gfx803
Environment
paolo@fastmmw:~/FastMM/Epyc/rocBLAS$ /opt/rocm/bin/hipconfig --full
HIP version : 4.0.20496-4f163c68
== hipconfig
HIP_PATH : /opt/rocm-4.0.0/hip
ROCM_PATH : /opt/rocm-4.0.0
HIP_COMPILER : clang
HIP_PLATFORM : hcc
HIP_RUNTIME : ROCclr
CPP_CONFIG : -DHIP_PLATFORM_HCC= -I/opt/rocm-4.0.0/hip/include -I/opt/rocm-4.0.0/llvm/bin/../lib/clang/12.0.0 -I/opt/rocm-4.0.0/hsa/include -D__HIP_ROCclr__
== hip-clang
HSA_PATH : /opt/rocm-4.0.0/hsa
HIP_CLANG_PATH : /opt/rocm-4.0.0/llvm/bin
clang version 12.0.0 (https://github.com/RadeonOpenCompute/llvm-project.git dac2bfceaa8d4a90257dc8a6d58f268e172ce00e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-4.0.0/llvm/bin
LLVM (http://llvm.org/):
LLVM version 12.0.0git
Optimized build with assertions.
Default target: x86_64-unknown-linux-gnu
Host CPU: znver1
Registered Targets:
amdgcn - AMD GCN GPUs
r600 - AMD GPUs HD2XXX-HD6XXX
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
hip-clang-cxxflags : -DHIP_ROCclr -std=c++11 -isystem /opt/rocm-4.0.0/llvm/lib/clang/12.0.0/include/.. -isystem /opt/rocm-4.0.0/hsa/include -DHIP_ROCclr -isystem /opt/rocm-4.0.0/hip/include -O3
hip-clang-ldflags : -L/opt/rocm-4.0.0/hip/lib -O3 -lgcc_s -lgcc -lpthread -lm
=== Environment Variables
PATH=/opt/rocm/llvm/bin:/home/paolo/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
HIP_DIR=/home/paolo/FastMM/Epyc/HIP
== Linux Kernel
Hostname : fastmmw
Linux fastmmw 5.4.0-60-generic #67-Ubuntu SMP Tue Jan 5 18:31:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Let me know if this helps .... it takes about 10-15 minutes to get past 4% of the compilation ...